Age | Commit message (Collapse) | Author |
|
Set eswitch inline-mode to be u8, not u16. Otherwise, errors below
$ devlink dev eswitch set pci/0000:08:00.0 mode switchdev \
inline-mode network
Error: Attribute failed policy validation.
kernel answers: Numerical result out of rang
netlink: 'devlink': attribute type 26 has an invalid length.
Fixes: f2f9dd164db0 ("netlink: specs: devlink: add the remaining command to generate complete split_ops")
Signed-off-by: William Tu <witu@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240310164547.35219-1-witu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The logic for enabling the TX clock shift is inverse of enabling the RX
clock shift. The TX clock shift is disabled when DP83822_TX_CLK_SHIFT is
set. Correct the current behavior and always write the delay configuration
to ensure consistent delay settings regardless of bootloader configuration.
Reference: https://www.ti.com/lit/ds/symlink/dp83822i.pdf p. 69
Fixes: 8095295292b5 ("net: phy: DP83822: Add setting the fixed internal delay")
Signed-off-by: Tim Pambor <tp@osasysteme.de>
Link: https://lore.kernel.org/r/20240305110608.104072-1-tp@osasysteme.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-03-06 (igc, igb, ice)
This series contains updates to igc, igb, and ice drivers.
Vinicius removes double clearing of interrupt register which could cause
timestamp events to be missed on igc and igb.
Przemek corrects calculation of statistics which caused incorrect spikes
in reporting for ice driver.
* '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: fix stats being updated by way too large values
igb: Fix missing time sync events
igc: Fix missing time sync events
====================
Link: https://lore.kernel.org/r/20240306182617.625932-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, if there are multiple registrations of the same pin on the
same dpll device, following warnings are observed:
WARNING: CPU: 5 PID: 2212 at drivers/dpll/dpll_core.c:143 dpll_xa_ref_pin_del.isra.0+0x21e/0x230
WARNING: CPU: 5 PID: 2212 at drivers/dpll/dpll_core.c:223 __dpll_pin_unregister+0x2b3/0x2c0
The problem is, that in both dpll_xa_ref_dpll_del() and
dpll_xa_ref_pin_del() registration is only removed from list in case the
reference count drops to zero. That is wrong, the registration has to
be removed always.
To fix this, remove the registration from the list and free
it unconditionally, instead of doing it only when the ref reference
counter reaches zero.
Fixes: 9431063ad323 ("dpll: core: Add DPLL framework base functions")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The phy_get_internal_delay function could try to access to an empty
array in the case that the driver is calling phy_get_internal_delay
without defining delay_values and rx-internal-delay-ps or
tx-internal-delay-ps is defined to 0 in the device-tree.
This will lead to "unable to handle kernel NULL pointer dereference at
virtual address 0". To avoid this kernel oops, the test should be delay
>= 0. As there is already delay < 0 test just before, the test could
only be size == 0.
Fixes: 92252eec913b ("net: phy: Add a helper to return the index for of the internal delay")
Co-developed-by: Enguerrand de Ribaucourt <enguerrand.de-ribaucourt@savoirfairelinux.com>
Signed-off-by: Enguerrand de Ribaucourt <enguerrand.de-ribaucourt@savoirfairelinux.com>
Signed-off-by: Kévin L'hôpital <kevin.lhopital@savoirfairelinux.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Devlink param for adjusting NPC MCAM high zone
area is in wrong param list and is not getting
activated on CN10KA silicon.
That patch fixes this issue.
Fixes: dd7842878633 ("octeontx2-af: Add new devlink param to configure maximum usable NIX block LFs")
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Sai Krishna <saikrishnag@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Apply the same fix than ones found in :
8d975c15c0cd ("ip6_tunnel: make sure to pull inner header in __ip6_tnl_rcv()")
1ca1ba465e55 ("geneve: make sure to pull inner header in geneve_rx()")
We have to save skb->network_header in a temporary variable
in order to be able to recompute the network_header pointer
after a pskb_inet_may_pull() call.
pskb_inet_may_pull() makes sure the needed headers are in skb->head.
syzbot reported:
BUG: KMSAN: uninit-value in __INET_ECN_decapsulate include/net/inet_ecn.h:253 [inline]
BUG: KMSAN: uninit-value in INET_ECN_decapsulate include/net/inet_ecn.h:275 [inline]
BUG: KMSAN: uninit-value in IP_ECN_decapsulate include/net/inet_ecn.h:302 [inline]
BUG: KMSAN: uninit-value in ip_tunnel_rcv+0xed9/0x2ed0 net/ipv4/ip_tunnel.c:409
__INET_ECN_decapsulate include/net/inet_ecn.h:253 [inline]
INET_ECN_decapsulate include/net/inet_ecn.h:275 [inline]
IP_ECN_decapsulate include/net/inet_ecn.h:302 [inline]
ip_tunnel_rcv+0xed9/0x2ed0 net/ipv4/ip_tunnel.c:409
__ipgre_rcv+0x9bc/0xbc0 net/ipv4/ip_gre.c:389
ipgre_rcv net/ipv4/ip_gre.c:411 [inline]
gre_rcv+0x423/0x19f0 net/ipv4/ip_gre.c:447
gre_rcv+0x2a4/0x390 net/ipv4/gre_demux.c:163
ip_protocol_deliver_rcu+0x264/0x1300 net/ipv4/ip_input.c:205
ip_local_deliver_finish+0x2b8/0x440 net/ipv4/ip_input.c:233
NF_HOOK include/linux/netfilter.h:314 [inline]
ip_local_deliver+0x21f/0x490 net/ipv4/ip_input.c:254
dst_input include/net/dst.h:461 [inline]
ip_rcv_finish net/ipv4/ip_input.c:449 [inline]
NF_HOOK include/linux/netfilter.h:314 [inline]
ip_rcv+0x46f/0x760 net/ipv4/ip_input.c:569
__netif_receive_skb_one_core net/core/dev.c:5534 [inline]
__netif_receive_skb+0x1a6/0x5a0 net/core/dev.c:5648
netif_receive_skb_internal net/core/dev.c:5734 [inline]
netif_receive_skb+0x58/0x660 net/core/dev.c:5793
tun_rx_batched+0x3ee/0x980 drivers/net/tun.c:1556
tun_get_user+0x53b9/0x66e0 drivers/net/tun.c:2009
tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2055
call_write_iter include/linux/fs.h:2087 [inline]
new_sync_write fs/read_write.c:497 [inline]
vfs_write+0xb6b/0x1520 fs/read_write.c:590
ksys_write+0x20f/0x4c0 fs/read_write.c:643
__do_sys_write fs/read_write.c:655 [inline]
__se_sys_write fs/read_write.c:652 [inline]
__x64_sys_write+0x93/0xd0 fs/read_write.c:652
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
Uninit was created at:
__alloc_pages+0x9a6/0xe00 mm/page_alloc.c:4590
alloc_pages_mpol+0x62b/0x9d0 mm/mempolicy.c:2133
alloc_pages+0x1be/0x1e0 mm/mempolicy.c:2204
skb_page_frag_refill+0x2bf/0x7c0 net/core/sock.c:2909
tun_build_skb drivers/net/tun.c:1686 [inline]
tun_get_user+0xe0a/0x66e0 drivers/net/tun.c:1826
tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2055
call_write_iter include/linux/fs.h:2087 [inline]
new_sync_write fs/read_write.c:497 [inline]
vfs_write+0xb6b/0x1520 fs/read_write.c:590
ksys_write+0x20f/0x4c0 fs/read_write.c:643
__do_sys_write fs/read_write.c:655 [inline]
__se_sys_write fs/read_write.c:652 [inline]
__x64_sys_write+0x93/0xd0 fs/read_write.c:652
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcf/0x1e0 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When rule policy is changed, ipv6 socket cache is not refreshed.
The sock's skb still uses a outdated route cache and was sent to
a wrong interface.
To avoid this error we should update fib node's version when
rule is changed. Then skb's route will be reroute checked as
route cache version is already different with fib node version.
The route cache is refreshed to match the latest rule.
Fixes: 101367c2f8c4 ("[IPV6]: Policy Routing Rules")
Signed-off-by: Shiming Cheng <shiming.cheng@mediatek.com>
Signed-off-by: Lena Wang <lena.wang@mediatek.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
soft reset
This driver has two separate reset sequence in different places:
- gpio/HW reset on start of ksz_switch_register()
- SW reset on start of ksz_setup()
The second one will overwrite drive strength configuration made in the
ksz_switch_register().
To fix it, move ksz_parse_drive_strength() from ksz_switch_register() to
ksz_setup().
Fixes: d67d7247f641 ("net: dsa: microchip: Add drive strength configuration")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20240304135612.814404-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf, ipsec and netfilter.
No solution yet for the stmmac issue mentioned in the last PR, but it
proved to be a lockdep false positive, not a blocker.
Current release - regressions:
- dpll: move all dpll<>netdev helpers to dpll code, fix build
regression with old compilers
Current release - new code bugs:
- page_pool: fix netlink dump stop/resume
Previous releases - regressions:
- bpf: fix verifier to check bpf_func_state->callback_depth when
pruning states as otherwise unsafe programs could get accepted
- ipv6: avoid possible UAF in ip6_route_mpath_notify()
- ice: reconfig host after changing MSI-X on VF
- mlx5:
- e-switch, change flow rule destination checking
- add a memory barrier to prevent a possible null-ptr-deref
- switch to using _bh variant of of spinlock where needed
Previous releases - always broken:
- netfilter: nf_conntrack_h323: add protection for bmp length out of
range
- bpf: fix to zero-initialise xdp_rxq_info struct before running XDP
program in CPU map which led to random xdp_md fields
- xfrm: fix UDP encapsulation in TX packet offload
- netrom: fix data-races around sysctls
- ice:
- fix potential NULL pointer dereference in ice_bridge_setlink()
- fix uninitialized dplls mutex usage
- igc: avoid returning frame twice in XDP_REDIRECT
- i40e: disable NAPI right after disabling irqs when handling
xsk_pool
- geneve: make sure to pull inner header in geneve_rx()
- sparx5: fix use after free inside sparx5_del_mact_entry
- dsa: microchip: fix register write order in ksz8_ind_write8()
Misc:
- selftests: mptcp: fixes for diag.sh"
* tag 'net-6.8-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (63 commits)
net: pds_core: Fix possible double free in error handling path
netrom: Fix data-races around sysctl_net_busy_read
netrom: Fix a data-race around sysctl_netrom_link_fails_count
netrom: Fix a data-race around sysctl_netrom_routing_control
netrom: Fix a data-race around sysctl_netrom_transport_no_activity_timeout
netrom: Fix a data-race around sysctl_netrom_transport_requested_window_size
netrom: Fix a data-race around sysctl_netrom_transport_busy_delay
netrom: Fix a data-race around sysctl_netrom_transport_acknowledge_delay
netrom: Fix a data-race around sysctl_netrom_transport_maximum_tries
netrom: Fix a data-race around sysctl_netrom_transport_timeout
netrom: Fix data-races around sysctl_netrom_network_ttl_initialiser
netrom: Fix a data-race around sysctl_netrom_obsolescence_count_initialiser
netrom: Fix a data-race around sysctl_netrom_default_path_quality
netfilter: nf_conntrack_h323: Add protection for bmp length out of range
netfilter: nf_tables: mark set as dead when unbinding anonymous set with timeout
netfilter: nft_ct: fix l3num expectations with inet pseudo family
netfilter: nf_tables: reject constant set with timeout
netfilter: nf_tables: disallow anonymous set with timeout flag
net/rds: fix WARNING in rds_conn_connect_if_down
net: dsa: microchip: fix register write order in ksz8_ind_write8()
...
|
|
When auxiliary_device_add() returns error and then calls
auxiliary_device_uninit(), Callback function pdsc_auxbus_dev_release
calls kfree(padev) to free memory. We shouldn't call kfree(padev)
again in the error handling path.
Fix this by cleaning up the redundant kfree() and putting
the error handling back to where the errors happened.
Fixes: 4569cce43bc6 ("pds_core: add auxiliary_bus devices")
Signed-off-by: Yongzhi Liu <hyperlyzcs@gmail.com>
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20240306105714.20597-1-hyperlyzcs@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains fixes for net:
Patch #1 disallows anonymous sets with timeout, except for dynamic sets.
Anonymous sets with timeouts using the pipapo set backend makes
no sense from userspace perspective.
Patch #2 rejects constant sets with timeout which has no practical usecase.
This kind of set, once bound, contains elements that expire but
no new elements can be added.
Patch #3 restores custom conntrack expectations with NFPROTO_INET,
from Florian Westphal.
Patch #4 marks rhashtable anonymous set with timeout as dead from the
commit path to avoid that async GC collects these elements. Rules
that refers to the anonymous set get released with no mutex held
from the commit path.
Patch #5 fixes a UBSAN shift overflow in H.323 conntrack helper,
from Lena Wang.
netfilter pull request 24-03-07
* tag 'nf-24-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_conntrack_h323: Add protection for bmp length out of range
netfilter: nf_tables: mark set as dead when unbinding anonymous set with timeout
netfilter: nft_ct: fix l3num expectations with inet pseudo family
netfilter: nf_tables: reject constant set with timeout
netfilter: nf_tables: disallow anonymous set with timeout flag
====================
Link: https://lore.kernel.org/r/20240307021545.149386-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Jason Xing says:
====================
netrom: Fix all the data-races around sysctls
As the title said, in this patchset I fix the data-race issues because
the writer and the reader can manipulate the same value concurrently.
====================
Link: https://lore.kernel.org/r/20240304082046.64977-1-kerneljasonxing@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We need to protect the reader reading the sysctl value because the
value can be changed concurrently.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We need to protect the reader reading the sysctl value because the
value can be changed concurrently.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We need to protect the reader reading the sysctl value because the
value can be changed concurrently.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We need to protect the reader reading the sysctl value because the
value can be changed concurrently.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We need to protect the reader reading the sysctl value because the
value can be changed concurrently.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We need to protect the reader reading the sysctl value because the
value can be changed concurrently.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We need to protect the reader reading the sysctl value because the
value can be changed concurrently.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We need to protect the reader reading the sysctl value because the
value can be changed concurrently.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We need to protect the reader reading the sysctl value because the
value can be changed concurrently.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We need to protect the reader reading the sysctl value because the
value can be changed concurrently.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We need to protect the reader reading the sysctl value
because the value can be changed concurrently.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We need to protect the reader reading sysctl_netrom_default_path_quality
because the value can be changed concurrently.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
Steffen Klassert says:
====================
pull request (net): ipsec 2024-03-06
1) Clear the ECN bits flowi4_tos in decode_session4().
This was already fixed but the bug was reintroduced
when decode_session4() switched to us the flow dissector.
From Guillaume Nault.
2) Fix UDP encapsulation in the TX path with packet offload mode.
From Leon Romanovsky,
3) Avoid clang fortify warning in copy_to_user_tmpl().
From Nathan Chancellor.
4) Fix inter address family tunnel in packet offload mode.
From Mike Yu.
* tag 'ipsec-2024-03-06' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec:
xfrm: set skb control buffer based on packet offload as well
xfrm: fix xfrm child route lookup for packet offload
xfrm: Avoid clang fortify warning in copy_to_user_tmpl()
xfrm: Pass UDP encapsulation in TX packet offload
xfrm: Clear low order bits of ->flowi4_tos in decode_session4().
====================
Link: https://lore.kernel.org/r/20240306100438.3953516-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2024-03-06
We've added 5 non-merge commits during the last 1 day(s) which contain
a total of 5 files changed, 77 insertions(+), 4 deletions(-).
The main changes are:
1) Fix BPF verifier to check bpf_func_state->callback_depth when pruning
states as otherwise unsafe programs could get accepted,
from Eduard Zingerman.
2) Fix to zero-initialise xdp_rxq_info struct before running XDP program in
CPU map which led to random xdp_md fields, from Toke Høiland-Jørgensen.
3) Fix bonding XDP feature flags calculation when bonding device has no
slave devices anymore, from Daniel Borkmann.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
cpumap: Zero-initialise xdp_rxq_info struct before running XDP program
selftests/bpf: Fix up xdp bonding test wrt feature flags
xdp, bonding: Fix feature flags when there are no slave devs anymore
selftests/bpf: test case for callback_depth states pruning logic
bpf: check bpf_func_state->callback_depth when pruning states
====================
Link: https://lore.kernel.org/r/20240306220309.13534-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
UBSAN load reports an exception of BRK#5515 SHIFT_ISSUE:Bitwise shifts
that are out of bounds for their data type.
vmlinux get_bitmap(b=75) + 712
<net/netfilter/nf_conntrack_h323_asn1.c:0>
vmlinux decode_seq(bs=0xFFFFFFD008037000, f=0xFFFFFFD008037018, level=134443100) + 1956
<net/netfilter/nf_conntrack_h323_asn1.c:592>
vmlinux decode_choice(base=0xFFFFFFD0080370F0, level=23843636) + 1216
<net/netfilter/nf_conntrack_h323_asn1.c:814>
vmlinux decode_seq(f=0xFFFFFFD0080371A8, level=134443500) + 812
<net/netfilter/nf_conntrack_h323_asn1.c:576>
vmlinux decode_choice(base=0xFFFFFFD008037280, level=0) + 1216
<net/netfilter/nf_conntrack_h323_asn1.c:814>
vmlinux DecodeRasMessage() + 304
<net/netfilter/nf_conntrack_h323_asn1.c:833>
vmlinux ras_help() + 684
<net/netfilter/nf_conntrack_h323_main.c:1728>
vmlinux nf_confirm() + 188
<net/netfilter/nf_conntrack_proto.c:137>
Due to abnormal data in skb->data, the extension bitmap length
exceeds 32 when decoding ras message then uses the length to make
a shift operation. It will change into negative after several loop.
UBSAN load could detect a negative shift as an undefined behaviour
and reports exception.
So we add the protection to avoid the length exceeding 32. Or else
it will return out of range error and stop decoding.
Fixes: 5e35941d9901 ("[NETFILTER]: Add H.323 conntrack/NAT helper")
Signed-off-by: Lena Wang <lena.wang@mediatek.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
While the rhashtable set gc runs asynchronously, a race allows it to
collect elements from anonymous sets with timeouts while it is being
released from the commit path.
Mingi Cho originally reported this issue in a different path in 6.1.x
with a pipapo set with low timeouts which is not possible upstream since
7395dfacfff6 ("netfilter: nf_tables: use timestamp to check for set
element timeout").
Fix this by setting on the dead flag for anonymous sets to skip async gc
in this case.
According to 08e4c8c5919f ("netfilter: nf_tables: mark newset as dead on
transaction abort"), Florian plans to accelerate abort path by releasing
objects via workqueue, therefore, this sets on the dead flag for abort
path too.
Cc: stable@vger.kernel.org
Fixes: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane")
Reported-by: Mingi Cho <mgcho.minic@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Following is rejected but should be allowed:
table inet t {
ct expectation exp1 {
[..]
l3proto ip
Valid combos are:
table ip t, l3proto ip
table ip6 t, l3proto ip6
table inet t, l3proto ip OR l3proto ip6
Disallow inet pseudeo family, the l3num must be a on-wire protocol known
to conntrack.
Retain NFPROTO_INET case to make it clear its rejected
intentionally rather as oversight.
Fixes: 8059918a1377 ("netfilter: nft_ct: sanitize layer 3 and 4 protocol number in custom expectations")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This set combination is weird: it allows for elements to be
added/deleted, but once bound to the rule it cannot be updated anymore.
Eventually, all elements expire, leading to an empty set which cannot
be updated anymore. Reject this flags combination.
Cc: stable@vger.kernel.org
Fixes: 761da2935d6e ("netfilter: nf_tables: add set timeout API support")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Anonymous sets are never used with timeout from userspace, reject this.
Exception to this rule is NFT_SET_EVAL to ensure legacy meters still work.
Cc: stable@vger.kernel.org
Fixes: 761da2935d6e ("netfilter: nf_tables: add set timeout API support")
Reported-by: lonial con <kongln9170@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Simplify stats accumulation logic to fix the case where we don't take
previous stat value into account, we should always respect it.
Main netdev stats of our PF (Tx/Rx packets/bytes) were reported orders of
magnitude too big during OpenStack reconfiguration events, possibly other
reconfiguration cases too.
The regression was reported to be between 6.1 and 6.2, so I was almost
certain that on of the two "preserve stats over reset" commits were the
culprit. While reading the code, it was found that in some cases we will
increase the stats by arbitrarily large number (thanks to ignoring "-prev"
part of condition, after zeroing it).
Note that this fixes also the case where we were around limits of u64, but
that was not the regression reported.
Full disclosure: I remember suggesting this particular piece of code to
Ben a few years ago, so blame on me.
Fixes: 2fd5e433cd26 ("ice: Accumulate HW and Netdev statistics over reset")
Reported-by: Nebojsa Stevanovic <nebojsa.stevanovic@gcore.com>
Link: https://lore.kernel.org/intel-wired-lan/VI1PR02MB439744DEDAA7B59B9A2833FE912EA@VI1PR02MB4397.eurprd02.prod.outlook.com
Reported-by: Christian Rohmann <christian.rohmann@inovex.de>
Link: https://lore.kernel.org/intel-wired-lan/f38a6ca4-af05-48b1-a3e6-17ef2054e525@inovex.de
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Fix "double" clearing of interrupts, which can cause external events
or timestamps to be missed.
The E1000_TSIRC Time Sync Interrupt Cause register can be cleared in two
ways, by either reading it or by writing '1' into the specific cause
bit. This is documented in section 8.16.1.
The following flow was used:
1. read E1000_TSIRC into 'tsicr';
2. handle the interrupts present into 'tsirc' and mark them in 'ack';
3. write 'ack' into E1000_TSICR;
As both (1) and (3) will clear the interrupt cause, if the same
interrupt happens again between (1) and (3) it will be ignored,
causing events to be missed.
Remove the extra clear in (3).
Fixes: 00c65578b47b ("igb: enable internal PPS for the i210")
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Fix "double" clearing of interrupts, which can cause external events
or timestamps to be missed.
The IGC_TSIRC Time Sync Interrupt Cause register can be cleared in two
ways, by either reading it or by writing '1' into the specific cause
bit. This is documented in section 8.16.1.
The following flow was used:
1. read IGC_TSIRC into 'tsicr';
2. handle the interrupts present in 'tsirc' and mark them in 'ack';
3. write 'ack' into IGC_TSICR;
As both (1) and (3) will clear the interrupt cause, if the same
interrupt happens again between (1) and (3) it will be ignored,
causing events to be missed.
Remove the extra clear in (3).
Fixes: 2c344ae24501 ("igc: Add support for TX timestamping")
Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
Tested-by: Kurt Kanzenbach <kurt@linutronix.de> # Intel i225
Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- Get rid of copy_mc flag in iov_iter which really only makes sense for
the core dumping code so move it out of the generic iov iter code and
make it coredump's problem. See the detailed commit description.
- Revert fs/aio: Make io_cancel() generate completions again
The initial fix here was predicated on the assumption that calling
ki_cancel() didn't complete aio requests. However, that turned out to
be wrong since the two drivers that actually make use of this set a
cancellation function that performs the cancellation correctly. So
revert this change.
- Ensure that the test for IOCB_AIO_RW always happens before the read
from ki_ctx.
* tag 'vfs-6.8-release.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
iov_iter: get rid of 'copy_mc' flag
fs/aio: Check IOCB_AIO_RW before the struct aio_kiocb conversion
Revert "fs/aio: Make io_cancel() generate completions again"
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"These should be the final fixes for the soc tree for 6.8, as usual
they mostly deal wtih dts files:
- Qualcomm fixes for pcie4 on sc8280xp, a revert of msm8996 mpm
support, sm6115 interconnect and sm8650 gpio.
- Two fixes for Tegra234 ethernet
- A Makefile fix to actually build the allwinner based orange pi zero
2w device tree
- Fixes for clocks and reset on imx8mp and a DSI display regression
on imx7.
The non-DT fixes are:
- Firmware fixes addressing a kernel panic in op-tee and a minor
regression in microchip/riscv.
- A defconfig change to bring back backlight support after a Kconfig
change"
* tag 'arm-fixes-6.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
firmware: microchip: Fix over-requested allocation size
tee: optee: Fix kernel panic caused by incorrect error handling
Revert "arm64: dts: qcom: msm8996: Hook up MPM"
arm64: dts: qcom: sc8280xp-x13s: limit pcie4 link speed
arm64: dts: qcom: sc8280xp-crd: limit pcie4 link speed
arm64: dts: imx8mp: Fix LDB clocks property
arm64: dts: imx8mp: Fix TC9595 reset GPIO on DH i.MX8M Plus DHCOM SoM
MAINTAINERS: Use a proper mailinglist for NXP i.MX development
ARM: dts: imx7: remove DSI port endpoints
arm64: dts: allwinner: h616: Add Orange Pi Zero 2W to Makefile
ARM: imx_v6_v7_defconfig: Restore CONFIG_BACKLIGHT_CLASS_DEVICE
arm64: tegra: Fix Tegra234 MGBE power-domains
arm64: tegra: Set the correct PHY mode for MGBE
arm64: dts: qcom: sm6115: Fix missing interconnect-names
arm64: dts: qcom: sm8650-mtp: add gpio74 as reserved gpio
arm64: dts: qcom: sm8650-qrd: add gpio74 as reserved gpio
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"Fix potential use-after-frees in rk3288 and sun8i-ce"
* tag 'v6.8-p6' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: rk3288 - Fix use after free in unprepare
crypto: sun8i-ce - Fix use after free in unprepare
|
|
If connection isn't established yet, get_mr() will fail, trigger connection after
get_mr().
Fixes: 584a8279a44a ("RDS: RDMA: return appropriate error on rdma map failures")
Reported-and-tested-by: syzbot+d4faee732755bba9838e@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-03-05 (idpf, ice, i40e, igc, e1000e)
This series contains updates to idpf, ice, i40e, igc and e1000e drivers.
Emil disables local BH on NAPI schedule for proper handling of softirqs
on idpf.
Jake stops reporting of virtchannel RSS option which in unsupported on
ice.
Rand Deeb adds null check to prevent possible null pointer dereference
on ice.
Michal Schmidt moves DPLL mutex initialization to resolve uninitialized
mutex usage for ice.
Jesse fixes incorrect variable usage for calculating Tx stats on ice.
Ivan Vecera corrects logic for firmware equals check on i40e.
Florian Kauer prevents memory corruption for XDP_REDIRECT on igc.
Sasha reverts an incorrect use of FIELD_GET which caused a regression
for Wake on LAN on e1000e.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This flag is only set by one single user: the magical core dumping code
that looks up user pages one by one, and then writes them out using
their kernel addresses (by using a BVEC_ITER).
That actually ends up being a huge problem, because while we do use
copy_mc_to_kernel() for this case and it is able to handle the possible
machine checks involved, nothing else is really ready to handle the
failures caused by the machine check.
In particular, as reported by Tong Tiangen, we don't actually support
fault_in_iov_iter_readable() on a machine check area.
As a result, the usual logic for writing things to a file under a
filesystem lock, which involves doing a copy with page faults disabled
and then if that fails trying to fault pages in without holding the
locks with fault_in_iov_iter_readable() does not work at all.
We could decide to always just make the MC copy "succeed" (and filling
the destination with zeroes), and that would then create a core dump
file that just ignores any machine checks.
But honestly, this single special case has been problematic before, and
means that all the normal iov_iter code ends up slightly more complex
and slower.
See for example commit c9eec08bac96 ("iov_iter: Don't deal with
iter->copy_mc in memcpy_from_iter_mc()") where David Howells
re-organized the code just to avoid having to check the 'copy_mc' flags
inside the inner iov_iter loops.
So considering that we have exactly one user, and that one user is a
non-critical special case that doesn't actually ever trigger in real
life (Tong found this with manual error injection), the sane solution is
to just decide that the onus on handling the machine check lines on that
user instead.
Ergo, do the copy_mc_to_kernel() in the core dump logic itself, copying
the user data to a stable kernel page before writing it out.
Fixes: f1982740f5e7 ("iov_iter: Convert iterate*() to inline funcs")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Tong Tiangen <tongtiangen@huawei.com>
Link: https://lore.kernel.org/r/20240305133336.3804360-1-tongtiangen@huawei.com
Link: https://lore.kernel.org/all/4e80924d-9c85-f13a-722a-6a5d2b1c225a@huawei.com/
Tested-by: David Howells <dhowells@redhat.com>
Reviewed-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jens Axboe <axboe@kernel.dk>
Reported-by: Tong Tiangen <tongtiangen@huawei.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Mike Yu says:
====================
In the XFRM stack, whether a packet is forwarded to the IPv4
or IPv6 stack depends on the family field of the matched SA.
This does not completely work for IPsec packet offload in some
scenario, for example, sending an IPv6 packet that will be
encrypted and encapsulated as an IPv4 packet in HW.
Here are the patches to make IPsec packet offload work on the
mentioned scenario.
====================
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into arm/fixes
RISC-V firmware drivers for v6.9
A single minor fix for an oversized allocation due to sizeof() misuse by
yours truly that came in since I sent my last fixes PR.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
* tag 'riscv-firmware-for-v6.9' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
firmware: microchip: Fix over-requested allocation size
Link: https://lore.kernel.org/r/20240305-vicinity-dumpling-8943ef26f004@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into arm/fixes
A few more Qualcomm Arm64 DeviceTree fixes for v6.8
This reduces the link speed of the PCIe bus with WiFi-card connected on the
Lenovo ThinkPad X13s and the Qualcomm Compute Reference Device, avoid
link errors and initialization issues reported by users.
It also reverts the enablement of MPM on MSM8996, which is reported to
prevent boards on this platform from booting for some users.
* tag 'qcom-arm64-fixes-for-6.8-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux:
Revert "arm64: dts: qcom: msm8996: Hook up MPM"
arm64: dts: qcom: sc8280xp-x13s: limit pcie4 link speed
arm64: dts: qcom: sc8280xp-crd: limit pcie4 link speed
Link: https://lore.kernel.org/r/20240306031208.4218-1-andersson@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
This bug was noticed while re-implementing parts of the kernel
driver in userspace using spidev. The goal was to enable some
of the errata workarounds that Microchip describes in their
errata sheet [1].
Both the errata sheet and the regular datasheet of e.g. the KSZ8795
imply that you need to do this for indirect register accesses:
- write a 16-bit value to a control register pair (this value
consists of the indirect register table, and the offset inside
the table)
- either read or write an 8-bit value from the data storage
register (indicated by REG_IND_BYTE in the kernel)
The current implementation has the order swapped. It can be
proven, by reading back some indirect register with known content
(the EEE register modified in ksz8_handle_global_errata() is one of
these), that this implementation does not work.
Private discussion with Oleksij Rempel of Pengutronix has revealed
that the workaround was apparantly never tested on actual hardware.
[1] https://ww1.microchip.com/downloads/aemDocuments/documents/OTH/ProductDocuments/Errata/KSZ87xx-Errata-DS80000687C.pdf
Signed-off-by: Tobias Jakobi (Compleo) <tobias.jakobi.compleo@gmail.com>
Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Fixes: 7b6e6235b664 ("net: dsa: microchip: ksz8795: handle eee specif erratum")
Link: https://lore.kernel.org/r/20240304154135.161332-1-tobias.jakobi.compleo@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Older versions of GCC really want to know the full definition
of the type involved in rcu_assign_pointer().
struct dpll_pin is defined in a local header, net/core can't
reach it. Move all the netdev <> dpll code into dpll, where
the type is known. Otherwise we'd need multiple function calls
to jump between the compilation units.
This is the same problem the commit under fixes was trying to address,
but with rcu_assign_pointer() not rcu_dereference().
Some of the exports are not needed, networking core can't
be a module, we only need exports for the helpers used by
drivers.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Link: https://lore.kernel.org/all/35a869c8-52e8-177-1d4d-e57578b99b6@linux-m68k.org/
Fixes: 640f41ed33b5 ("dpll: fix build failure due to rcu_dereference_check() on unknown type")
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240305013532.694866-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When running an XDP program that is attached to a cpumap entry, we don't
initialise the xdp_rxq_info data structure being used in the xdp_buff
that backs the XDP program invocation. Tobias noticed that this leads to
random values being returned as the xdp_md->rx_queue_index value for XDP
programs running in a cpumap.
This means we're basically returning the contents of the uninitialised
memory, which is bad. Fix this by zero-initialising the rxq data
structure before running the XDP program.
Fixes: 9216477449f3 ("bpf: cpumap: Add the possibility to attach an eBPF program to cpumap")
Reported-by: Tobias Böhm <tobias@aibor.de>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20240305213132.11955-1-toke@redhat.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Adjust the XDP feature flags for the bond device when no bond slave
devices are attached. After 9b0ed890ac2a ("bonding: do not report
NETDEV_XDP_ACT_XSK_ZEROCOPY"), the empty bond device must report 0
as flags instead of NETDEV_XDP_ACT_MASK.
# ./vmtest.sh -- ./test_progs -t xdp_bond
[...]
[ 3.983311] bond1 (unregistering): (slave veth1_1): Releasing backup interface
[ 3.995434] bond1 (unregistering): Released all slaves
[ 4.022311] bond2: (slave veth2_1): Releasing backup interface
#507/1 xdp_bonding/xdp_bonding_attach:OK
#507/2 xdp_bonding/xdp_bonding_nested:OK
#507/3 xdp_bonding/xdp_bonding_features:OK
#507/4 xdp_bonding/xdp_bonding_roundrobin:OK
#507/5 xdp_bonding/xdp_bonding_activebackup:OK
#507/6 xdp_bonding/xdp_bonding_xor_layer2:OK
#507/7 xdp_bonding/xdp_bonding_xor_layer23:OK
#507/8 xdp_bonding/xdp_bonding_xor_layer34:OK
#507/9 xdp_bonding/xdp_bonding_redirect_multi:OK
#507 xdp_bonding:OK
Summary: 1/9 PASSED, 0 SKIPPED, 0 FAILED
[ 4.185255] bond2 (unregistering): Released all slaves
[...]
Fixes: 9b0ed890ac2a ("bonding: do not report NETDEV_XDP_ACT_XSK_ZEROCOPY")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Message-ID: <20240305090829.17131-2-daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Commit 9b0ed890ac2a ("bonding: do not report NETDEV_XDP_ACT_XSK_ZEROCOPY")
changed the driver from reporting everything as supported before a device
was bonded into having the driver report that no XDP feature is supported
until a real device is bonded as it seems to be more truthful given
eventually real underlying devices decide what XDP features are supported.
The change however did not take into account when all slave devices get
removed from the bond device. In this case after 9b0ed890ac2a, the driver
keeps reporting a feature mask of 0x77, that is, NETDEV_XDP_ACT_MASK &
~NETDEV_XDP_ACT_XSK_ZEROCOPY whereas it should have reported a feature
mask of 0.
Fix it by resetting XDP feature flags in the same way as if no XDP program
is attached to the bond device. This was uncovered by the XDP bond selftest
which let BPF CI fail. After adjusting the starting masks on the latter
to 0 instead of NETDEV_XDP_ACT_MASK the test passes again together with
this fix.
Fixes: 9b0ed890ac2a ("bonding: do not report NETDEV_XDP_ACT_XSK_ZEROCOPY")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Cc: Prashant Batra <prbatra.mail@gmail.com>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Message-ID: <20240305090829.17131-1-daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Eduard Zingerman says:
====================
check bpf_func_state->callback_depth when pruning states
This patch-set fixes bug in states pruning logic hit in mailing list
discussion [0]. The details of the fix are in patch #1.
The main idea for the fix belongs to Yonghong Song,
mine contribution is merely in review and test cases.
There are some changes in verification performance:
File Program Insns (DIFF) States (DIFF)
------------------------- ------------- --------------- --------------
pyperf600_bpf_loop.bpf.o on_event +15 (+0.42%) +0 (+0.00%)
strobemeta_bpf_loop.bpf.o on_event +857 (+37.95%) +60 (+38.96%)
xdp_synproxy_kern.bpf.o syncookie_tc +2892 (+30.39%) +109 (+36.33%)
xdp_synproxy_kern.bpf.o syncookie_xdp +2892 (+30.01%) +109 (+36.09%)
(when tested on a subset of selftests identified by
selftests/bpf/veristat.cfg and Cilium bpf object files from [4])
Changelog:
v2 [2] -> v3:
- fixes for verifier.c commit message as suggested by Yonghong;
- patch-set re-rerouted to 'bpf' tree as suggested in [2];
- patch for test_tcp_custom_syncookie is sent separately to 'bpf-next' [3].
- veristat results updated using 'bpf' tree as baseline and clang 16.
v1 [1] -> v2:
- patch #2 commit message updated to better reflect verifier behavior
with regards to checkpoints tree (suggested by Yonghong);
- veristat results added (suggested by Andrii).
[0] https://lore.kernel.org/bpf/9b251840-7cb8-4d17-bd23-1fc8071d8eef@linux.dev/
[1] https://lore.kernel.org/bpf/20240212143832.28838-1-eddyz87@gmail.com/
[2] https://lore.kernel.org/bpf/20240216150334.31937-1-eddyz87@gmail.com/
[3] https://lore.kernel.org/bpf/20240222150300.14909-1-eddyz87@gmail.com/
[4] https://github.com/anakryiko/cilium
====================
Link: https://lore.kernel.org/r/20240222154121.6991-1-eddyz87@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|