Age | Commit message (Collapse) | Author |
|
rds_page_remainder_alloc() obtains the current CPU with get_cpu() while
disabling preemption. Then the CPU number is used to access the per-CPU
data structure via per_cpu().
This can be optimized by relying on local_bh_disable() to provide a
stable CPU number/ avoid migration and then using this_cpu_ptr() to
retrieve the data structure.
Cc: Allison Henderson <allison.henderson@oracle.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20250512092736.229935-15-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
rds_page_remainder_alloc() is invoked from a preemptible context or a
tasklet. There is no need to disable interrupts for locking.
Use local_bh_disable() instead of local_irq_save() for locking.
Cc: Allison Henderson <allison.henderson@oracle.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20250512092736.229935-14-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
mptcp_delegated_actions is a per-CPU variable and relies on disabled BH for its
locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT this data
structure requires explicit locking.
Add a local_lock_t to the data structure and use local_lock_nested_bh() for
locking. This change adds only lockdep coverage and does not alter the
functional behaviour for !PREEMPT_RT.
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Mat Martineau <martineau@kernel.org>
Cc: Geliang Tang <geliang@kernel.org>
Cc: mptcp@lists.linux.dev
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20250512092736.229935-13-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
sch_frag_data_storage is a per-CPU variable and relies on disabled BH
for its locking. Without per-CPU locking in local_bh_disable() on
PREEMPT_RT this data structure requires explicit locking.
Add local_lock_t to the struct and use local_lock_nested_bh() for locking.
This change adds only lockdep coverage and does not alter the functional
behaviour for !PREEMPT_RT.
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20250512092736.229935-12-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
mirred_nest_level is a per-CPU variable and relies on disabled BH for its
locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT
this data structure requires explicit locking.
Move mirred_nest_level to struct netdev_xmit as u8, provide wrappers.
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://patch.msgid.link/20250512092736.229935-11-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
ovs_frag_data_storage is a per-CPU variable and relies on disabled BH for its
locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT
this data structure requires explicit locking.
Move ovs_frag_data_storage into the struct ovs_pcpu_storage which already
provides locking for the structure.
Cc: Aaron Conole <aconole@redhat.com>
Cc: Eelco Chaudron <echaudro@redhat.com>
Cc: Ilya Maximets <i.maximets@ovn.org>
Cc: dev@openvswitch.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://patch.msgid.link/20250512092736.229935-10-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
ovs_pcpu_storage is a per-CPU variable and relies on disabled BH for its
locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT
this data structure requires explicit locking.
The data structure can be referenced recursive and there is a recursion
counter to avoid too many recursions.
Add a local_lock_t to the data structure and use
local_lock_nested_bh() for locking. Add an owner of the struct which is
the current task and acquire the lock only if the structure is not owned
by the current task.
Cc: Aaron Conole <aconole@redhat.com>
Cc: Eelco Chaudron <echaudro@redhat.com>
Cc: Ilya Maximets <i.maximets@ovn.org>
Cc: dev@openvswitch.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://patch.msgid.link/20250512092736.229935-9-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
exec_actions_level is a per-CPU integer allocated at compile time.
action_fifos and flow_keys are per-CPU pointer and have their data
allocated at module init time.
There is no gain in splitting it, once the module is allocated, the
structures are allocated.
Merge the three per-CPU variables into ovs_pcpu_storage, adapt callers.
Cc: Aaron Conole <aconole@redhat.com>
Cc: Eelco Chaudron <echaudro@redhat.com>
Cc: Ilya Maximets <i.maximets@ovn.org>
Cc: dev@openvswitch.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://patch.msgid.link/20250512092736.229935-8-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
nat_keepalive_sk_ipv[46] is a per-CPU variable and relies on disabled BH
for its locking. Without per-CPU locking in local_bh_disable() on
PREEMPT_RT this data structure requires explicit locking.
Use sock_bh_locked which has a sock pointer and a local_lock_t. Use
local_lock_nested_bh() for locking. This change adds only lockdep
coverage and does not alter the functional behaviour for !PREEMPT_RT.
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20250512092736.229935-7-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
system_page_pool is a per-CPU variable and relies on disabled BH for its
locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT
this data structure requires explicit locking.
Make a struct with a page_pool member (original system_page_pool) and a
local_lock_t and use local_lock_nested_bh() for locking. This change
adds only lockdep coverage and does not alter the functional behaviour
for !PREEMPT_RT.
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jesper Dangaard Brouer <hawk@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20250512092736.229935-6-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
hmac_storage is a per-CPU variable and relies on disabled BH for its
locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT
this data structure requires explicit locking.
Add a local_lock_t to the data structure and use
local_lock_nested_bh() for locking. This change adds only lockdep
coverage and does not alter the functional behaviour for !PREEMPT_RT.
Cc: David Ahern <dsahern@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20250512092736.229935-5-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The statistics are incremented with raw_cpu_inc() assuming it always
happens with bottom half disabled. Without per-CPU locking in
local_bh_disable() on PREEMPT_RT this is no longer true.
Use this_cpu_inc() on PREEMPT_RT for the increment to not worry about
preemption.
Cc: David Ahern <dsahern@kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20250512092736.229935-4-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
dst_cache::cache is a per-CPU variable and relies on disabled BH for its
locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT
this data structure requires explicit locking.
Add a local_lock_t to the data structure and use
local_lock_nested_bh() for locking. This change adds only lockdep
coverage and does not alter the functional behaviour for !PREEMPT_RT.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20250512092736.229935-3-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
With preemptible softirq and no per-CPU locking in local_bh_disable() on
PREEMPT_RT the consumer can be preempted while a skb is returned.
Avoid the race by disabling the recycle into the cache on PREEMPT_RT.
Cc: Jesper Dangaard Brouer <hawk@kernel.org>
Cc: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20250512092736.229935-2-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Felix Fietkau says:
===================
mt76 fix for 6.15
- disable napi on driver removal to fix warning
- fix multicast rx regression on mt7925
===================
Link: https://patch.msgid.link/3b526d06-b717-4d47-817c-a9f47b796a31@nbd.name/
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Subbaraya Sundeep says:
====================
octeontx2: Improve mailbox tracing
Octeontx2 VF,PF and AF devices communicate using hardware
shared mailbox region where VFs can only to talk to its PFs
and PFs can only talk to AF. AF does the entire resource management
for all PFs and VFs. The shared mbox region is used for synchronous
requests (requests from PF to AF or VF to PF) and async notifications
(notifications from AF to PFs or PF to VFs). Sending a request to AF
from VF involves various stages like
1. VF allocates message in shared region
2. Triggers interrupt to PF
3. PF upon receiving interrupt from VF will copy the message
from VF<->PF region to PF<->AF region
4. Triggers interrupt to AF
5. AF processes it and writes response in PF<->AF region
6. Triggers interrupt to PF
7. PF copies responses from PF<->AF region to VF<->PF region
8. Triggers interrupt to Vf
9. VF reads response in VF<->PF region
Due to various stages involved, Tracepoints are used in mailbox code for
debugging. Existing tracepoints need some improvements so that maximum
information can be inferred from trace logs during an issue.
This patchset tries to enhance existing tracepoints and also adds
a couple of tracepoints.
====================
Link: https://patch.msgid.link/1747136408-30685-1-git-send-email-sbhatta@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Apart from netdev interface Octeontx2 PF does the following:
1. Sends its own requests to AF and receives responses from AF.
2. Receives async messages from AF.
3. Forwards VF requests to AF, sends respective responses from AF to VFs.
4. Sends async messages to VFs.
This patch adds new tracepoint otx2_msg_status to display the status
of PF wrt mailbox handling.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Link: https://patch.msgid.link/1747136408-30685-5-git-send-email-sbhatta@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This patch adds pcifunc which represents PF and VF device to the
tracepoints otx2_msg_alloc, otx2_msg_send, otx2_msg_process so that
it is easier to correlate which device allocated the message, which
device forwarded it and which device processed that message.
Also add message id in otx2_msg_send tracepoint to check which
message is sent at any point of time from a device.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Link: https://patch.msgid.link/1747136408-30685-4-git-send-email-sbhatta@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Mailbox UP messages and CPT messages names are not being
displayed with their names in trace log files. Add those
messages too in otx2_mbox_id2name.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Link: https://patch.msgid.link/1747136408-30685-3-git-send-email-sbhatta@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Use tracepoint instead of dev_dbg since the entire
mailbox code uses tracepoints for debugging.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Link: https://patch.msgid.link/1747136408-30685-2-git-send-email-sbhatta@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Make sure that n_channels is set after allocating the
struct cfg80211_registered_device::int_scan_req member. Seen with
syzkaller:
UBSAN: array-index-out-of-bounds in net/mac80211/scan.c:1208:5
index 0 is out of range for type 'struct ieee80211_channel *[] __counted_by(n_channels)' (aka 'struct ieee80211_channel *[]')
This was missed in the initial conversions because I failed to locate
the allocation likely due to the "sizeof(void *)" not matching the
"channels" array type.
Reported-by: syzbot+4bcdddd48bb6f0be0da1@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/lkml/680fd171.050a0220.2b69d1.045e.GAE@google.com/
Fixes: e3eac9f32ec0 ("wifi: cfg80211: Annotate struct cfg80211_scan_request with __counted_by")
Signed-off-by: Kees Cook <kees@kernel.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://patch.msgid.link/20250509184641.work.542-kees@kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Update the for_each_netdev_in_bond_rcu macro to iterate through network
devices in the bond's network namespace instead of always using
init_net. This change is safe because:
1. **Bond-Slave Namespace Relationship**: A bond device and its slaves
must reside in the same network namespace. The bond device's
namespace is established at creation time and cannot change.
2. **Slave Movement Implications**: Any attempt to move a slave device
to a different namespace automatically removes it from the bond, as
per kernel networking stack rules.
This maintains the invariant that slaves must exist in the same
namespace as their bond.
This change is part of an effort to enable Link Aggregation (LAG) to
work properly inside custom network namespaces. Previously, the macro
would only find slave devices in the initial network namespace,
preventing proper bonding functionality in custom namespaces.
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Link: https://patch.msgid.link/20250513081922.525716-1-mbloch@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Depending on the data offset, skb_to_sgvec_nomark() may use
less scatterlist elements than what was forecasted by the
previous call to skb_cow_data().
It specifically happens when 'skbheadlen(skb) < offset', because
in this case we entirely skip the skb's head, which would have
required its own scatterlist element.
For this reason, it doesn't make sense to check that
skb_to_sgvec_nomark() returns the same value as skb_cow_data(),
but we can rather check for errors only, as it happens in
other parts of the kernel.
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
|
|
When debugging a 'no route to host' error it can be beneficial
to know the address of the unreachable destination.
Print it along the debugging text.
While at it, add a missing parenthesis in a different debugging
message inside ovpn_peer_endpoints_update().
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
|
|
The keepalive worker is cancelled before calling
unregister_netdevice_queue(), therefore it will never
hit a situation where the reg_state can be different
than NETDEV_REGISTERED.
For this reason, checking reg_state is useless and the
condition can be removed.
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
|
|
To increase code coverage, extend the ovpn selftests with the following
cases:
* connect UDP peers using a mix of IPv6 and IPv4 at the transport layer
* run full test with tunnel MTU equal to transport MTU (exercising
IP layer fragmentation)
* ping "LAN IP" served by VPN peer ("LAN behind a client" test case)
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
|
|
ndo_start_xmit is basically expected to always return NETDEV_TX_OK.
However, in case of error, it was currently returning NET_XMIT_DROP,
which is not a valid netdev_tx_t return value, leading to
misinterpretation.
Change ndo_start_xmit to always return NETDEV_TX_OK to signal back
to the caller that the packet was handled (even if dropped).
Effects of this bug can be seen when sending IPv6 packets having
no peer to forward them to:
$ ip netns exec ovpn-server oping -c20 fd00:abcd:220:201::1
PING fd00:abcd:220:201::1 (fd00:abcd:220:201::1) 56 bytes of data.00:abcd:220:201 :1
ping_send failed: No buffer space available
ping_sendto: No buffer space available
ping_send failed: No buffer space available
ping_sendto: No buffer space available
...
Fixes: c2d950c4672a ("ovpn: add basic interface creation/destruction/management routines")
Reported-by: Gert Doering <gert@greenie.muc.de>
Closes: https://github.com/OpenVPN/ovpn-net-next/issues/5
Tested-by: Gert Doering <gert@greenie.muc.de>
Acked-by: Gert Doering <gert@greenie.muc.de>
Link: https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg31591.html
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
|
|
getaddrinfo() may fail with error code different from EAI_FAIL
or EAI_NONAME, however in this case we still try to free the
results object, thus leading to a crash.
Fix this by bailing out on any possible error.
Fixes: 959bc330a439 ("testing/selftests: add test tool and scripts for ovpn module")
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
|
|
When routing a packet to a LAN behind a peer, ovpn needs to
inspect the route entry that brought the packet there in the
first place.
If this packet is truly routable, the route entry provides the
GW to be used when looking up the VPN peer to send the packet to.
However, the route entry is currently dropped before entering
the ovpn xmit function, because the IFF_XMIT_DST_RELEASE priv_flag
is enabled by default.
Clear the IFF_XMIT_DST_RELEASE flag during interface setup to allow
the route entry (skb's dst) to survive and thus be inspected
by the ovpn routing logic.
Fixes: a3aaef8cd173 ("ovpn: implement peer lookup logic")
Reported-by: Gert Doering <gert@greenie.muc.de>
Closes: https://github.com/OpenVPN/ovpn-net-next/issues/2
Tested-by: Gert Doering <gert@greenie.muc.de>
Acked-by: Gert Doering <gert@greenie.muc.de> # as a primary user
Link: https://www.mail-archive.com/openvpn-devel@lists.sourceforge.net/msg31583.html
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
|
|
IPv6 user packets (sent over the tunnel) may be larger than
the outgoing interface MTU after encapsulation.
When this happens ovpn should allow the kernel to fragment
them because they are "locally generated".
To achieve the above, we must set skb->ignore_df = 1
so that ip6_fragment() can be made aware of this decision.
Failing to do so will result in ip6_fragment() dropping
the packet thinking it was "routed".
No change is required in the IPv4 path, because when
calling udp_tunnel_xmit_skb() we already pass the
'df' argument set to 0, therefore the resulting datagram
is allowed to be fragmented if need be.
Fixes: 08857b5ec5d9 ("ovpn: implement basic TX path (UDP)")
Reported-by: Gert Doering <gert@greenie.muc.de>
Closes: https://github.com/OpenVPN/ovpn-net-next/issues/3
Tested-by: Gert Doering <gert@greenie.muc.de>
Acked-by: Gert Doering <gert@greenie.muc.de> # as primary user
Link: https://mail-archive.com/openvpn-devel@lists.sourceforge.net/msg31577.html
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
|
|
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
|
|
Sabrina put quite some effort in reviewing the ovpn module
during its official submission to netdev.
For this reason she obtain extensive knowledge of the module
architecture and implementation.
Make her an official reviewer, so that I can be supported
in reviewing and acking new patches.
Acked-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
|
|
Lee Trager says:
====================
eth: fbnic: Add devlink dev flash support
fbnic supports updating firmware using signed PLDM images. PLDM images are
written into the flash. Flashing does not interrupt the operation of the
device.
V4: https://lore.kernel.org/netdev/20250510002851.3247880-1-lee@trager.us/T/#t
V3: https://lore.kernel.org/lkml/20241111043058.1251632-1-lee@trager.us/T/
V2: https://lore.kernel.org/all/20241022013941.3764567-1-lee@trager.us/
====================
Link: https://patch.msgid.link/20250512190109.2475614-1-lee@trager.us
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add support to update the CMRT and control firmware as well as the UEFI
driver on fbnic using devlink dev flash.
Make sure the shutdown / quiescence paths like suspend take the devlink
lock to prevent them from interrupting the FW flashing process.
Signed-off-by: Lee Trager <lee@trager.us>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250512190109.2475614-6-lee@trager.us
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add three new mailbox messages to support PLDM upgrades:
* FW_START_UPGRADE - Enables driver to request starting a firmware upgrade
by specifying the component to be upgraded and its
size.
* WRITE_CHUNK - Allows firmware to request driver to send a chunk of
data at the specified offset.
* FINISH_UPGRADE - Allows firmware to cancel the upgrade process and
return an error.
Signed-off-by: Lee Trager <lee@trager.us>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250512190109.2475614-5-lee@trager.us
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Extend fbnic mailbox to support multiple concurrent completion messages at
once. This enables fbnic to support running multiple operations at once
which depend on a response from firmware via the mailbox.
Signed-off-by: Lee Trager <lee@trager.us>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250512190109.2475614-4-lee@trager.us
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
fbnic supports applying firmware which may not be rolled back. This is
implemented in firmware however it is useful for the driver to know the
minimum supported firmware version. This will enable the driver validate
new firmware before it is sent to the NIC. If it is too old the driver can
provide a clear message that the version is too old.
Signed-off-by: Lee Trager <lee@trager.us>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250512190109.2475614-3-lee@trager.us
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Not all drivers require send_package_data or send_component_table when
updating firmware. Instead of forcing drivers to implement a stub allow
these functions to go undefined.
Signed-off-by: Lee Trager <lee@trager.us>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20250512190109.2475614-2-lee@trager.us
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
If ntuple filters count is modified followed by
unicast filters count using devlink then the ntuple count
set by user is ignored and all the ntuple filters are
being reallocated. Fix this by storing the ntuple count
set by user. Without this patch, say if user tries
to modify ntuple count as 8 followed by ucast filter count as 4
using devlink commands then ntuple count is being reverted to
default value 16 i.e, not retaining user set value 8.
Fixes: 39c469188b6d ("octeontx2-pf: Add ucast filter count configurability via devlink.")
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/1747054357-5850-1-git-send-email-sbhatta@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Enabling of the temperature sensor was moved from mv88q2xxx_hwmon_probe to
mv88q222x_config_init with the consequence that the sensor is only
usable when the PHY is configured. Enable the sensor in
mv88q2xxx_hwmon_probe as well to fix this.
Signed-off-by: Dimitri Fedrau <dima.fedrau@gmail.com>
Link: https://patch.msgid.link/20250512-marvell-88q2xxx-hwmon-enable-at-probe-v4-1-9256a5c8f603@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Ensure that the hdr_trans_tlv command is included in the broadcast wtbl to
prevent the IPv6 and multicast packet from being dropped by the chip.
Cc: stable@vger.kernel.org
Fixes: cb1353ef3473 ("wifi: mt76: mt7925: integrate *mlo_sta_cmd and *sta_cmd")
Reported-by: Benjamin Xiao <fossben@pm.me>
Tested-by: Niklas Schnelle <niks@kernel.org>
Signed-off-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
Link: https://lore.kernel.org/lkml/EmWnO5b-acRH1TXbGnkx41eJw654vmCR-8_xMBaPMwexCnfkvKCdlU5u19CGbaapJ3KRu-l3B-tSUhf8CCQwL0odjo6Cd5YG5lvNeB-vfdg=@pm.me/
Link: https://patch.msgid.link/20250509010421.403022-1-mingyen.hsieh@mediatek.com
Signed-off-by: Felix Fietkau <nbd@nbd.name>
|
|
A warning on driver removal started occurring after commit 9dd05df8403b
("net: warn if NAPI instance wasn't shut down"). Disable tx napi before
deleting it in mt76_dma_cleanup().
WARNING: CPU: 4 PID: 18828 at net/core/dev.c:7288 __netif_napi_del_locked+0xf0/0x100
CPU: 4 UID: 0 PID: 18828 Comm: modprobe Not tainted 6.15.0-rc4 #4 PREEMPT(lazy)
Hardware name: ASUS System Product Name/PRIME X670E-PRO WIFI, BIOS 3035 09/05/2024
RIP: 0010:__netif_napi_del_locked+0xf0/0x100
Call Trace:
<TASK>
mt76_dma_cleanup+0x54/0x2f0 [mt76]
mt7921_pci_remove+0xd5/0x190 [mt7921e]
pci_device_remove+0x47/0xc0
device_release_driver_internal+0x19e/0x200
driver_detach+0x48/0x90
bus_remove_driver+0x6d/0xf0
pci_unregister_driver+0x2e/0xb0
__do_sys_delete_module.isra.0+0x197/0x2e0
do_syscall_64+0x7b/0x160
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Tested with mt7921e but the same pattern can be actually applied to other
mt76 drivers calling mt76_dma_cleanup() during removal. Tx napi is enabled
in their *_dma_init() functions and only toggled off and on again inside
their suspend/resume/reset paths. So it should be okay to disable tx
napi in such a generic way.
Found by Linux Verification Center (linuxtesting.org).
Fixes: 2ac515a5d74f ("mt76: mt76x02: use napi polling for tx cleanup")
Cc: stable@vger.kernel.org
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Tested-by: Ming Yen Hsieh <mingyen.hsieh@mediatek.com>
Link: https://patch.msgid.link/20250506115540.19045-1-pchelkin@ispras.ru
Signed-off-by: Felix Fietkau <nbd@nbd.name>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:
- Add proper pahole version dependency to CONFIG_GENDWARFKSYMS to avoid
module loading errors
- Fix UAPI header tests for the OpenRISC architecture
- Add dependency on the libdw package in Debian and RPM packages
- Disable -Wdefault-const-init-unsafe warnings on Clang
- Make "make clean ARCH=um" also clean the arch/x86/ directory
- Revert the use of -fmacro-prefix-map=, which causes issues with
debugger usability
* tag 'kbuild-fixes-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: fix typos "module.builtin" to "modules.builtin"
Revert "kbuild, rust: use -fremap-path-prefix to make paths relative"
Revert "kbuild: make all file references relative to source root"
kbuild: fix dependency on sorttable
init: remove unused CONFIG_CC_CAN_LINK_STATIC
um: let 'make clean' properly clean underlying SUBARCH as well
kbuild: Disable -Wdefault-const-init-unsafe
kbuild: rpm-pkg: Add (elfutils-devel or libdw-devel) to BuildRequires
kbuild: deb-pkg: Add libdw-dev:native to Build-Depends-Arch
usr/include: openrisc: don't HDRTEST bpf_perf_event.h
kbuild: Require pahole <v1.28 or >v1.29 with GENDWARFKSYMS on X86
|
|
Michael Kelley says:
====================
hv_netvsc: Fix error "nvsp_rndis_pkt_complete error status: 2"
Starting with commit dca5161f9bd0 in the 6.3 kernel, the Linux driver
for Hyper-V synthetic networking (netvsc) occasionally reports
"nvsp_rndis_pkt_complete error status: 2".[1] This error indicates
that Hyper-V has rejected a network packet transmit request from the
guest, and the outgoing network packet is dropped. Higher level
network protocols presumably recover and resend the packet so there is
no functional error, but performance is slightly impacted. Commit
dca5161f9bd0 is not the cause of the error -- it only added reporting
of an error that was already happening without any notice. The error
has presumably been present since the netvsc driver was originally
introduced into Linux.
This patch set fixes the root cause of the problem, which is that the
netvsc driver in Linux may send an incorrectly formatted VMBus message
to Hyper-V when transmitting the network packet. The incorrect
formatting occurs when the rndis header of the VMBus message crosses a
page boundary due to how the Linux skb head memory is aligned. In such
a case, two PFNs are required to describe the location of the rndis
header, even though they are contiguous in guest physical address
(GPA) space. Hyper-V requires that two PFNs be in a single "GPA range"
data struture, but current netvsc code puts each PFN in its own GPA
range, which Hyper-V rejects as an error in the case of the rndis
header.
The incorrect formatting occurs only for larger packets that netvsc
must transmit via a VMBus "GPA Direct" message. There's no problem
when netvsc transmits a smaller packet by copying it into a pre-
allocated send buffer slot because the pre-allocated slots don't have
page crossing issues.
After commit 14ad6ed30a10 in the 6.14 kernel, the error occurs much
more frequently in VMs with 16 or more vCPUs. It may occur every few
seconds, or even more frequently, in a ssh session that outputs a lot
of text. Commit 14ad6ed30a10 subtly changes how skb head memory is
allocated, making it much more likely that the rndis header will cross
a page boundary when the vCPU count is 16 or more. The changes in
commit 14ad6ed30a10 are perfectly valid -- they just had the side
effect of making the netvsc bug more prominent.
One fix is to check for adjacent PFNs in vmbus_sendpacket_pagebuffer()
and just combine them into a single GPA range. Such a fix is very
contained. But conceptually it is fixing the problem at the wrong
level. So this patch set takes the broader approach of maintaining
the already known grouping of contiguous PFNs at a higher level in
the netvsc driver code, and propagating that grouping down to the
creation of the VMBus message to send to Hyper-V. Maintaining the
grouping fixes this problem, and has the added benefit of allowing
netvsc_dma_map() to make fewer calls to dma_map_single() to do bounce
buffering in CoCo VMs.
Patch 1 is a preparatory change to allow vmbus_sendpacket_mpb_desc()
to specify multiple GPA ranges. In current code
vmbus_sendpacket_mpb_desc() is used only by the storvsc synthetic SCSI
driver, and it always creates a single GPA range.
Patch 2 updates the netvsc driver to use vmbus_sendpacket_mpb_desc()
instead of vmbus_sendpacket_pagebuffer(). Because the higher levels of
netvsc still don't group contiguous PFNs, this patch is functionally
neutral. The VMBus message to Hyper-V still has many GPA ranges, each
with a single PFN. But it lays the groundwork for the next patch.
Patch 3 changes the higher levels of netvsc to preserve the already
known grouping of contiguous PFNs. When the contiguous groupings are
passed to vmbus_sendpacket_mpb_desc(), GPA ranges containing multiple
PFNs are produced, as expected by Hyper-V. This is point at which the
core problem is fixed.
Patches 4 and 5 remove code that is no longer necessary after the
previous patches.
These changes provide a net reduction of about 65 lines of code, which
is an added benefit.
These changes have been tested in normal VMs, in SEV-SNP and TDX CoCo
VMs, and in Dv6-series VMs where the netvsp implementation is in the
OpenHCL paravisor instead of the Hyper-V host.
These changes are built against kernel version 6.15-rc6.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=217503
====================
Link: https://patch.msgid.link/20250513000604.1396-1-mhklinux@outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With the netvsc driver changed to use vmbus_sendpacket_mpb_desc()
instead of vmbus_sendpacket_pagebuffer(), the latter has no remaining
callers. Remove it.
Cc: <stable@vger.kernel.org> # 6.1.x
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/20250513000604.1396-6-mhklinux@outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
init_page_array() now always creates a single page buffer array entry
for the rndis message, even if the rndis message crosses a page
boundary. As such, the number of page buffer array entries used for
the rndis message must no longer be tracked -- it is always just 1.
Remove the rmsg_pgcnt field and use "1" where the value is needed.
Cc: <stable@vger.kernel.org> # 6.1.x
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/20250513000604.1396-5-mhklinux@outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Starting with commit dca5161f9bd0 ("hv_netvsc: Check status in
SEND_RNDIS_PKT completion message") in the 6.3 kernel, the Linux
driver for Hyper-V synthetic networking (netvsc) occasionally reports
"nvsp_rndis_pkt_complete error status: 2".[1] This error indicates
that Hyper-V has rejected a network packet transmit request from the
guest, and the outgoing network packet is dropped. Higher level
network protocols presumably recover and resend the packet so there is
no functional error, but performance is slightly impacted. Commit
dca5161f9bd0 is not the cause of the error -- it only added reporting
of an error that was already happening without any notice. The error
has presumably been present since the netvsc driver was originally
introduced into Linux.
The root cause of the problem is that the netvsc driver in Linux may
send an incorrectly formatted VMBus message to Hyper-V when
transmitting the network packet. The incorrect formatting occurs when
the rndis header of the VMBus message crosses a page boundary due to
how the Linux skb head memory is aligned. In such a case, two PFNs are
required to describe the location of the rndis header, even though
they are contiguous in guest physical address (GPA) space. Hyper-V
requires that two rndis header PFNs be in a single "GPA range" data
struture, but current netvsc code puts each PFN in its own GPA range,
which Hyper-V rejects as an error.
The incorrect formatting occurs only for larger packets that netvsc
must transmit via a VMBus "GPA Direct" message. There's no problem
when netvsc transmits a smaller packet by copying it into a pre-
allocated send buffer slot because the pre-allocated slots don't have
page crossing issues.
After commit 14ad6ed30a10 ("net: allow small head cache usage with
large MAX_SKB_FRAGS values") in the 6.14-rc4 kernel, the error occurs
much more frequently in VMs with 16 or more vCPUs. It may occur every
few seconds, or even more frequently, in an ssh session that outputs a
lot of text. Commit 14ad6ed30a10 subtly changes how skb head memory is
allocated, making it much more likely that the rndis header will cross
a page boundary when the vCPU count is 16 or more. The changes in
commit 14ad6ed30a10 are perfectly valid -- they just had the side
effect of making the netvsc bug more prominent.
Current code in init_page_array() creates a separate page buffer array
entry for each PFN required to identify the data to be transmitted.
Contiguous PFNs get separate entries in the page buffer array, and any
information about contiguity is lost.
Fix the core issue by having init_page_array() construct the page
buffer array to represent contiguous ranges rather than individual
pages. When these ranges are subsequently passed to
netvsc_build_mpb_array(), it can build GPA ranges that contain
multiple PFNs, as required to avoid the error "nvsp_rndis_pkt_complete
error status: 2". If instead the network packet is sent by copying
into a pre-allocated send buffer slot, the copy proceeds using the
contiguous ranges rather than individual pages, but the result of the
copying is the same. Also fix rndis_filter_send_request() to construct
a contiguous range, since it has its own page buffer array.
This change has a side benefit in CoCo VMs in that netvsc_dma_map()
calls dma_map_single() on each contiguous range instead of on each
page. This results in fewer calls to dma_map_single() but on larger
chunks of memory, which should reduce contention on the swiotlb.
Since the page buffer array now contains one entry for each contiguous
range instead of for each individual page, the number of entries in
the array can be reduced, saving 208 bytes of stack space in
netvsc_xmit() when MAX_SKG_FRAGS has the default value of 17.
[1] https://bugzilla.kernel.org/show_bug.cgi?id=217503
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217503
Cc: <stable@vger.kernel.org> # 6.1.x
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/20250513000604.1396-4-mhklinux@outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
netvsc currently uses vmbus_sendpacket_pagebuffer() to send VMBus
messages. This function creates a series of GPA ranges, each of which
contains a single PFN. However, if the rndis header in the VMBus
message crosses a page boundary, the netvsc protocol with the host
requires that both PFNs for the rndis header must be in a single "GPA
range" data structure, which isn't possible with
vmbus_sendpacket_pagebuffer(). As the first step in fixing this, add a
new function netvsc_build_mpb_array() to build a VMBus message with
multiple GPA ranges, each of which may contain multiple PFNs. Use
vmbus_sendpacket_mpb_desc() to send this VMBus message to the host.
There's no functional change since higher levels of netvsc don't
maintain or propagate knowledge of contiguous PFNs. Based on its
input, netvsc_build_mpb_array() still produces a separate GPA range
for each PFN and the behavior is the same as with
vmbus_sendpacket_pagebuffer(). But the groundwork is laid for a
subsequent patch to provide the necessary grouping.
Cc: <stable@vger.kernel.org> # 6.1.x
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/20250513000604.1396-3-mhklinux@outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
vmbus_sendpacket_mpb_desc() is currently used only by the storvsc driver
and is hardcoded to create a single GPA range. To allow it to also be
used by the netvsc driver to create multiple GPA ranges, no longer
hardcode as having a single GPA range. Allow the calling driver to
specify the rangecount in the supplied descriptor.
Update the storvsc driver to reflect this new approach.
Cc: <stable@vger.kernel.org> # 6.1.x
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Link: https://patch.msgid.link/20250513000604.1396-2-mhklinux@outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
cpsw->slaves[slave_no].phy should be equal to netdev->phydev, because it
is assigned from phy_attach_direct(). The latter is indirectly called
from the two identically named cpsw_slave_open() functions, one in
cpsw.c and another in cpsw_new.c.
Thus, the driver should not need custom logic to find the PHY, the core
can find it, and phy_do_ioctl_running() achieves exactly that.
However, that is only the case for cpsw_new and for the cpsw driver in
dual EMAC mode. This is explained in more detail in the previous commit.
Thus, allow the simpler core logic to execute for cpsw_new, and move
cpsw_ndo_ioctl() to cpsw.c.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250512114422.4176010-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|