Age | Commit message (Collapse) | Author |
|
In preparation for complex matcher support, add function for
converting definer fname to str, which will be used in following
patches.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-3-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In preparation for complex matcher support, make function
mlx5hws_table_ft_set_next_ft() non-static and expose it in header.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Vlad Dogaru <vdogaru@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1746992290-568936-2-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
These tests:
"SOCK_STREAM ioctl(SIOCOUTQ) 0 unsent bytes"
"SOCK_SEQPACKET ioctl(SIOCOUTQ) 0 unsent bytes"
output: "Unexpected 'SIOCOUTQ' value, expected 0, got 64 (CLIENT)".
They test that the SIOCOUTQ ioctl reports 0 unsent bytes after the data
have been received by the other side. However, sometimes there is a delay
in updating this "unsent bytes" counter, and the test fails even though
the counter properly goes to 0 several milliseconds later.
The delay occurs in the kernel because the used buffer notification
callback virtio_vsock_tx_done(), called upon receipt of the data by the
other side, doesn't update the counter itself. It delegates that to
a kernel thread (via vsock->tx_work). Sometimes that thread is delayed
more than the test expects.
Change the test to poll SIOCOUTQ until it returns 0 or a timeout occurs.
Signed-off-by: Konstantin Shkolnyy <kshk@linux.ibm.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Fixes: 18ee44ce97c1 ("test/vsock: add ioctl unsent bytes test")
Link: https://patch.msgid.link/20250507151456.2577061-1-kshk@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since commit ce6cb8113c84 ("tools: ynl-gen: individually free previous
values on double set"), specifying the "multi-attr" property raises an
error unless the "nested-attributes" property is specified as well:
File "tools/net/ynl/./pyynl/ynl_gen_c.py", line 1147, in _load_nested_sets
child = self.pure_nested_structs.get(nested)
^^^^^^
UnboundLocalError: cannot access local variable 'nested' where it is not associated with a value
This appears to be a bug since there are existing specs which omit
"nested-attributes" on "multi-attr" attributes. Also, according to
Documentation/userspace-api/netlink/specs.rst, multi-attr "is the
recommended way of implementing arrays (no extra nesting)", suggesting
that nesting should even be avoided in favor of multi-attr.
Fix the indentation of the if-block introduced by the commit to avoid
the error.
Fixes: ce6cb8113c84 ("tools: ynl-gen: individually free previous values on double set")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patch.msgid.link/d6b58684b7e5bfb628f7313e6893d0097904e1d1.1746940107.git.lukas@wunner.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix several build errors when CONFIG_MODULES=n, including the following:
../arch/x86/kernel/alternative.c:195:25: error: incomplete definition of type 'struct module'
195 | for (int i = 0; i < mod->its_num_pages; i++) {
Fixes: 872df34d7c51 ("x86/its: Use dynamic thunks for indirect branches")
Cc: stable@vger.kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fixes from Masami Hiramatsu:
- fprobe: Fix RCU warning message in list traversal
fprobe_module_callback() using hlist_for_each_entry_rcu() traverse
the fprobe list but it locks fprobe_mutex() instead of rcu lock
because it is enough. So add lockdep_is_held() to avoid warning.
- tracing: eprobe: Add missing trace_probe_log_clear for eprobe
__trace_eprobe_create() uses trace_probe_log but forgot to clear it
at exit. Add trace_probe_log_clear() calls.
- tracing: probes: Fix possible race in trace_probe_log APIs
trace_probe_log APIs are used in probe event (dynamic_events,
kprobe_events and uprobe_events) creation. Only dynamic_events uses
the dyn_event_ops_mutex mutex to serialize it. This makes kprobe and
uprobe events to lock the same mutex to serialize its creation to
avoid race in trace_probe_log APIs.
* tag 'probes-fixes-v6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: probes: Fix a possible race in trace_probe_log APIs
tracing: add missing trace_probe_log_clear for eprobes
tracing: fprobe: Fix RCU warning message in list traversal
|
|
When bridged ports and standalone ports share a VLAN, e.g. via VLAN
uppers, or untagged traffic with a vlan unaware bridge, the ASIC will
still try to forward traffic to known FDB entries on standalone ports.
But since the port VLAN masks prevent forwarding to bridged ports, this
traffic will be dropped.
This e.g. can be observed in the bridge_vlan_unaware ping tests, where
this breaks pinging with learning on.
Work around this by enabling the simplified EAP mode on switches
supporting it for standalone ports, which causes the ASIC to redirect
traffic of unknown source MAC addresses to the CPU port.
Since standalone ports do not learn, there are no known source MAC
addresses, so effectively this redirects all incoming traffic to the CPU
port.
Fixes: ff39c2d68679 ("net: dsa: b53: Add bridge support")
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://patch.msgid.link/20250508091424.26870-1-jonas.gorski@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Since the shared trace_probe_log variable can be accessed and
modified via probe event create operation of kprobe_events,
uprobe_events, and dynamic_events, it should be protected.
In the dynamic_events, all operations are serialized by
`dyn_event_ops_mutex`. But kprobe_events and uprobe_events
interfaces are not serialized.
To solve this issue, introduces dyn_event_create(), which runs
create() operation under the mutex, for kprobe_events and
uprobe_events. This also uses lockdep to check the mutex is
held when using trace_probe_log* APIs.
Link: https://lore.kernel.org/all/174684868120.551552.3068655787654268804.stgit@devnote2/
Reported-by: Paul Cacheux <paulcacheux@gmail.com>
Closes: https://lore.kernel.org/all/20250510074456.805a16872b591e2971a4d221@kernel.org/
Fixes: ab105a4fb894 ("tracing: Use tracing error_log with probe events")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
|
Raju Rangoju says:
====================
amd-xgbe: add support for AMD Renoir
Add support for a new AMD Ethernet device called "Renoir". It has a new
PCI ID, add this to the current list of supported devices in the
amd-xgbe devices. Also, the BAR1 addresses cannot be used to access the
PCS registers on Renoir platform, use the indirect addressing via SMN
instead.
====================
Link: https://patch.msgid.link/20250509155325.720499-1-Raju.Rangoju@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add support for new pci device id 0x1641 to register
Renoir device with PCIe.
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250509155325.720499-6-Raju.Rangoju@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
A new version of XPCS access routines have been introduced, add the
support to xgbe_pci_probe() to use these routines.
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250509155325.720499-5-Raju.Rangoju@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add the necessary support to enable Renoir ethernet device. Since the
BAR1 address cannot be used to access the XPCS registers on Renoir, use
the smn functions.
Some of the ethernet add-in-cards have dual PHY but share a single MDIO
line (between the ports). In such cases, link inconsistencies are
noticed during the heavy traffic and during reboot stress tests. Using
smn calls helps avoid such race conditions.
Suggested-by: Sudheesh Mavila <sudheesh.mavila@amd.com>
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250509155325.720499-4-Raju.Rangoju@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Reorganize the xgbe_pci_probe() code path to convert if/else statements
to switch case to help add future code. This helps code look cleaner.
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250509155325.720499-3-Raju.Rangoju@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The xgbe_{read/write}_mmd_regs_v* functions have common code which can
be moved to helper functions. Add new helper functions to calculate the
mmd_address for v1/v2 of xpcs access.
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250509155325.720499-2-Raju.Rangoju@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Jakub Kicinski says:
====================
tools: ynl-gen: support sub-types for binary attributes
Binary attributes have sub-type annotations which either indicate
that the binary object should be interpreted as a raw / C array of
a simple type (e.g. u32), or that it's a struct.
Use this information in the C codegen instead of outputting void *
for all binary attrs. It doesn't make a huge difference in the genl
families, but in classic Netlink there is a lot more structs.
v1: https://lore.kernel.org/20250508022839.1256059-1-kuba@kernel.org
====================
Link: https://patch.msgid.link/20250509154213.1747885-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Support using a struct pointer for binary attrs. Len field is maintained
because the structs may grow with newer kernel versions. Or, which matters
more, be shorter if the binary is built against newer uAPI than kernel
against which it's executed. Since we are storing a pointer to a struct
type - always allocate at least the amount of memory needed by the struct
per current uAPI headers (unused mem is zeroed). Technically users should
check the length field but per modern ASAN checks storing a short object
under a pointer seems like a bad idea.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250509154213.1747885-4-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We auto-indent if statements (increase the indent of the subsequent
line by 1), do the same thing for else branches without a block.
There hasn't been any else branches before but we're about to add one.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250509154213.1747885-3-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Sub-type annotation on binary attributes may indicate that the attribute
carries an array of simple types (also referred to as "C array" in docs).
Support rendering them as such in the C user code. For example for u32,
instead of:
struct {
u32 arr;
} _len;
void *arr;
render:
struct {
u32 arr;
} _count;
__u32 *arr;
Note that count is the number of elements while len was the length in bytes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250509154213.1747885-2-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Mina Almasry says:
====================
Device memory TCP TX
The TX path had been dropped from the Device Memory TCP patch series
post RFCv1 [1], to make that series slightly easier to review. This
series rebases the implementation of the TX path on top of the
net_iov/netmem framework agreed upon and merged. The motivation for
the feature is thoroughly described in the docs & cover letter of the
original proposal, so I don't repeat the lengthy descriptions here, but
they are available in [1].
Full outline on usage of the TX path is detailed in the documentation
included with this series.
Test example is available via the kselftest included in the series as well.
The series is relatively small, as the TX path for this feature largely
piggybacks on the existing MSG_ZEROCOPY implementation.
Patch Overview:
---------------
1. Documentation & tests to give high level overview of the feature
being added.
1. Add netmem refcounting needed for the TX path.
2. Devmem TX netlink API.
3. Devmem TX net stack implementation.
4. Make dma-buf unbinding scheduled work to handle TX cases where it gets
freed from contexts where we can't sleep.
5. Add devmem TX documentation.
6. Add scaffolding enabling driver support for netmem_tx. Add helpers, driver
feature flag, and docs to enable drivers to declare netmem_tx support.
7. Guard netmem_tx against being enabled against drivers that don't
support it.
8. Add devmem_tx selftests. Add TX path to ncdevmem and add a test to
devmem.py.
Testing:
--------
Testing is very similar to devmem TCP RX path. The ncdevmem test used
for the RX path is now augemented with client functionality to test TX
path.
* Test Setup:
Kernel: net-next with this RFC and memory provider API cherry-picked
locally.
Hardware: Google Cloud A3 VMs.
NIC: GVE with header split & RSS & flow steering support.
Performance results are not included with this version, unfortunately.
I'm having issues running the dma-buf exporter driver against the
upstream kernel on my test setup. The issues are specific to that
dma-buf exporter and do not affect this patch series. I plan to follow
up this series with perf fixes if the tests point to issues once they're
up and running.
Special thanks to Stan who took a stab at rebasing the TX implementation
on top of the netmem/net_iov framework merged. Parts of his proposal [2]
that are reused as-is are forked off into their own patches to give full
credit.
[1] https://lore.kernel.org/netdev/20240909054318.1809580-1-almasrymina@google.com/
[2] https://lore.kernel.org/netdev/20240913150913.1280238-2-sdf@fomichev.me/T/#m066dd407fbed108828e2c40ae50e3f4376ef57fd
Cc: sdf@fomichev.me
Cc: asml.silence@gmail.com
Cc: dw@davidwei.uk
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Victor Nogueira <victor@mojatatu.com>
Cc: Pedro Tammela <pctammela@mojatatu.com>
Cc: Samiullah Khawaja <skhawaja@google.com>
Cc: Kuniyuki Iwashima <kuniyu@amazon.com>
v14: https://lore.kernel.org/netdev/20250429032645.363766-1-almasrymina@google.com/
v13: https://lore.kernel.org/netdev/20250425204743.617260-1-almasrymina@google.com/
v12: https://lore.kernel.org/netdev/20250423031117.907681-1-almasrymina@google.com/
v11: https://lore.kernel.org/netdev/20250423031117.907681-1-almasrymina@google.com/
v10: https://lore.kernel.org/netdev/20250417231540.2780723-1-almasrymina@google.com/
v9: https://lore.kernel.org/netdev/20250415224756.152002-1-almasrymina@google.com/
v8: https://lore.kernel.org/netdev/20250308214045.1160445-1-almasrymina@google.com/
v7: https://lore.kernel.org/netdev/20250227041209.2031104-1-almasrymina@google.com/
v6: https://lore.kernel.org/netdev/20250222191517.743530-1-almasrymina@google.com/
v5: https://lore.kernel.org/netdev/20250220020914.895431-1-almasrymina@google.com/
v4: https://lore.kernel.org/netdev/20250203223916.1064540-1-almasrymina@google.com/
v3: https://patchwork.kernel.org/project/netdevbpf/list/?series=929401&state=*
RFC v2: https://patchwork.kernel.org/project/netdevbpf/list/?series=920056&state=*
====================
Link: https://patch.msgid.link/20250508004830.4100853-1-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add support for devmem TX in ncdevmem.
This is a combination of the ncdevmem from the devmem TCP series RFCv1
which included the TX path, and work by Stan to include the netlink API
and refactored on top of his generic memory_provider support.
Signed-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250508004830.4100853-10-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We should not enable netmem TX for drivers that don't declare support.
Check for driver netmem TX support during devmem TX binding and fail if
the driver does not have the functionality.
Check for driver support in validate_xmit_skb as well.
Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250508004830.4100853-9-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Use netmem_dma_*() helpers in gve_tx_dqo.c DQO-RDA paths to
enable netmem TX support in that mode.
Declare support for netmem TX in GVE DQO-RDA mode.
Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Harshitha Ramamurthy <hramamurthy@google.com>
Link: https://patch.msgid.link/20250508004830.4100853-8-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Drivers need to make sure not to pass netmem dma-addrs to the
dma-mapping API in order to support netmem TX.
Add helpers and netmem_dma_*() helpers that enables special handling of
netmem dma-addrs that drivers can use.
Document in netmem.rst what drivers need to do to support netmem TX.
Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250508004830.4100853-7-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add documentation outlining the usage and details of the devmem TCP TX
API.
Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250508004830.4100853-6-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Augment dmabuf binding to be able to handle TX. Additional to all the RX
binding, we also create tx_vec needed for the TX path.
Provide API for sendmsg to be able to send dmabufs bound to this device:
- Provide a new dmabuf_tx_cmsg which includes the dmabuf to send from.
- MSG_ZEROCOPY with SCM_DEVMEM_DMABUF cmsg indicates send from dma-buf.
Devmem is uncopyable, so piggyback off the existing MSG_ZEROCOPY
implementation, while disabling instances where MSG_ZEROCOPY falls back
to copying.
We additionally pipe the binding down to the new
zerocopy_fill_skb_from_devmem which fills a TX skb with net_iov netmems
instead of the traditional page netmems.
We also special case skb_frag_dma_map to return the dma-address of these
dmabuf net_iovs instead of attempting to map pages.
The TX path may release the dmabuf in a context where we cannot wait.
This happens when the user unbinds a TX dmabuf while there are still
references to its netmems in the TX path. In that case, the netmems will
be put_netmem'd from a context where we can't unmap the dmabuf, Resolve
this by making __net_devmem_dmabuf_binding_free schedule_work'd.
Based on work by Stanislav Fomichev <sdf@fomichev.me>. A lot of the meat
of the implementation came from devmem TCP RFC v1[1], which included the
TX path, but Stan did all the rebasing on top of netmem/net_iov.
Cc: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Kaiyuan Zhang <kaiyuanz@google.com>
Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250508004830.4100853-5-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add bind-tx netlink call to attach dmabuf for TX; queue is not
required, only ifindex and dmabuf fd for attachment.
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250508004830.4100853-4-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Currently net_iovs support only pp ref counts, and do not support a
page ref equivalent.
This is fine for the RX path as net_iovs are used exclusively with the
pp and only pp refcounting is needed there. The TX path however does not
use pp ref counts, thus, support for get_page/put_page equivalent is
needed for netmem.
Support get_netmem/put_netmem. Check the type of the netmem before
passing it to page or net_iov specific code to obtain a page ref
equivalent.
For dmabuf net_iovs, we obtain a ref on the underlying binding. This
ensures the entire binding doesn't disappear until all the net_iovs have
been put_netmem'ed. We do not need to track the refcount of individual
dmabuf net_iovs as we don't allocate/free them from a pool similar to
what the buddy allocator does for pages.
This code is written to be extensible by other net_iov implementers.
get_netmem/put_netmem will check the type of the netmem and route it to
the correct helper:
pages -> [get|put]_page()
dmabuf net_iovs -> net_devmem_[get|put]_net_iov()
new net_iovs -> new helpers
Signed-off-by: Mina Almasry <almasrymina@google.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250508004830.4100853-3-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Later patches in the series adds TX net_iovs where there is no pp
associated, so we can't rely on niov->pp->mp_ops to tell what is the
type of the net_iov.
Add a type enum to the net_iov which tells us the net_iov type.
Signed-off-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250508004830.4100853-2-almasrymina@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Oleksij Rempel says:
====================
address EEE regressions on KSZ switches since v6.9 (v6.14+)
This patch series addresses a regression in Energy Efficient Ethernet
(EEE) handling for KSZ switches with integrated PHYs, introduced in
kernel v6.9 by commit fe0d4fd9285e ("net: phy: Keep track of EEE
configuration").
The first patch updates the DSA driver to allow phylink to properly
manage PHY EEE configuration. Since integrated PHYs handle LPI
internally and ports without integrated PHYs do not document MAC-level
LPI support, dummy MAC LPI callbacks are provided.
The second patch removes outdated EEE workarounds from the micrel PHY
driver, as they are no longer needed with correct phylink handling.
This series addresses the regression for mainline and kernels starting
from v6.14. It is not easily possible to fully fix older kernels due
to missing infrastructure changes.
Tested on KSZ9893 hardware.
====================
Link: https://patch.msgid.link/20250504081434.424489-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The KSZ9477 PHY driver contained workarounds for broken EEE capability
advertisements by manually masking supported EEE modes and forcibly
disabling EEE if MICREL_NO_EEE was set.
With proper MAC-side EEE handling implemented via phylink, these quirks
are no longer necessary. Remove MICREL_NO_EEE handling and the use of
ksz9477_get_features().
This simplifies the PHY driver and avoids duplicated EEE management logic.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Cc: stable@vger.kernel.org # v6.14+
Link: https://patch.msgid.link/20250504081434.424489-3-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Phylink expects MAC drivers to provide LPI callbacks to properly manage
Energy Efficient Ethernet (EEE) configuration. On KSZ switches with
integrated PHYs, LPI is internally handled by hardware, while ports
without integrated PHYs have no documented MAC-level LPI support.
Provide dummy mac_disable_tx_lpi() and mac_enable_tx_lpi() callbacks to
satisfy phylink requirements. Also, set default EEE capabilities during
phylink initialization where applicable.
Since phylink can now gracefully handle optional EEE configuration,
remove the need for the MICREL_NO_EEE PHY flag.
This change addresses issues caused by incomplete EEE refactoring
introduced in commit fe0d4fd9285e ("net: phy: Keep track of EEE
configuration"). It is not easily possible to fix all older kernels, but
this patch ensures proper behavior on latest kernels and can be
considered for backporting to stable kernels starting from v6.14.
Fixes: fe0d4fd9285e ("net: phy: Keep track of EEE configuration")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Cc: stable@vger.kernel.org # v6.14+
Link: https://patch.msgid.link/20250504081434.424489-2-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
b53 supported switches support configuring ageing time between 1 and
1,048,575 seconds, so add an appropriate setter.
This allows b53 to pass the FDB learning test for both vlan aware and
vlan unaware bridges.
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20250510092211.276541-1-jonas.gorski@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As mlx4 has implemented skb_tx_timestamp() in mlx4_en_xmit(), the
SOFTWARE flag is surely needed when users are trying to get timestamp
information.
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250510093442.79711-1-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Recent devlink change added validation of an integer value
via NLA_POLICY_VALIDATE_FN, for sparse enums. Handle this
in policy dump. We can't extract any info out of the callback,
so report only the type.
Fixes: 429ac6211494 ("devlink: define enum for attr types of dynamic attributes")
Reported-by: syzbot+01eb26848144516e7f0a@syzkaller.appspotmail.com
Link: https://patch.msgid.link/20250509212751.1905149-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/linux
Tony Nguyen says:
====================
Prepare for Intel IPU E2000 (GEN3)
This is the first part in introducing RDMA support for idpf.
----------------------------------------------------------------
Tatyana Nikolova says:
To align with review comments, the patch series introducing RDMA
RoCEv2 support for the Intel Infrastructure Processing Unit (IPU)
E2000 line of products is going to be submitted in three parts:
1. Modify ice to use specific and common IIDC definitions and
pass a core device info to irdma.
2. Add RDMA support to idpf and modify idpf to use specific and
common IIDC definitions and pass a core device info to irdma.
3. Add RDMA RoCEv2 support for the E2000 products, referred to as
GEN3 to irdma.
This first part is a 5 patch series based on the original
"iidc/ice/irdma: Update IDC to support multiple consumers" patch
to allow for multiple CORE PCI drivers, using the auxbus.
Patches:
1) Move header file to new name for clarity and replace ice
specific DSCP define with a kernel equivalent one in irdma
2) Unify naming convention
3) Separate header file into common and driver specific info
4) Replace ice specific DSCP define with a kernel equivalent
one in ice
5) Implement core device info struct and update drivers to use it
----------------------------------------------------------------
v1: https://lore.kernel.org/20250505212037.2092288-1-anthony.l.nguyen@intel.com
IWL reviews:
[v5] https://lore.kernel.org/20250416021549.606-1-tatyana.e.nikolova@intel.com
[v4] https://lore.kernel.org/20250225050428.2166-1-tatyana.e.nikolova@intel.com
[v3] https://lore.kernel.org/20250207194931.1569-1-tatyana.e.nikolova@intel.com
[v2] https://lore.kernel.org/20240824031924.421-1-tatyana.e.nikolova@intel.com
[v1] https://lore.kernel.org/20240724233917.704-1-tatyana.e.nikolova@intel.com
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/linux:
iidc/ice/irdma: Update IDC to support multiple consumers
ice: Replace ice specific DSCP mapping num with a kernel define
iidc/ice/irdma: Break iidc.h into two headers
iidc/ice/irdma: Rename to iidc_* convention
iidc/ice/irdma: Rename IDC header file
====================
Link: https://patch.msgid.link/20250509200712.2911060-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Stefan Wahren says:
====================
net: vertexcom: mse102x: Improve RX handling
This series is the second part of two series for the Vertexcom driver.
It contains some improvements for the RX handling of the Vertexcom MSE102x.
====================
Link: https://patch.msgid.link/20250509120435.43646-1-wahrenst@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The function mse102x_rx_pkt_spi is used in two cases:
* initial polling to re-arm RX interrupt
* level based RX interrupt handler
Both of them doesn't need an open-coded retry mechanism.
In the first case the function can be called again, if the return code
is IRQ_NONE. This keeps the error behavior during netdev open.
In the second case the proper retry would be handled implicit by
the SPI interrupt. So drop the retry code and simplify the receive path.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://patch.msgid.link/20250509120435.43646-7-wahrenst@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The MSE102x doesn't provide any interrupt register, so the only way
to handle the level interrupt is to fetch the whole packet from
the MSE102x internal buffer via SPI. So in cases the interrupt
handler fails to do this, it should return IRQ_NONE. This allows
the core to disable the interrupt in case the issue persists
and prevent an interrupt storm.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Link: https://patch.msgid.link/20250509120435.43646-6-wahrenst@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After removal of the invalid command counter only a relevant debug
message is left, which can be cumbersome. So add a new flag to debugfs,
which indicates whether the driver has ever received a valid CMD.
This helps to differentiate between general and temporary receive
issues.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250509120435.43646-5-wahrenst@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There are several reasons for an invalid command response
by the MSE102x:
* SPI line interferences
* MSE102x is in reset or has no firmware
* MSE102x is busy
* no packet in MSE102x receive buffer
So the counter for invalid command isn't very helpful without
further context. So drop the confusing statistics counter,
but keep the debug messages about "unexpected response" in order
to debug possible hardware issues.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250509120435.43646-4-wahrenst@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The example of the initial DT binding of the Vertexcom MSE 102x suggested
a IRQ_TYPE_EDGE_RISING, which is wrong. So warn everyone to fix their
device tree to level based IRQ.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250509120435.43646-3-wahrenst@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
According to the MSE102x documentation the trigger type is a
high level.
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://patch.msgid.link/20250509120435.43646-2-wahrenst@gmx.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
It has been reported that when under a bridge with stp_state=1, the logs
get spammed with this message:
[ 251.734607] fsl_dpaa2_eth dpni.5 eth0: Couldn't decode source port
Further debugging shows the following info associated with packets:
source_port=-1, switch_id=-1, vid=-1, vbid=1
In other words, they are data plane packets which are supposed to be
decoded by dsa_tag_8021q_find_port_by_vbid(), but the latter (correctly)
refuses to do so, because no switch port is currently in
BR_STATE_LEARNING or BR_STATE_FORWARDING - so the packet is effectively
unexpected.
The error goes away after the port progresses to BR_STATE_LEARNING in 15
seconds (the default forward_time of the bridge), because then,
dsa_tag_8021q_find_port_by_vbid() can correctly associate the data plane
packets with a plausible bridge port in a plausible STP state.
Re-reading IEEE 802.1D-1990, I see the following:
"4.4.2 Learning: (...) The Forwarding Process shall discard received
frames."
IEEE 802.1D-2004 further clarifies:
"DISABLED, BLOCKING, LISTENING, and BROKEN all correspond to the
DISCARDING port state. While those dot1dStpPortStates serve to
distinguish reasons for discarding frames, the operation of the
Forwarding and Learning processes is the same for all of them. (...)
LISTENING represents a port that the spanning tree algorithm has
selected to be part of the active topology (computing a Root Port or
Designated Port role) but is temporarily discarding frames to guard
against loops or incorrect learning."
Well, this is not what the driver does - instead it sets
mac[port].ingress = true.
To get rid of the log spam, prevent unexpected data plane packets to
be received by software by discarding them on ingress in the LISTENING
state.
In terms of blame attribution: the prints only date back to commit
d7f9787a763f ("net: dsa: tag_8021q: add support for imprecise RX based
on the VBID"). However, the settings would permit a LISTENING port to
forward to a FORWARDING port, and the standard suggests that's not OK.
Fixes: 640f763f98c2 ("net: dsa: sja1105: Add support for Spanning Tree Protocol")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250509113816.2221992-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
__netdev_update_features() expects the netdevice to be ops-locked, but
it gets called recursively on the lower level netdevices to sync their
features, and nothing locks those.
This commit fixes that, with the assumption that it shouldn't be possible
for both higher-level and lover-level netdevices to require the instance
lock, because that would lead to lock dependency warnings.
Without this, playing with higher level (e.g. vxlan) netdevices on top
of netdevices with instance locking enabled can run into issues:
WARNING: CPU: 59 PID: 206496 at ./include/net/netdev_lock.h:17 netif_napi_add_weight_locked+0x753/0xa60
[...]
Call Trace:
<TASK>
mlx5e_open_channel+0xc09/0x3740 [mlx5_core]
mlx5e_open_channels+0x1f0/0x770 [mlx5_core]
mlx5e_safe_switch_params+0x1b5/0x2e0 [mlx5_core]
set_feature_lro+0x1c2/0x330 [mlx5_core]
mlx5e_handle_feature+0xc8/0x140 [mlx5_core]
mlx5e_set_features+0x233/0x2e0 [mlx5_core]
__netdev_update_features+0x5be/0x1670
__netdev_update_features+0x71f/0x1670
dev_ethtool+0x21c5/0x4aa0
dev_ioctl+0x438/0xae0
sock_ioctl+0x2ba/0x690
__x64_sys_ioctl+0xa78/0x1700
do_syscall_64+0x6d/0x140
entry_SYSCALL_64_after_hwframe+0x4b/0x53
</TASK>
Fixes: 7e4d784f5810 ("net: hold netdev instance lock during rtnetlink operations")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250509072850.2002821-1-cratiu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Most PHY drivers default to a 2ns delay if internal delay is requested
and no value is specified. Having a default value makes sense, as it
allows a Device Tree to only care about board design (whether there are
delays on the PCB or not), and not whether the delay is added on the MAC
or the PHY side when needed.
Whether the delays are actually applied is controlled by the
DP83867_RGMII_*_CLK_DELAY_EN flags, so the behavior is only changed in
configurations that would previously be rejected with -EINVAL.
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Link: https://patch.msgid.link/e2509b248a11ee29ea408a50c231da4c1fa0863b.1746612711.git.matthias.schiffer@ew.tq-group.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The check that intended to handle "rgmii" PHY mode differently to the
RGMII modes with internal delay never worked as intended:
- added in commit 2a10154abcb7 ("net: phy: dp83867: Add TI dp83867 phy"):
logic error caused the condition to always evaluate to true
- changed in commit a46fa260f6f5 ("net: phy: dp83867: Fix warning check
for setting the internal delay"): now the condition incorrectly
evaluates to false for rgmii-txid
- removed in commit 2b892649254f ("net: phy: dp83867: Set up RGMII TX
delay")
Around the time of the removal, commit c11669a2757e ("net: phy: dp83867:
Rework delay rgmii delay handling") started clearing the delay enable
flags in RGMIICTL. The change attempted to preserve the historical
behavior of not disabling internal delays with "rgmii" PHY mode and also
documented this in a comment, but due to a conflict between "Set up
RGMII TX delay" and "Rework delay rgmii delay handling", the behavior
dp83867_verify_rgmii_cfg() warned about (and that was also described in
a comment in dp83867_config_init()) disappeared in the following merge
of net into net-next in commit b4b12b0d2f02
("Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net").
While is doesn't appear that this breaking change was intentional, it
has been like this since 2019, and the new behavior to disable the delays
with "rgmii" PHY mode is generally desirable - in particular with MAC
drivers that have to fix up the delay mode, resulting in the PHY driver
not even seeing the same mode that was specified in the Device Tree.
Remove the obsolete check and comment.
Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com>
Link: https://patch.msgid.link/8a286207cd11b460bb0dbd27931de3626b9d7575.1746612711.git.matthias.schiffer@ew.tq-group.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There is a situation where after THALT is set high, TGO stays high as
well. Because jiffies are never updated, as we are in a context with
interrupts disabled, we never exit that loop and have a deadlock.
That deadlock was noticed on a sama5d4 device that stayed locked for days.
Use retries instead of jiffies so that the timeout really works and we do
not have a deadlock anymore.
Fixes: e86cd53afc590 ("net/macb: better manage tx errors")
Signed-off-by: Mathieu Othacehe <othacehe@gnu.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250509121935.16282-1-othacehe@gnu.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Document support for the GBETH IP found on the Renesas RZ/V2N (R9A09G056)
SoC. The GBETH controller on the RZ/V2N SoC is functionally identical to
the one found on the RZ/V2H(P) (R9A09G057) SoC, so `renesas,rzv2h-gbeth`
will be used as a fallback compatible.
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://patch.msgid.link/20250507173551.100280-1-prabhakar.mahadev-lad.rj@bp.renesas.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Hangbin Liu says:
====================
selftests: net: configure rp_filter in setup_ns
Some distributions enable rp_filter globally by default, which can interfere
with various test cases. To address this, many tests explicitly disable
rp_filter within their scripts.
To avoid duplication and ensure consistent behavior across tests, this patch
moves the rp_filter configuration into setup_ns, applied immediately after a
new namespace is created. This change ensures that all namespace-based tests
inherit the appropriate rp_filter settings, simplifying individual test
scripts and improving maintainability.
====================
Link: https://patch.msgid.link/20250508081910.84216-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove the rp_filter configuration from MPTCP tests, as it is now handled
by setup_ns.
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20250508081910.84216-7-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|