Age | Commit message (Collapse) | Author |
|
This commit introduces a new vmtest.sh runner for vsock.
It uses virtme-ng/qemu to run tests in a VM. The tests validate G2H,
H2G, and loopback. The testing tools from tools/testing/vsock/ are
reused. Currently, only vsock_test is used.
VMCI and hyperv support is included in the config file to be built with
the -b option, though not used in the tests.
Only tested on x86.
To run:
$ make -C tools/testing/selftests TARGETS=vsock
$ tools/testing/selftests/vsock/vmtest.sh
or
$ make -C tools/testing/selftests TARGETS=vsock run_tests
Example runs (after make -C tools/testing/selftests TARGETS=vsock):
$ ./tools/testing/selftests/vsock/vmtest.sh
1..3
ok 0 vm_server_host_client
ok 1 vm_client_host_server
ok 2 vm_loopback
SUMMARY: PASS=3 SKIP=0 FAIL=0
Log: /tmp/vsock_vmtest_m7DI.log
$ ./tools/testing/selftests/vsock/vmtest.sh vm_loopback
1..1
ok 0 vm_loopback
SUMMARY: PASS=1 SKIP=0 FAIL=0
Log: /tmp/vsock_vmtest_a1IO.log
$ mkdir -p ~/scratch
$ make -C tools/testing/selftests install TARGETS=vsock INSTALL_PATH=~/scratch
[... omitted ...]
$ cd ~/scratch
$ ./run_kselftest.sh
TAP version 13
1..1
# timeout set to 300
# selftests: vsock: vmtest.sh
# 1..3
# ok 0 vm_server_host_client
# ok 1 vm_client_host_server
# ok 2 vm_loopback
# SUMMARY: PASS=3 SKIP=0 FAIL=0
# Log: /tmp/vsock_vmtest_svEl.log
ok 1 selftests: vsock: vmtest.sh
Future work can include vsock_diag_test.
Because vsock requires a VM to test anything other than loopback, this
patch adds vmtest.sh as a kselftest itself. This is different than other
systems that have a "vmtest.sh", where it is used as a utility script to
spin up a VM to run the selftests as a guest (but isn't hooked into
kselftest).
Signed-off-by: Bobby Eshleman <bobbyeshleman@gmail.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20250609-vsock-vmtest-v10-1-7f37198e1cd4@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The len parameter is considered optional so it can be NULL so it cannot
be used for skipping to next entry of EIR_SERVICE_DATA.
Fixes: 8f9ae5b3ae80 ("Bluetooth: eir: Add helpers for managing service data")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Invert the FINAL_PUT bit so that test_bit_acquire and clear_bit_unlock
can be used instead of smp_mb.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
igc already supports enabling MAC Merge for FPE. This patch adds
support for preemptible queues in mqprio.
Tested preemption with mqprio by:
1. Enable FPE:
ethtool --set-mm enp1s0 pmac-enabled on tx-enabled on verify-enabled on
2. Enable preemptible queue in mqprio:
mqprio num_tc 4 map 0 1 2 3 0 0 0 0 0 0 0 0 0 0 0 0 \
queues 1@0 1@1 1@2 1@3 \
fp P P P E
Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Changes:
1. Introduce tx_enabled flag to control preemptible queue. tx_enabled
is set via mmsv module based on multiple factors, including link
up/down status, to determine if FPE is active or inactive.
2. Add priority field to TXDCTL for express queue to improve data
fetch performance.
3. Block preemptible queue setup in taprio unless reverse-tsn-txq-prio
private flag is set. Encourages adoption of standard queue priority
scheme for new features.
4. Hardware-padded frames from preemptible queues result in incorrect
mCRC values, as padding bytes are excluded from the computation. Pad
frames to at least 60 bytes using skb_padto() before transmission to
ensure the hardware includes padding in the mCRC calculation.
Tested preemption with taprio by:
1. Enable FPE:
ethtool --set-mm enp1s0 pmac-enabled on tx-enabled on verify-enabled on
2. Enable private flag to reverse TX queue priority:
ethtool --set-priv-flags enp1s0 reverse-txq-prio on
3. Enable preemptible queue in taprio:
taprio num_tc 4 map 0 1 2 3 0 0 0 0 0 0 0 0 0 0 0 0 \
queues 1@0 1@1 1@2 1@3 \
fp P P P E
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Co-developed-by: Chwee-Lin Choong <chwee.lin.choong@intel.com>
Signed-off-by: Chwee-Lin Choong <chwee.lin.choong@intel.com>
Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
By default, igc assigns TX hw queue 0 the highest priority and queue 3
the lowest. This is opposite of most NICs, where TX hw queue 3 has the
highest priority and queue 0 the lowest.
mqprio in igc already uses TX arbitration unconditionally to reverse TX
queue priority when mqprio is enabled. The TX arbitration logic does not
require a private flag, because mqprio was added recently and no known
users depend on the default queue ordering, which differs from the typical
convention.
taprio does not use TX arbitration, so it inherits the default igc TX
queue priority order. This causes tc command inconsistencies when
configuring frame preemption with taprio compared to mqprio in igc.
Other tc command inconsistencies and configuration issues already exist
when using taprio on igc compared to other network controllers. These
issues are described in a later section.
To harmonize TX queue priority behavior between taprio and mqprio, and
to fix these issues without breaking long-standing taprio use cases,
this patch adds a new private flag, called reverse-tsn-txq-prio, to
reverse the TX queue priority. It makes queue 3 the highest and queue 0
the lowest, reusing the TX arbitration logic already used by mqprio.
Users must set the private flag when enabling frame preemption with
taprio to follow the standard convention. Doing so promotes adoption of
the correct priority model for new features while preserving
compatibility with legacy configurations.
This new private flag addresses:
1. Non-standard socket -> tc -> TX hw queue mapping for taprio in igc
Without the private flag:
- taprio maps (socket -> tc -> TX hardware queue) differently on igc
compared to other network controllers
- On igc, mqprio maps tc differently from taprio, since mqprio already
uses TX arbitration
The following examples compare taprio configuration on igc and other
network controllers:
a) On other NICs (TX hw queue 3 is highest priority):
taprio num_tc 4 map 0 1 2 3 .... \
queues 1@0 1@1 1@2 1@3
Mapping translates to:
socket 0 -> tc 0 -> queue 0
socket 3 -> tc 3 -> queue 3
This is the normal mapping that respects the standard convention:
higher socket number -> higher tc -> higher priority TX hw queue
b) On igc (TX hw queue 0 is highest priority by default):
taprio num_tc 4 map 3 2 1 0 .... \
queues 1@0 1@1 1@2 1@3
Mapping translates to:
socket 0 -> tc 3 -> queue 3
socket 3 -> tc 0 -> queue 0
This igc tc mapping example is based on Intel's TSN validation test
case, where a higher socket priority maps to a higher priority queue.
It respects the mapping:
higher socket number -> higher priority TX hw queue
but breaks the expected ordering:
higher tc -> higher priority TX hw queue
as defined in [Ref1]. This custom mapping complicates common taprio
setup across NICs.
2. Non-standard frame preemption mapping for taprio in igc
Without the private flag:
- Compared to other network controllers, taprio on igc must flip the
expected fp sequence, since express traffic is expected to map to the
highest priority queue and preemptible traffic to lower ones
- On igc, frame preemption configuration for mqprio differs from taprio,
since mqprio already uses TX arbitration
The following examples compare taprio frame preemption configuration on
igc and other network controllers:
a) On other NICs (TX hw queue 3 is highest priority):
taprio num_tc 4 map ..... \
queues 1@0 1@1 1@2 1@3 \
fp P P P E
Mapping translates to:
tc0, tc1, tc2 -> preemptible -> queue 0, 1, 2
tc3 -> express -> queue 3
This is the normal mapping that respects the standard convention:
higher tc -> express traffic -> higher priority TX hw queue
lower tc -> preemptible traffic -> lower priority TX hw queue
b) On igc (TX hw queue 0 is highest priority by default):
taprio num_tc 4 map ...... \
queues 1@0 1@1 1@2 1@3 \
fp E P P P
Mapping translates to:
tc0 -> express -> queue 0
tc1, tc2, tc3 -> preemptible -> queue 1, 2, 3
This inversion respects the mapping of:
express traffic -> higher priority TX hw queue
but breaks the expected ordering:
higher tc -> express traffic
as defined in [Ref1] where higher tc indicates higher priority. In
this case, the lower tc0 is assigned to express traffic. This custom
mapping further complicates common preemption setup across NICs.
Tests were performed on taprio with the following combinations, where
two apps send traffic simultaneously on different queues:
Private Flag Traffic Sent By Traffic Sent By
----------------------------------------------------------------
enabled iperf3 (queue 3) iperf3 (queue 0)
disabled iperf3 (queue 0) iperf3 (queue 3)
enabled iperf3 (queue 3) real-time app (queue 0)
disabled iperf3 (queue 0) real-time app (queue 3)
enabled real-time app (queue 3) iperf3 (queue 0)
disabled real-time app (queue 0) iperf3 (queue 3)
enabled real-time app (queue 3) real-time app (queue 0)
disabled real-time app (queue 0) real-time app (queue 3)
Private flag is controlled with:
ethtool --set-priv-flags enp1s0 reverse-tsn-txq-prio <on|off>
[Ref1]
IEEE 802.1Q clause 8.6.8 Transmission selection:
"For a given Port and traffic class, frames are selected from the
corresponding queue for transmission if and only if:
...
b) For each queue corresponding to a numerically higher value of traffic
class supported by the Port, the operation of the transmission selection
algorithm supported by that queue determines that there is no frame
available for transmission."
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Previously, TX arbitration prioritized queues based on the TC they were
mapped to. A queue mapped to TC 3 had higher priority than one mapped to
TC 0.
To improve code reuse for upcoming patches and align with typical NIC
behavior, this patch updates the logic to prioritize higher queue numbers
when mqprio is used. As a result, queue 0 becomes the lowest priority and
queue 3 becomes the highest.
This patch also introduces igc_tsn_is_tc_to_queue_priority_ordered() to
preserve the original TC-based priority rule and reject configurations
where a higher TC maps to a lower queue offset.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Refactor TXDCTL macro handling to use FIELD_PREP and GENMASK macros.
This prepares the code for adding a new TXDCTL priority field in an
upcoming patch.
Verified that the macro values remain unchanged before and after
refactoring.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Rename macros to use the DCTL prefix for consistency with existing
macros that reference the same register. This prepares for an upcoming
patch that adds new fields to TXDCTL.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Move and consolidate TXDCTL and RXDCTL macros in preparation for
upcoming TXDCTL changes. This improves organization and readability.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Faizal Rahim <faizal.abdul.rahim@linux.intel.com>
Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
When using publicly available tools like 'mdio-tools' to read/write data
from/to network interface and its PHY via C45 (clause 45) mdiobus,
there is no verification of parameters passed to the ioctl and
it accepts any mdio address.
Currently there is support for 32 addresses in kernel via PHY_MAX_ADDR define,
but it is possible to pass higher value than that via ioctl.
While read/write operation should generally fail in this case,
mdiobus provides stats array, where wrong address may allow out-of-bounds
read/write.
Fix that by adding address verification before C45 read/write operation.
While this excludes this access from any statistics, it improves security of
read/write operation.
Fixes: 4e4aafcddbbf ("net: mdio: Add dedicated C45 API to MDIO bus drivers")
Signed-off-by: Jakub Raczynski <j.raczynski@samsung.com>
Reported-by: Wenjing Shan <wenjing.shan@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When using publicly available tools like 'mdio-tools' to read/write data
from/to network interface and its PHY via mdiobus, there is no verification of
parameters passed to the ioctl and it accepts any mdio address.
Currently there is support for 32 addresses in kernel via PHY_MAX_ADDR define,
but it is possible to pass higher value than that via ioctl.
While read/write operation should generally fail in this case,
mdiobus provides stats array, where wrong address may allow out-of-bounds
read/write.
Fix that by adding address verification before read/write operation.
While this excludes this access from any statistics, it improves security of
read/write operation.
Fixes: 080bb352fad00 ("net: phy: Maintain MDIO device and bus statistics")
Signed-off-by: Jakub Raczynski <j.raczynski@samsung.com>
Reported-by: Wenjing Shan <wenjing.shan@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The nl80211_parse_connkeys() function currently uses kfree() to release
the 'result' structure in error handling paths. However, if an error
occurs due to result->def being less than 0, the 'result' structure may
contain sensitive information.
To prevent potential leakage of sensitive data, replace kfree() with
kfree_sensitive() when freeing 'result'. This change aligns with the
approach used in its caller, nl80211_join_ibss(), enhancing the overall
security of the wireless subsystem.
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Link: https://patch.msgid.link/20250523110156.4017111-1-zilin@seu.edu.cn
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ath/ath
Jeff Johnson says:
==================
ath.git updates for v6.16-rc2
Fix a handful of both build and stability issues across multiple drivers.
==================
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The changes I made in
wifi: iwlwifi: don't warn if the NIC is gone in resume
conflicted with a major rework done in this area.
The merge de-facto removed my changes.
Re-add them.
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Link: https://patch.msgid.link/20250603091754.70182-1-emmanuel.grumbach@intel.com
Fixes: 06c4b2036818 ("Merge tag 'iwlwifi-next-2025-05-15' of https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-next")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
This reverts commit 4fcfcbe45734 ("wifi: mwifiex: Fix HT40 bandwidth
issue.")
That commit introduces a regression, when HT40 mode is enabled,
received packets are lost, this was experience with W8997 with both
SDIO-UART and SDIO-SDIO variants. From an initial investigation the
issue solves on its own after some time, but it's not clear what is
the reason. Given that this was just a performance optimization, let's
revert it till we have a better understanding of the issue and a proper
fix.
Cc: Jeff Chen <jeff.chen_1@nxp.com>
Cc: stable@vger.kernel.org
Fixes: 4fcfcbe45734 ("wifi: mwifiex: Fix HT40 bandwidth issue.")
Closes: https://lore.kernel.org/all/20250603203337.GA109929@francesco-nb/
Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Link: https://patch.msgid.link/20250605130302.55555-1-francesco@dolcini.it
[fix commit reference format]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
According to 802.1AE standard, when ES and SC flags in TCI are zero,
used SCI should be the current active SC_RX. Current code uses the
header MAC address. Without this patch, when ES flag is 0 (using a
bridge or switch), header MAC will not fit the SCI and MACSec frames
will be discarted.
In order to test this issue, MACsec link should be stablished between
two interfaces, setting SC and ES flags to zero and a port identifier
different than one. For example, using ip macsec tools:
ip link add link $ETH0 macsec0 type macsec port 11 send_sci off
end_station off
ip macsec add macsec0 tx sa 0 pn 2 on key 01 $ETH1_KEY
ip macsec add macsec0 rx port 11 address $ETH1_MAC
ip macsec add macsec0 rx port 11 address $ETH1_MAC sa 0 pn 2 on key 02
ip link set dev macsec0 up
ip link add link $ETH1 macsec1 type macsec port 11 send_sci off
end_station off
ip macsec add macsec1 tx sa 0 pn 2 on key 01 $ETH0_KEY
ip macsec add macsec1 rx port 11 address $ETH0_MAC
ip macsec add macsec1 rx port 11 address $ETH0_MAC sa 0 pn 2 on key 02
ip link set dev macsec1 up
Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver")
Co-developed-by: Andreu Montiel <Andreu.Montiel@technica-engineering.de>
Signed-off-by: Andreu Montiel <Andreu.Montiel@technica-engineering.de>
Signed-off-by: Carlos Fernandez <carlos.fernandez@technica-engineering.de>
Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix RX_DONE_INT_MASK definition in order to enable RX queues 16-31.
Fixes: f252493e18353 ("net: airoha: Enable multiple IRQ lines support in airoha_eth driver.")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250609-aioha-fix-rx-queue-mask-v1-1-f33706a06fa2@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Introduce flowtable hw acceleration for PPPoE traffic.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20250609-b4-airoha-flowtable-pppoe-v1-1-1520fa7711b4@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Most of the packetdrill tests have not flaked once last week.
Add the few which did to the XFAIL list.
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250610000001.1970934-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Once the THREADED napi is disabled, the napi kthread should also be
stopped. Keeping the kthread intact after disabling THREADED napi makes
the PID of this kthread show up in the output of netlink 'napi-get' and
ps -ef output.
The is discussed in the patch below:
https://lore.kernel.org/all/20250502191548.559cc416@kernel.org
NAPI kthread should stop only if,
- There are no pending napi poll scheduled for this thread.
- There are no new napi poll scheduled for this thread while it has
stopped.
- The ____napi_schedule can correctly fallback to the softirq for napi
polling.
Since napi_schedule_prep provides mutual exclusion over STATE_SCHED bit,
it is safe to unset the STATE_THREADED when SCHED_THREADED is set or the
SCHED bit is not set. SCHED_THREADED being set means that SCHED is
already set and the kthread owns this napi.
To disable threaded napi, unset STATE_THREADED bit safely if
SCHED_THREADED is set or SCHED is unset. Once STATE_THREADED is unset
safely then wait for the kthread to unset the SCHED_THREADED bit so it
safe to stop the kthread.
Add a new test in nl_netdev to verify this behaviour.
Tested:
./tools/testing/selftests/net/nl_netdev.py
TAP version 13
1..6
ok 1 nl_netdev.empty_check
ok 2 nl_netdev.lo_check
ok 3 nl_netdev.page_pool_check
ok 4 nl_netdev.napi_list_check
ok 5 nl_netdev.dev_set_threaded
ok 6 nl_netdev.nsim_rxq_reset_down
# Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0
Ran neper for 300 seconds and did enable/disable of thread napi in a
loop continuously.
Signed-off-by: Samiullah Khawaja <skhawaja@google.com>
Link: https://patch.msgid.link/20250609173015.3851695-1-skhawaja@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Enable memory-mapped I/O access to RMON statistics registers for devices
known to work correctly. Currently, only the D-Link DGE-550T (`0x4000`)
with PCI revision A3 (`0x0c`) is allowed.
To avoid issues on other hardware, a runtime check was added to restrict
MMIO usage. The `MEM_MAPPING` macro was removed in favor of runtime
detection.
To access RMON registers, the code `dw32(RmonStatMask, 0x0007ffff);`
must also be skipped, so this patch conditionally disables it as well.
Tested-on: D-Link DGE-550T Rev-A3
Signed-off-by: Moon Yeounsu <yyyynoom@gmail.com>
Link: https://patch.msgid.link/20250610000130.49065-2-yyyynoom@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Before appending sysdata, prepare_extradata() checks if any feature is
enabled in sysdata_fields (and exits early if none is enabled).
When SYSDATA_RELEASE was introduced, we missed adding it to the list of
features being checked against sysdata_fields in prepare_extradata().
The result was that, if only SYSDATA_RELEASE is enabled in
sysdata_fields, we incorreclty exit early and fail to append the
release.
Instead of checking specific bits in sysdata_fields, check if
sysdata_fields has ALL bit zeroed and exit early if true. This fixes
case when only SYSDATA_RELEASE enabled and makes the code more general /
less error prone in future feature implementation.
Signed-off-by: Gustavo Luiz Duarte <gustavold@gmail.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Fixes: cfcc9239e78a ("netconsole: append release to sysdata")
Link: https://patch.msgid.link/20250609-netconsole-fix-v1-1-17543611ae31@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
pull-request: can-next 2025-06-10
The first 4 patches are by Vincent Mailhol and prepare the CAN netlink
interface for the introduction of CAN XL configuration.
Geert Uytterhoeven's patch updates the CAN networking documentation.
The last 2 patched are by Davide Caratti and introduce skb drop
reasons in the receive path of several CAN protocols.
* tag 'linux-can-next-for-6.17-20250610' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
can: add drop reasons in CAN protocols receive path
can: add drop reasons in the receive path of AF_CAN
documentation: networking: can: Document alloc_candev_mqs()
can: netlink: can_changelink(): rename tdc_mask into fd_tdc_flag_provided
can: bittiming: rename can_tdc_is_enabled() into can_fd_tdc_is_enabled()
can: bittiming: rename CAN_CTRLMODE_TDC_MASK into CAN_CTRLMODE_FD_TDC_MASK
can: netlink: replace tabulation by space in assignment
====================
Link: https://patch.msgid.link/20250610094933.1593081-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
sk->sk_prot->sock_is_readable is a valid function pointer when sk resides
in a sockmap. After the last sk_psock_put() (which usually happens when
socket is removed from sockmap), sk->sk_prot gets restored and
sk->sk_prot->sock_is_readable becomes NULL.
This makes sk_is_readable() racy, if the value of sk->sk_prot is reloaded
after the initial check. Which in turn may lead to a null pointer
dereference.
Ensure the function pointer does not turn NULL after the check.
Fixes: 8934ce2fd081 ("bpf: sockmap redirect ingress support")
Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250609-skisreadable-toctou-v1-1-d0dfb2d62c37@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Just because otx2_atomic64_add is using u64 pointer as argument
all callers has to typecast __iomem void pointers which inturn
causing sparse warnings. Fix those by changing otx2_atomic64_add
argument to void pointer.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Link: https://patch.msgid.link/1749484421-3607-1-git-send-email-sbhatta@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch removes unnecessary typecasts by marking the
mbox_regions array as __iomem since it is used to store
pointers to memory-mapped I/O (MMIO) regions. Also simplified
the call to readq() in PF driver by removing redundant type casts.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Link: https://patch.msgid.link/1749484309-3434-1-git-send-email-sbhatta@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Gur Stavi says:
====================
hinic3: queue_api related fixes
This patch series contains improvement to queue_api and 2 queue_api
related patches to the hinic3 driver.
v1: https://lore.kernel.org/cover.1747824040.git.gur.stavi@huawei.com
v2: https://lore.kernel.org/cover.1747896423.git.gur.stavi@huawei.com
====================
Link: https://patch.msgid.link/cover.1749038081.git.gur.stavi@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A local variable of tx_q worked around name collision with internal
txq variable in netif_subqueue macros.
This workaround is no longer needed.
Signed-off-by: Gur Stavi <gur.stavi@huawei.com>
Link: https://patch.msgid.link/6376db2a39b8d3bf2fa893f58f56246bed128d5d.1749038081.git.gur.stavi@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Improve consistency of code by using only netif_subqueue variant apis
Signed-off-by: Gur Stavi <gur.stavi@huawei.com>
Link: https://patch.msgid.link/5fd897b75729cf078385aacd9ed40091314ea63d.1749038081.git.gur.stavi@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a new function, netif_subqueue_sent, which is a wrapper for
netdev_tx_sent_queue.
Drivers that use the subqueue variant macros, netif_subqueue_xxx,
identify queue by index and are not required to obtain
struct netdev_queue explicitly.
Such drivers still need to call netdev_tx_sent_queue which is a
counterpart of netif_subqueue_completed_wake. Allowing drivers to use a
subqueue variant for this purpose improves their code consistency by
always referring to queue by its index.
Signed-off-by: Gur Stavi <gur.stavi@huawei.com>
Link: https://patch.msgid.link/909a5c92db49cad39f0954d6cb86775e6480ef4c.1749038081.git.gur.stavi@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-06-09 (ice, i40e, ixgbe, iavf)
Jake moves from individual virtchnl RSS configuration values, for ice,
i40e, and iavf, to a common libie location and values.
Martyna and Dawid add counters for link_down_events to ice, i40e, and
ixgbe drivers. The counter increments only on actual physical link-down
events visible to the PHY. It does not increment when the user performs
a software-only interface down/up (e.g. ip link set dev down).
The counter does increment in cases where the interface is reinitialized
in a way that causes a real link drop - such as eg. when attaching
an XDP program, reconfiguring channels, or toggling certain priv-flags.
For ice:
Arkadiusz and Karol separate PTP and DPLL functionality to their
respective APIs.
Michal adds a separate handler for Flow Director command processing.
For iavf:
Ahmed converts driver to utilize core's IRQ affinity API.
For ixgbe:
Alok Tiwari fixes issues with some comments; typos, copy/paste errors,
etc.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ixgbe: Fix typos and clarify comments in X550 driver code
iavf: convert to NAPI IRQ affinity API
ice: add a separate Rx handler for flow director commands
ice: add ice driver PTP pin documentation
ice: change SMA pins to SDP in PTP API
ice: redesign dpll sma/u.fl pins control
ixgbe: add link_down_events statistic
i40e: add link_down_events statistic
ice: add link_down_events statistic
net: intel: move RSS packet classifier types to libie
net: intel: rename 'hena' to 'hashcfg' for clarity
====================
Link: https://patch.msgid.link/20250609212652.1138933-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This function was introduced in commit 783da70e8396 ("net: add
sock_enable_timestamps"), with one caller in rxrpc.
That only caller was removed in commit 7903d4438b3f ("rxrpc: Don't use
received skbuff timestamps").
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://patch.msgid.link/20250609153254.3504909-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The last use of t3_l2t_send_event() was removed in 2019 by
commit 30e0f6cf5acb ("RDMA/iw_cxgb3: Remove the iw_cxgb3 module from
kernel")
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Link: https://patch.msgid.link/20250609152330.24027-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The TP-Link UE200 is a RTL8152B based USB 2.0 Fast Ethernet adapter. This
patch adds its device ID. It has been tested on Ubuntu 22.04.5.
Signed-off-by: Lucas Sanchez Sagrado <lucsansag@gmail.com>
Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Link: https://patch.msgid.link/20250609145536.26648-1-lucsansag@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A decade ago commit 6d08acd2d32e ("in6: fix conflict with glibc")
hid the definitions of IPV6 options, because GCC was complaining
about duplicates. The commit did not list the warnings seen, but
trying to recreate them now I think they are (building iproute2):
In file included from ./include/uapi/rdma/rdma_user_cm.h:39,
from rdma.h:16,
from res.h:9,
from res-ctx.c:7:
../include/uapi/linux/in6.h:171:9: warning: ‘IPV6_ADD_MEMBERSHIP’ redefined
171 | #define IPV6_ADD_MEMBERSHIP 20
| ^~~~~~~~~~~~~~~~~~~
In file included from /usr/include/netinet/in.h:37,
from rdma.h:13:
/usr/include/bits/in.h:233:10: note: this is the location of the previous definition
233 | # define IPV6_ADD_MEMBERSHIP IPV6_JOIN_GROUP
| ^~~~~~~~~~~~~~~~~~~
../include/uapi/linux/in6.h:172:9: warning: ‘IPV6_DROP_MEMBERSHIP’ redefined
172 | #define IPV6_DROP_MEMBERSHIP 21
| ^~~~~~~~~~~~~~~~~~~~
/usr/include/bits/in.h:234:10: note: this is the location of the previous definition
234 | # define IPV6_DROP_MEMBERSHIP IPV6_LEAVE_GROUP
| ^~~~~~~~~~~~~~~~~~~~
Compilers don't complain about redefinition if the defines
are identical, but here we have the kernel using the literal
value, and glibc using an indirection (defining to a name
of another define, with the same numerical value).
Problem is, the commit in question hid all the IPV6 socket
options, and glibc has a pretty sparse list. For instance
it lacks Flow Label related options. Willem called this out
in commit 3fb321fde22d ("selftests/net: ipv6 flowlabel"):
/* uapi/glibc weirdness may leave this undefined */
#ifndef IPV6_FLOWINFO
#define IPV6_FLOWINFO 11
#endif
More interestingly some applications (socat) use
a #ifdef IPV6_FLOWINFO to gate compilation of thier
rudimentary flow label support. (For added confusion
socat misspells it as IPV4_FLOWINFO in some places.)
Hide only the two defines we know glibc has a problem
with. If we discover more warnings we can hide more
but we should avoid covering the entire block of
defines for "IPV6 socket options".
Link: https://patch.msgid.link/20250609143933.1654417-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add support for reporting additional hardware counters for drop and
TC using the ethtool -S interface.
These counters include:
- Aggregate Rx/Tx drop counters
- Per-TC Rx/Tx packet counters
- Per-TC Rx/Tx byte counters
- Per-TC Rx/Tx pause frame counters
The counters are exposed using ethtool_ops->get_ethtool_stats and
ethtool_ops->get_strings. This feature/counters are not available
to all versions of hardware.
Signed-off-by: Dipayaan Roy <dipayanroy@linux.microsoft.com>
Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://patch.msgid.link/20250609100103.GA7102@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Breno Leitao says:
====================
netconsole: Optimize console registration and improve testing
During performance analysis of console subsystem latency, I discovered that
netconsole registers console handlers even when no active targets exist.
These orphaned console handlers are invoked on every printk() call, get
the lock, iterate through empty target lists, and consume CPU cycles
without performing any useful work.
This patch series addresses the inefficiency by:
1. Implementing dynamic console registration/unregistration based on target
availability, ensuring console handlers are only active when needed
2. Adding automatic cleanup of unused console registrations when targets
are disabled or removed
3. Extending the selftest suite to cover non-extended console format,
which was previously untested
The optimization reduces printk() overhead by eliminating unnecessary
function calls and list traversals when netconsole targets are not
configured, improving overall system performance during heavy logging
scenarios.
v2: https://lore.kernel.org/20250602-netcons_ext-v2-0-ef88d999326d@debian.org
v1: https://lore.kernel.org/20250528-netcons_ext-v1-1-69f71e404e00@debian.org
====================
Link: https://patch.msgid.link/20250609-netcons_ext-v3-0-5336fa670326@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Extend the netconsole selftest to validate both basic and extended
target formats. The basic format is a simpler variant that doesn't
support userdata or release functionality.
The test now validates that netconsole works correctly in both
configurations, improving test coverage for different netconsole
deployment scenarios.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20250609-netcons_ext-v3-4-5336fa670326@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove the exit call from validate_result() function and move the
test exit logic to the main script. This allows the function to
be reused in scenarios where the test needs to continue execution
after validation, rather than terminating immediately.
The validate_result() function should focus on validation logic
only, while the calling script maintains control over program
flow and exit conditions. This change improves code modularity
and prepares for potential future enhancements where multiple
validations might be needed in a single test run.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20250609-netcons_ext-v3-3-5336fa670326@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add unregister_netcons_consoles() function to automatically unregister
console handlers when no targets of the corresponding type remain active.
The function iterates through the target list to determine which console
types (basic vs extended) are still needed, and unregisters any console
handlers that are no longer required. This prevents having registered
console handlers without corresponding active targets.
The function is called when a target is disabled and moved to the cleanup
list, ensuring proper cleanup of unused console registrations.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20250609-netcons_ext-v3-2-5336fa670326@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The netconsole driver currently registers the basic console driver
unconditionally during initialization, even when only extended targets
are configured. This results in unnecessary console registration and
performance overhead, as the write_msg() callback is invoked for every
log message only to return early when no matching targets are found.
Optimize the driver by conditionally registering console drivers based
on the actual target configuration. The basic console driver is now
registered only when non-extended targets exist, same as the extended
console. The implementation also handles dynamic target creation through
the configfs interface.
This change eliminates unnecessary console driver registrations,
redundant write_msg() callbacks for unused console types, and associated
lock contention and target list iterations. The optimization is
particularly beneficial for systems using only the most common extended
console type.
Fixes: e2f15f9a79201 ("netconsole: implement extended console support")
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20250609-netcons_ext-v3-1-5336fa670326@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
I've been focusing on networking BPF bits lately, add myself as a
reviewer.
Signed-off-by: Stanislav Fomichev <stfomichev@gmail.com>
Acked-by: KP Singh <kpsingh@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20250610175442.2138504-1-stfomichev@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This reverts commit 28615e6eed152f2fda5486680090b74aeed7b554.
No, we don't make random features default to being on.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: SeongJae Park <sj@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Previously, e1000_down called cancel_work_sync for the e1000 reset task
(via e1000_down_and_stop), which takes RTNL.
As reported by users and syzbot, a deadlock is possible in the following
scenario:
CPU 0:
- RTNL is held
- e1000_close
- e1000_down
- cancel_work_sync (cancel / wait for e1000_reset_task())
CPU 1:
- process_one_work
- e1000_reset_task
- take RTNL
To remedy this, avoid calling cancel_work_sync from e1000_down
(e1000_reset_task does nothing if the device is down anyway). Instead,
call cancel_work_sync for e1000_reset_task when the device is being
removed.
Fixes: e400c7444d84 ("e1000: Hold RTNL when e1000_down can be called")
Reported-by: syzbot+846bb38dc67fe62cc733@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/683837bf.a00a0220.52848.0003.GAE@google.com/
Reported-by: John <john.cs.hey@gmail.com>
Closes: https://lore.kernel.org/netdev/CAP=Rh=OEsn4y_2LvkO3UtDWurKcGPnZ_NPSXK=FbgygNXL37Sw@mail.gmail.com/
Signed-off-by: Joe Damato <jdamato@fastly.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Acked-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Set use_nsecs=true as timestamp is reported in ns. Lack of this result
in smaller timestamp error window which cause error during phc2sys
execution on E825 NICs:
phc2sys[1768.256]: ioctl PTP_SYS_OFFSET_PRECISE: Invalid argument
This problem was introduced in the cited commit which omitted setting
use_nsecs to true when converting the ice driver to use
convert_base_to_cs().
Testing hints (ethX is PF netdev):
phc2sys -s ethX -c CLOCK_REALTIME -O 37 -m
phc2sys[1769.256]: CLOCK_REALTIME phc offset -5 s0 freq -0 delay 0
Fixes: d4bea547ebb57 ("ice/ptp: Remove convert_art_to_tsc()")
Signed-off-by: Anton Nadezhdin <anton.nadezhdin@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
If a reset event is received from the PF early in the init cycle, the
state machine hangs for about 25 seconds.
Reproducer:
echo 1 > /sys/class/net/$PF0/device/sriov_numvfs
ip link set dev $PF0 vf 0 mac $NEW_MAC
The log shows:
[792.620416] ice 0000:5e:00.0: Enabling 1 VFs
[792.738812] iavf 0000:5e:01.0: enabling device (0000 -> 0002)
[792.744182] ice 0000:5e:00.0: Enabling 1 VFs with 17 vectors and 16 queues per VF
[792.839964] ice 0000:5e:00.0: Setting MAC 52:54:00:00:00:11 on VF 0. VF driver will be reinitialized
[813.389684] iavf 0000:5e:01.0: Failed to communicate with PF; waiting before retry
[818.635918] iavf 0000:5e:01.0: Hardware came out of reset. Attempting reinit.
[818.766273] iavf 0000:5e:01.0: Multiqueue Enabled: Queue pair count = 16
Fix it by scheduling the reset task and making the reset task capable of
resetting early in the init cycle.
Fixes: ef8693eb90ae3 ("i40evf: refactor reset handling")
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Tested-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
When a VFLR interrupt is received during a VF reset initiated from a
different source, the VFLR may be not fully handled. This can
leave the VF in an undefined state.
To address this, set the I40E_VFLR_EVENT_PENDING bit again during VFLR
handling if the reset is not yet complete. This ensures the driver
will properly complete the VF reset in such scenarios.
Fixes: 52424f974bc5 ("i40e: Fix VF hang when reset is triggered on another VF")
Signed-off-by: Robert Malz <robert.malz@canonical.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The function i40e_vc_reset_vf attempts, up to 20 times, to handle a
VF reset request, using the return value of i40e_reset_vf as an indicator
of whether the reset was successfully triggered. Currently, i40e_reset_vf
always returns true, which causes new reset requests to be ignored if a
different VF reset is already in progress.
This patch updates the return value of i40e_reset_vf to reflect when
another VF reset is in progress, allowing the caller to properly use
the retry mechanism.
Fixes: 52424f974bc5 ("i40e: Fix VF hang when reset is triggered on another VF")
Signed-off-by: Robert Malz <robert.malz@canonical.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
When cross compiling the kernel with clang, we need to override
CLANG_CROSS_FLAGS when preparing the step libraries.
Prior to commit d1d096312176 ("tools: fix annoying "mkdir -p ..." logs
when building tools in parallel"), MAKEFLAGS would have been set to a
value that wouldn't set a value for CLANG_CROSS_FLAGS, hiding the
fact that we weren't properly overriding it.
Fixes: 56a2df7615fa ("tools/resolve_btfids: Compile resolve_btfids as host program")
Signed-off-by: Suleiman Souhlal <suleiman@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/bpf/20250606074538.1608546-1-suleiman@google.com
|