Age | Commit message (Collapse) | Author |
|
When updating the source/destination address, the TCP/UDP checksum needs to
be updated as well.
Fixes: bee88cd5bd83 ("net: add support for segmenting TCP fraglist GSO packets")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://patch.msgid.link/20250311212530.91519-1-nbd@nbd.name
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Paolo Abeni says:
====================
udp_tunnel: GRO optimizations
The UDP tunnel GRO stage is source of measurable overhead for workload
based on UDP-encapsulated traffic: each incoming packets requires a full
UDP socket lookup and an indirect call.
In the most common setups a single UDP tunnel device is used. In such
case we can optimize both the lookup and the indirect call.
Patch 1 tracks per netns the active UDP tunnels and replaces the socket
lookup with a single destination port comparison when possible.
Patch 2 tracks the different types of UDP tunnels and replaces the
indirect call with a static one when there is a single UDP tunnel type
active.
I measure ~5% performance improvement in TCP over UDP tunnel stream
tests on top of this series.
v3: https://lore.kernel.org/netdev/cover.1741632298.git.pabeni@redhat.com/
v2: https://lore.kernel.org/netdev/cover.1741338765.git.pabeni@redhat.com/
v1: https://lore.kernel.org/netdev/cover.1741275846.git.pabeni@redhat.com/
====================
Link: https://patch.msgid.link/cover.1741718157.git.pabeni@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
It's quite common to have a single UDP tunnel type active in the
whole system. In such a case we can replace the indirect call for
the UDP tunnel GRO callback with a static call.
Add the related accounting in the control path and switch to static
call when possible. To keep the code simple use a static array for
the registered tunnel types, and size such array based on the kernel
config.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/6fd1f9c7651151493ecab174e7b8386a1534170d.1741718157.git.pabeni@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Most UDP tunnels bind a socket to a local port, with ANY address, no
peer and no interface index specified.
Additionally it's quite common to have a single tunnel device per
namespace.
Track in each namespace the UDP tunnel socket respecting the above.
When only a single one is present, store a reference in the netns.
When such reference is not NULL, UDP tunnel GRO lookup just need to
match the incoming packet destination port vs the socket local port.
The tunnel socket never sets the reuse[port] flag[s]. When bound to no
address and interface, no other socket can exist in the same netns
matching the specified local port.
Matching packets with non-local destination addresses will be
aggregated, and eventually segmented as needed - no behavior changes
intended.
Note that the UDP tunnel socket reference is stored into struct
netns_ipv4 for both IPv4 and IPv6 tunnels. That is intentional to keep
all the fastpath-related netns fields in the same struct and allow
cacheline-based optimization. Currently both the IPv4 and IPv6 socket
pointer share the same cacheline as the `udp_table` field.
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/4d5c319c4471161829f50cb8436841de81a5edae.1741718157.git.pabeni@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
According to GDMA protocol, holes (zeros) are allowed at the beginning
or middle of the gdma_list_devices_resp message. The existing code
cannot properly handle this, and may miss some devices in the list.
To fix, scan the entire list until the num_of_devs are found, or until
the end of the list.
Cc: stable@vger.kernel.org
Fixes: ca9c54d2d6a5 ("net: mana: Add a driver for Microsoft Azure Network Adapter (MANA)")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Long Li <longli@microsoft.com>
Reviewed-by: Shradha Gupta <shradhagupta@microsoft.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/1741723974-1534-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Registering the interrupts for TX or RX DMA Channels prior to registering
their respective NAPI callbacks can result in a NULL pointer dereference.
This is seen in practice as a random occurrence since it depends on the
randomness associated with the generation of traffic by Linux and the
reception of traffic from the wire.
Fixes: 681eb2beb3ef ("net: ethernet: ti: am65-cpsw: ensure proper channel cleanup in error path")
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Co-developed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Reviewed-by: Alexander Sverdlin <alexander.sverdlin@siemens.com>
Reviewed-by: Roger Quadros <rogerq@kernel.org>
Link: https://patch.msgid.link/20250311154259.102865-1-s-vadapalli@ti.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Remove unnecessary casts to pointer types passed to kfree.
Issue detected by coccinelle:
@@
type t1;
expression *e;
@@
-kfree((t1 *)e);
+kfree(e);
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Link: https://patch.msgid.link/20250311070624.1037787-1-nichen@iscas.ac.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Tariq Toukan says:
====================
mlx5: Support setting a parent for a devlink rate node
This series by Carolina adds mlx5 support for the setting of a parent to
devlink rate nodes.
By introducing a hierarchical level to scheduling nodes, these changes
allow for more granular control over bandwidth allocation and isolation
of Virtual Functions.
Function renaming for parent setting on leafs:
- net/mlx5: Rename devlink rate parent set function for leaf nodes
Add support for hierarchy level tracking:
- net/mlx5: Introduce hierarchy level tracking on scheduling nodes
- net/mlx5: Preserve rate settings when creating a rate node
Support setting parent for rate nodes:
- net/mlx5: Add support for setting parent of nodes
====================
Link: https://patch.msgid.link/1741642016-44918-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Introduce `mlx5_esw_devlink_rate_node_parent_set()` to allow assigning
a parent to scheduling nodes.
Implement `mlx5_esw_qos_node_update_parent()` and
`mlx5_esw_qos_node_validate_set_parent()` to enforce constraints on
node reassignment.
Don't allow reassignment of nodes with active rate objects.
Update `esw_qos_node_set_parent()` to handle cases where
the parent is NULL. A NULL parent indicates that the scheduling element
is attached to the root scheduling element, and since only rate nodes
can be connected to the root, this update is now necessary.
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1741642016-44918-5-git-send-email-tariqt@nvidia.com
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Modify `esw_qos_create_node_sched_elem()` to receive max_rate and
bw_share values while maintaining the previous configuration.
This change is essential for the upcoming patch that will modify rate
nodes and requires the existing settings to be preserved unless
explicitly changed.
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1741642016-44918-4-git-send-email-tariqt@nvidia.com
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add a `level` field to `mlx5_esw_sched_node` to track the hierarchy
depth of each scheduling node. This allows enforcement of the
scheduling depth constraints based on `log_esw_max_sched_depth`.
Modify `esw_qos_node_set_parent()` and `__esw_qos_alloc_node()` to
correctly assign hierarchy levels. Ensure that nodes inherit their
parent’s level incrementally.
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1741642016-44918-3-git-send-email-tariqt@nvidia.com
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Rename `mlx5_esw_devlink_rate_parent_set()` to
`mlx5_esw_devlink_rate_leaf_parent_set()` to distinguish setting a
parent for leafs from nodes, which is not yet supported.
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1741642016-44918-2-git-send-email-tariqt@nvidia.com
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Michael Chan says:
====================
bnxt_en: Driver update
This patchset contains these updates to the driver:
1. New ethtool coredump type for FW to include cached context for live dump.
2. Support ENABLE_ROCE devlink generic parameter.
3. Support capability change flag from FW.
4. FW interface update.
5. Support .set_module_eeprom_by_page().
====================
Link: https://patch.msgid.link/20250310183129.3154117-1-michael.chan@broadcom.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add support for .set_module_eeprom_by_page() callback
which implements generic solution for modules eeprom access.
This implementation also supports CMIS 5.0.3 compliant
eeprom FW download.
Sample Usage:
ethtool --flash-module-firmware enp177s0np0 file dummy.bin
Signed-off-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250310183129.3154117-8-michael.chan@broadcom.com
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In preparation for adding .set_module_eeprom_by_page(), extract the
common error checking done in bnxt_get_module_eeprom_by_page() into
a new common function that can be re-used for
.set_module_eeprom_by_page().
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250310183129.3154117-7-michael.chan@broadcom.com
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The main changes are adding i2c write for module eeprom and a new v2
PCIe statistics structure.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250310183129.3154117-6-michael.chan@broadcom.com
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Newer FW can set the CAPS_CHANGE flag during ifup if some capabilities
or configurations have changed. For example, the CoS queue
configurations may have changed. Support this new flag by treating it
almost like FW reset. The driver will essentially rediscover all
features and capabilities, reconfigure all backing store context memory,
reset everything to default, and reserve all resources.
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: shantiprasad shettar <shantiprasad.shettar@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250310183129.3154117-5-michael.chan@broadcom.com
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add set/show support for the ENABLE_ROCE NVM parameter to
enable/disable RoCE for a PF.
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Co-developed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250310183129.3154117-4-michael.chan@broadcom.com
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
bnxt_hwrm_nvm_req() first searches the nvm_params[] array for the
NVM parameter to set or get. The array entry contains all the
NVM information about that parameter. The information is then used
to send the FW message to set or get the parameter.
Refactor it to only do the array search in bnxt_hwrm_nvm_req() and
pass the array entry to the new function __bnxt_hwrm_nvm_req() to
send the FW message. The next patch will be able to use the new
function.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250310183129.3154117-3-michael.chan@broadcom.com
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When doing a live coredump with ethtool -w, the context data cached
in the NIC is not dumped by the FW by default. The reason is that
retrieving this cached context data with traffic running may cause
problems. Add a new dump flag 3 to allow the option to include this
cached context data which may be useful in some debug scenarios.
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Signed-off-by: Vasuthevan Maheswaran <vasuthevan.maheswaran@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250310183129.3154117-2-michael.chan@broadcom.com
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-03-10 (ice, ixgbe)
For ice:
Paul adds generic checksum support for E830 devices.
Karol refactors PTP code related to E825C; simplifying PHY register info
struct, utilizing GENMASK, removing unused defines, etc.
For ixgbe:
Piotr adds PTP support for E610 devices.
Jedrzej adds reporting when overheating is detected on E610 devices.
The following are changes since commit 8ef890df4031121a94407c84659125cbccd3fdbe:
net: move misc netdev_lock flavors to a separate header
and are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue 100GbE
====================
Link: https://patch.msgid.link/20250310174502.3708121-1-anthony.l.nguyen@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
E610 NICs unlike the previous devices utilising ixgbe driver
are notified in the case of overheating by the FW ACI event.
In event of overheat when threshold is exceeded, FW suspends all
traffic and sends overtemp event to the driver. Then driver
logs appropriate message and disables the adapter instance.
The card remains in that state until the platform is rebooted.
This approach is a solution to the fact current version of the
E610 FW doesn't support reading thermal sensor data by the
SW. So give to user at least any info that overtemp event
has occurred, without interface disappearing from the OS
without any note.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Jeremiah Lokan <jeremiahx.j.lokan@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20250310174502.3708121-7-anthony.l.nguyen@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add PTP support for E610 adapter. The E610 is based on X550 and adds
firmware managed link, enhanced security capabilities and support for
updated server manageability. It does not introduce any new PTP features
compared to X550.
Reviewed-by: Milena Olech <milena.olech@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Tested-by: Bharath R <bharath.r@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20250310174502.3708121-6-anthony.l.nguyen@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Minor PTP register refactor, including logical grouping E825C 1-step
timestamping registers. Remove unused register definitions
(PHY_REG_GPCS_BITSLIP, PHY_REG_REVISION).
Also, apply preferred GENMASK macro (instead of ICE_M) for register
fields definition affected by this patch.
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20250310174502.3708121-5-anthony.l.nguyen@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Simplify ice_phy_reg_info_eth56g struct definition to include base
address for the very first quad. Use base address info and 'step'
value to determine address for specific PHY quad.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20250310174502.3708121-4-anthony.l.nguyen@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Refactor the code by changing ice_ptp_init_phc_eth56g function
name to ice_ptp_init_phc_e825, to be consistent with the naming pattern
for other devices.
Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com>
Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20250310174502.3708121-3-anthony.l.nguyen@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
E830 supports raw receive and generic transmit checksum offloads.
Raw receive checksum support is provided by hardware calculating the
checksum over the whole packet, regardless of type. The calculated
checksum is provided to driver in the Rx flex descriptor. Then the driver
assigns the checksum to skb->csum and sets skb->ip_summed to
CHECKSUM_COMPLETE.
Generic transmit checksum support is provided by hardware calculating the
checksum given two offsets: the start offset to begin checksum calculation,
and the offset to insert the calculated checksum in the packet. Support is
advertised to the stack using NETIF_F_HW_CSUM feature.
E830 has the following limitations when both generic transmit checksum
offload and TCP Segmentation Offload (TSO) are enabled:
1. Inner packet header modification is not supported. This restriction
includes the inability to alter TCP flags, such as the push flag. As a
result, this limitation can impact the receiver's ability to coalesce
packets, potentially degrading network throughput.
2. The Maximum Segment Size (MSS) is limited to 1023 bytes, which prevents
support of Maximum Transmission Unit (MTU) greater than 1063 bytes.
Therefore NETIF_F_HW_CSUM and NETIF_F_ALL_TSO features are mutually
exclusive. NETIF_F_HW_CSUM hardware feature support is indicated but is not
enabled by default. Instead, IP checksums and NETIF_F_ALL_TSO are the
defaults. Enforcement of mutual exclusivity of NETIF_F_HW_CSUM and
NETIF_F_ALL_TSO is done in ice_set_features(). Mutual exclusivity
of IP checksums and NETIF_F_HW_CSUM is handled by netdev_fix_features().
When NETIF_F_HW_CSUM is requested the provided skb->csum_start and
skb->csum_offset are passed to hardware in the Tx context descriptor
generic checksum (GCS) parameters. Hardware calculates the 1's complement
from skb->csum_start to the end of the packet, and inserts the result in
the packet at skb->csum_offset.
Co-developed-by: Alice Michael <alice.michael@intel.com>
Signed-off-by: Alice Michael <alice.michael@intel.com>
Co-developed-by: Eric Joyner <eric.joyner@intel.com>
Signed-off-by: Eric Joyner <eric.joyner@intel.com>
Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20250310174502.3708121-2-anthony.l.nguyen@intel.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Before commit 7627a0edef54 ("ata: ahci: Drop low power policy board type")
the ATI AHCI controllers specified board type 'board_ahci' rather than
board type 'board_ahci'. This means that LPM was historically not enabled
for the ATI AHCI controllers.
By looking at commit 7a8526a5cd51 ("libata: Add ATA_HORKAGE_NO_NCQ_ON_ATI
for Samsung 860 and 870 SSD."), it is clear that, for some unknown reason,
that Samsung SSDs do not play nice with ATI AHCI controllers. (When using
other AHCI controllers, NCQ can be enabled on these Samsung SSDs without
issues.)
In a similar way, from user reports, it is clear the ATI AHCI controllers
can enable LPM on e.g. Maxtor HDDs perfectly fine, but when enabling LPM
on certain Samsung SSDs, things break. (E.g. the SSDs will not get detected
by the ATI AHCI controller even after a COMRESET.)
Yet, when using LPM on these Samsung SSDs with other AHCI controllers, e.g.
Intel AHCI controllers, these Samsung drives appear to work perfectly fine.
Considering that the combination of ATI + Samsung, for some unknown reason,
does not seem to work well, disable LPM when detecting an ATI AHCI
controller with a problematic Samsung SSD.
Apply this new ATA_QUIRK_NO_LPM_ON_ATI quirk for all Samsung SSDs that have
already been reported to not play nice with ATI (ATA_QUIRK_NO_NCQ_ON_ATI).
Fixes: 7627a0edef54 ("ata: ahci: Drop low power policy board type")
Suggested-by: Hans de Goede <hdegoede@redhat.com>
Reported-by: Eric <eric.4.debian@grabatoulnz.fr>
Closes: https://lore.kernel.org/linux-ide/Z8SBZMBjvVXA7OAK@eldamar.lan/
Tested-by: Eric <eric.4.debian@grabatoulnz.fr>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20250317170348.1748671-2-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
|
|
Maxime Chevallier says:
====================
net: phy: Rework linkmodes handling in a dedicated file
This is V5 of the phy_caps series. In a nutshell, this series reworks the way
we maintain the list of speed/duplex capablities for each linkmode so that we
no longer have multiple definition of these associations.
That will help making sure that when people add new linkmodes in
include/uapi/linux/ethtool.h, they don't have to update phylib and phylink as
well, making the process more straightforward and less error-prone.
It also generalises the phy_caps interface to be able to lookup linkmodes
from phy_interface_t, which is needed for the multi-port work I've been working
on for a while.
This V5 addresse Russell's and Paolo's reviews, namely :
- Error out when encountering an unknown SPEED_XXX setting
It prints an error and fails to initialize phylib. I've tested by
introducing a dummy 1.6T speed, I guess it's only a matter of time
before that actually happens :)
- Deal more gracefully with the fixed-link settings, keeping some level of
compatibility with what we had before by making sure we report a
single BaseT mode like before.
V1 : https://lore.kernel.org/netdev/20250222142727.894124-1-maxime.chevallier@bootlin.com/
V2 : https://lore.kernel.org/netdev/20250226100929.1646454-1-maxime.chevallier@bootlin.com/
V3 : https://lore.kernel.org/netdev/20250228145540.2209551-1-maxime.chevallier@bootlin.com/
V4 : https://lore.kernel.org/netdev/20250303090321.805785-1-maxime.chevallier@bootlin.com/
====================
Link: https://patch.msgid.link/20250307173611.129125-1-maxime.chevallier@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Phylink has internal code to get the MAC capabilities of a given PHY
interface (what are the supported speed and duplex).
Extract that into phy_caps, but use the link_capa for conversion. Add an
internal phylink helper for the link caps -> mac caps conversion, and
use this in phylink_caps_to_linkmodes().
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250307173611.129125-14-maxime.chevallier@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
phylink_caps_to_linkmodes() is used to derive a list of linkmodes that
can be conceivably exposed using a given set of speeds and duplex
through phylink's MAC capabilities.
This list can be derived from the link_caps array in phy_caps, provided
we convert the MAC capabilities into a LINK_CAPA bitmask first.
Introduce an internal phylink helper phylink_caps_to_link_caps() to
convert from MAC capabilities into phy_caps, then phy_caps_linkmodes()
to do the link_caps -> linkmodes conversion.
This avoids having to update phylink for every new linkmode.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250307173611.129125-13-maxime.chevallier@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
phylink allows MAC drivers to report the capabilities in terms of speed,
duplex and pause support. This is done through a dedicated set of enum
values in the form of the MAC_ capabilities. They are very close to what
the LINK_CAPA_xxx can express, with the difference that LINK_CAPA don't
have any information about Pause/Asym Pause support.
To prepare converting phylink to using the phy_caps, add the mapping
between MAC capabilities and phy_caps. While doing so, we move the
phylink_caps_params array up a bit to simplify future commits.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250307173611.129125-12-maxime.chevallier@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The phy_settings array is no longer relevant as it has now been replaced
by the link_caps array and associated phy_caps helpers.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250307173611.129125-11-maxime.chevallier@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When phylink creates a fixed-link configuration, it finds a matching
linkmode to set as the advertised, lp_advertising and supported modes
based on the speed and duplex of the fixed link.
Use the newly introduced phy_caps_lookup to get these modes instead of
phy_lookup_settings(). This has the side effect that the matched
settings and configured linkmodes may now contain several linkmodes (the
intersection of supported linkmodes from the phylink settings and the
linkmodes that match speed/duplex) instead of the one from
phy_lookup_settings().
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250307173611.129125-10-maxime.chevallier@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When configuring PHY advertising with autoneg disabled, we lookd for an
exact linkmode to advertise and configure for the requested Speed and
Duplex, specially at or over 1G.
Using phy_caps_lookup allows us to build a list of the supported
linkmodes at that speed that we can advertise instead of the first mode
that matches.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250307173611.129125-9-maxime.chevallier@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
As the link_caps array is efficient for <speed,duplex> lookups,
implement a function for speed/duplex lookups that matches a given
mask. This replicates to some extent the phy_lookup_settings()
behaviour, matching full link_capabilities instead of a single linkmode.
phy.c's phy_santize_settings() and phylink's
phylink_ethtool_ksettings_set() performs such lookup using the
phy_settings table, but are only interested in the actual speed/duplex
that were matched, rathet than the individual linkmode.
Similar to phy_lookup_settings(), the newly introduced phy_caps_lookup()
will run through the link_caps[] array by descending speed/duplex order.
If the link_capabilities for a given <speed/duplex> tuple intersects the
passed linkmodes, we consider that a match.
Similar to phy_lookup_settings(), we also allow passing an 'exact'
boolean, allowing non-exact match. Here, we MUST always match the
linkmodes mask, but we allow matching on lower speed settings.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250307173611.129125-8-maxime.chevallier@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In several occasions, phylib needs to lookup a set of matching speed and
duplex against a given linkmode set. Instead of relying on the
phy_settings array and thus iterate over the whole linkmodes list, use
the link_capabilities array to lookup these matches, as we aren't
interested in the actual link setting that matches but rather the speed
and duplex for that setting.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250307173611.129125-7-maxime.chevallier@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
With the link_capabilities array, it's trivial to validate a given mask
againts a <speed, duplex> tuple. Create a helper for that purpose, and
use it to replace a phy_settings lookup in phy_check_valid();
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250307173611.129125-6-maxime.chevallier@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Convert the __set_linkmode_max_speed to use the link_capabilities array.
This makes it easy to clamp the linkmodes to a given max speed.
Introduce a new helper phy_caps_linkmode_max_speed to replace the
previous one that used phy_settings.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250307173611.129125-5-maxime.chevallier@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Use the newly introduced link_capabilities array to derive the list of
possible speeds when given a combination of linkmodes. As
link_capabilities is indexed by speed, we don't have to iterate the
whole phy_settings array.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250307173611.129125-4-maxime.chevallier@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The canonical definition for all the link modes is in linux/ethtool.h,
which is complemented by the link_mode_params array stored in
net/ethtool/common.h . That array contains all the metadata about each
of these modes, including the Speed and Duplex information.
Phylib and phylink needs that information as well for internal
management of the link, which was done by duplicating that information
in locally-stored arrays and lookup functions. This makes it easy for
developpers adding new modes to forget modifying phylib and phylink
accordingly.
However, the link_mode_params array in net/ethtool/common.c is fairly
inefficient to search through, as it isn't sorted in any manner. Phylib
and phylink perform a lot of lookup operations, mostly to filter modes
by speed and/or duplex.
We therefore introduce the link_caps private array in phy_caps.c, that
indexes linkmodes in a more efficient manner. Each element associated a
tuple <speed, duplex> to a bitfield of all the linkmodes runs at these
speed/duplex.
We end-up with an array that's fairly short, easily addressable and that
it optimised for the typical use-cases of phylib/phylink.
That array is initialized at the same time as phylib. As the
link_mode_params array is part of the net stack, which phylink depends
on, it should always be accessible from phylib.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250307173611.129125-3-maxime.chevallier@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
link_mode_params contains a lookup table of all 802.3 link modes that
are currently supported with structured data about each mode's speed,
duplex, number of lanes and mediums.
As a preparation for a port representation, export that table for the
rest of the net stack to use.
Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20250307173611.129125-2-maxime.chevallier@bootlin.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
LSMs often inspect the path.mnt of files in the security hooks, and this
causes a NULL deref in efivarfs_pm_notify() because the path is
constructed with a NULL path.mnt.
Fix by obtaining from vfs_kern_mount() instead, and being very careful
to ensure that deactivate_super() (potentially triggered by a racing
userspace umount) is not called directly from the notifier, because it
would deadlock when efivarfs_kill_sb() tried to unregister the notifier
chain.
[ Al notes:
Umm... That's probably safe, but not as a long-term solution -
it's too intimately dependent upon fs/super.c internals. The
reasons why you can't run into ->s_umount deadlock here are
non-trivial... ]
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Link: https://lore.kernel.org/r/e54e6a2f-1178-4980-b771-4d9bafc2aa47@tnxip.de
Link: https://lore.kernel.org/r/3e998bf87638a442cbc6864cdcd3d8d9e08ce3e3.camel@HansenPartnership.com
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc hotfixes from Andrew Morton:
"15 hotfixes. 7 are cc:stable and the remainder address post-6.13
issues or aren't considered necessary for -stable kernels.
13 are for MM and the other two are for squashfs and procfs.
All are singletons. Please see the individual changelogs for details"
* tag 'mm-hotfixes-stable-2025-03-17-20-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/page_alloc: fix memory accept before watermarks gets initialized
mm: decline to manipulate the refcount on a slab page
memcg: drain obj stock on cpu hotplug teardown
mm/huge_memory: drop beyond-EOF folios with the right number of refs
selftests/mm: run_vmtests.sh: fix half_ufd_size_MB calculation
mm: fix error handling in __filemap_get_folio() with FGP_NOWAIT
mm: memcontrol: fix swap counter leak from offline cgroup
mm/vma: do not register private-anon mappings with khugepaged during mmap
squashfs: fix invalid pointer dereference in squashfs_cache_delete
mm/migrate: fix shmem xarray update during migration
mm/hugetlb: fix surplus pages in dissolve_free_huge_page()
mm/damon/core: initialize damos->walk_completed in damon_new_scheme()
mm/damon: respect core layer filters' allowance decision on ops layer
filemap: move prefaulting out of hot write path
proc: fix UAF in proc_get_inode()
|
|
Unfortunately I no longer have time to meaningfully take part in the
linux kernel development.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull SoC fixes from Arnd Bergmann:
"The majority of these last fixes are for devicetree files.
These address two important regressions for the Qualcomm SMMU and the
Raspberry Pi 4 USB controller, as well as a larger number of patches
fixing minor mistakes in board specific files for Rockchips, i.MX,
starfive and broadcom.
The non-DT changes are
- A fix for an old boot regression on Renesas shmobile chips
- Another boot time regression for for the Qualcomm PDR SoC driver,
among a few other Qualcomm firmware driver fixes for efivars and
tzmem
- Minor Kconfig fixes for davinci and OMAP1
- Minor code fixes for sparx5 reset controllers, OMAP memory
controller, i.MX SCU, cpufreq and SoC drivers and a Hisilicon SoC
driver
- One more update to the Asahi maintainers, adding Neal Gompa as a
reviewer"
* tag 'soc-fixes-6.14-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (35 commits)
ARM: davinci: da850: fix selecting ARCH_DAVINCI_DA8XX
soc: hisilicon: kunpeng_hccs: Fix incorrect string assembly
memory: omap-gpmc: drop no compatible check
reset: mchp: sparx5: Fix for lan966x
ARM: shmobile: smp: Enforce shmobile_smp_* alignment
MAINTAINERS: Add myself (Neal Gompa) as a reviewer for ARM Apple support
MAINTAINERS: Add apple-spi driver & binding files
arm64: dts: rockchip: slow down emmc freq for rock 5 itx
ARM: dts: BCM5301X: Fix switch port labels of ASUS RT-AC3200
ARM: dts: BCM5301X: Fix switch port labels of ASUS RT-AC5300
ARM: dts: bcm2711: Don't mark timer regs unconfigured
ARM: OMAP1: select CONFIG_GENERIC_IRQ_CHIP
arm64: dts: rockchip: Add missing PCIe supplies to RockPro64 board dtsi
arm64: dts: rockchip: Add avdd HDMI supplies to RockPro64 board dtsi
arm64: dts: rockchip: Remove undocumented sdmmc property from lubancat-1
arm64: dts: rockchip: fix pinmux of UART5 for PX30 Ringneck on Haikou
arm64: dts: rockchip: fix pinmux of UART0 for PX30 Ringneck on Haikou
arm64: dts: rockchip: fix u2phy1_host status for NanoPi R4S
arm64: dts: bcm2712: PL011 UARTs are actually r1p5
ARM: dts: bcm2711: PL011 UARTs are actually r1p5
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull probes fixes from Masami Hiramatsu:
- Clean up tprobe correctly when module unload
Tracepoint probes do not set TRACEPOINT_STUB on the 'tpoint' pointer
when unloading a module, thus they show as a normal 'fprobe' instead
of 'tprobe' and never come back
- Fix leakage of tprobe module refcount
When a tprobe's target module is loaded, it gets the module's
refcount in the module notifier but forgot to put it after
registering the probe on it.
Fix it by getting the refcount only when registering tprobe.
* tag 'probes-fixes-v6.14-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: tprobe-events: Fix leakage of module refcount
tracing: tprobe-events: Fix to clean up tprobe correctly when module unload
|
|
'net-stmmac-avoid-unnecessary-work-in-stmmac_release-stmmac_dvr_remove'
Russell King says:
====================
net: stmmac: avoid unnecessary work in stmmac_release()/stmmac_dvr_remove()
This small series is a subset of a RFC I sent earlier. These two
patches remove code that is unnecessary and/or wrong in these paths.
Details in each commit.
====================
Link: https://patch.msgid.link/Z87bpDd7QYYVU0ML@shell.armlinux.org.uk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
stmmac_release() calls phylink_stop() and then goes on to call
stmmac_mac_set(, false). However, phylink_stop() will call
stmmac_mac_link_down() before returning, which will do this work.
Remove this unnecessary call.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Furong Xu <0x1207@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1trcI6-005rn8-GV@rmk-PC.armlinux.org.uk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
While the network device is registered, it is published to userspace,
and thus userspace can change its state. This means calling
functions such as stmmac_stop_all_dma() and stmmac_mac_set() are
racy.
Moreover, unregister_netdev() will unpublish the network device, and
then if appropriate call the .ndo_stop() method, which is
stmmac_release(). This will first call phylink_stop() which will
synchronously take the link down, resulting in stmmac_mac_link_down()
and stmmac_mac_set(, false) being called.
stmmac_release() will also call stmmac_stop_all_dma().
Consequently, neither of these two functions need to called prior
to unregister_netdev() as that will safely call paths that will
result in this work being done if necessary.
Remove these redundant racy calls.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Furong Xu <0x1207@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1trcI1-005rn2-CZ@rmk-PC.armlinux.org.uk
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|