linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-11-20	net: axienet: Preparatory changes for dmaengine support	Sarath Babu Naidu Gaddam
	The axiethernet driver has inbuilt dma programming. In order to add dmaengine support and make it's integration seamless the current axidma inbuilt programming code is put under use_dmaengine check. It also performs minor code reordering to minimize conditional use_dmaengine checks and there is no functional change. It uses "dmas" property to identify whether it should use a dmaengine framework or inbuilt axidma programming. Signed-off-by: Sarath Babu Naidu Gaddam <sarath.babu.naidu.gaddam@amd.com> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Link: https://lore.kernel.org/r/1700074613-1977070-3-git-send-email-radhey.shyam.pandey@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-20	net: microchip: lan743x : bidirectional throughput improvement	Vishvambar Panth S
	The LAN743x/PCI11xxx DMA descriptors are always 4 dwords long, but the device supports placing the descriptors in memory back to back or reserving space in between them using its DMA_DESCRIPTOR_SPACE (DSPACE) configurable hardware setting. Currently DSPACE is unnecessarily set to match the host's L1 cache line size, resulting in space reserved in between descriptors in most platforms and causing a suboptimal behavior (single PCIe Mem transaction per descriptor). By changing the setting to DSPACE=16 many descriptors can be packed in a single PCIe Mem transaction resulting in a massive performance improvement in bidirectional tests without any negative effects. Tested and verified improvements on x64 PC and several ARM platforms (typical data below) Test setup 1: x64 PC with LAN7430 ---> x64 PC iperf3 UDP bidirectional with DSPACE set to L1 CACHE Size: - - - - - - - - - - - - - - - - - - - - - - - - - [ ID][Role] Interval Transfer Bitrate [ 5][TX-C] 0.00-10.00 sec 170 MBytes 143 Mbits/sec sender [ 5][TX-C] 0.00-10.04 sec 169 MBytes 141 Mbits/sec receiver [ 7][RX-C] 0.00-10.00 sec 1.02 GBytes 876 Mbits/sec sender [ 7][RX-C] 0.00-10.04 sec 1.02 GBytes 870 Mbits/sec receiver iperf3 UDP bidirectional with DSPACE set to 16 Bytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID][Role] Interval Transfer Bitrate [ 5][TX-C] 0.00-10.00 sec 1.11 GBytes 956 Mbits/sec sender [ 5][TX-C] 0.00-10.04 sec 1.11 GBytes 951 Mbits/sec receiver [ 7][RX-C] 0.00-10.00 sec 1.10 GBytes 948 Mbits/sec sender [ 7][RX-C] 0.00-10.04 sec 1.10 GBytes 942 Mbits/sec receiver Test setup 2 : RK3399 with LAN7430 ---> x64 PC RK3399 Spec: The SOM-RK3399 is ARM module designed and developed by FriendlyElec. Cores: 64-bit Dual Core Cortex-A72 + Quad Core Cortex-A53 Frequency: Cortex-A72(up to 2.0GHz), Cortex-A53(up to 1.5GHz) PCIe: PCIe x4, compatible with PCIe 2.1, Dual operation mode iperf3 UDP bidirectional with DSPACE set to L1 CACHE Size: - - - - - - - - - - - - - - - - - - - - - - - - - [ ID][Role] Interval Transfer Bitrate [ 5][TX-C] 0.00-10.00 sec 534 MBytes 448 Mbits/sec sender [ 5][TX-C] 0.00-10.05 sec 534 MBytes 446 Mbits/sec receiver [ 7][RX-C] 0.00-10.00 sec 1.12 GBytes 961 Mbits/sec sender [ 7][RX-C] 0.00-10.05 sec 1.11 GBytes 946 Mbits/sec receiver iperf3 UDP bidirectional with DSPACE set to 16 Bytes - - - - - - - - - - - - - - - - - - - - - - - - - [ ID][Role] Interval Transfer Bitrate [ 5][TX-C] 0.00-10.00 sec 966 MBytes 810 Mbits/sec sender [ 5][TX-C] 0.00-10.04 sec 965 MBytes 806 Mbits/sec receiver [ 7][RX-C] 0.00-10.00 sec 1.11 GBytes 956 Mbits/sec sender [ 7][RX-C] 0.00-10.04 sec 1.07 GBytes 919 Mbits/sec receiver Signed-off-by: Vishvambar Panth S <vishvambarpanth.s@microchip.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231116054350.620420-1-vishvambarpanth.s@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-21	powerpc/ps3: move udbg_shutdown_ps3gelic prototype	Arnd Bergmann
	Allmodconfig kernels produce a missing-prototypes warning: arch/powerpc/platforms/ps3/gelic_udbg.c:239:6: error: no previous prototype for 'udbg_shutdown_ps3gelic' [-Werror=missing-prototypes] Move the declaration from a local header to asm/ps3.h where it can be seen from both the caller and the definition. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Geoff Levand <geoff@infradead.org> Acked-by: Jakub Kicinski <kuba@kernel.org> [mpe: Drop CONFIG_PS3GELIC_UDBG to fix build error] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231108125843.3806765-18-arnd@kernel.org
2023-11-20	bpf, netkit: Add indirect call wrapper for fetching peer dev	Daniel Borkmann
	ndo_get_peer_dev is used in tcx BPF fast path, therefore make use of indirect call wrapper and therefore optimize the bpf_redirect_peer() internal handling a bit. Add a small skb_get_peer_dev() wrapper which utilizes the INDIRECT_CALL_1() macro instead of open coding. Future work could potentially add a peer pointer directly into struct net_device in future and convert veth and netkit over to use it so that eventually ndo_get_peer_dev can be removed. Co-developed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/20231114004220.6495-7-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-11-20	veth: Use tstats per-CPU traffic counters	Peilin Ye
	Currently veth devices use the lstats per-CPU traffic counters, which only cover TX traffic. veth_get_stats64() actually populates RX stats of a veth device from its peer's TX counters, based on the assumption that a veth device can _only_ receive packets from its peer, which is no longer true: For example, recent CNIs (like Cilium) can use the bpf_redirect_peer() BPF helper to redirect traffic from NIC's tc ingress to veth's tc ingress (in a different netns), skipping veth's peer device. Unfortunately, this kind of traffic isn't currently accounted for in veth's RX stats. In preparation for the fix, use tstats (instead of lstats) to maintain both RX and TX counters for each veth device. We'll use RX counters for bpf_redirect_peer() traffic, and keep using TX counters for the usual "peer-to-peer" traffic. In veth_get_stats64(), calculate RX stats by _adding_ RX count to peer's TX count, in order to cover both kinds of traffic. veth_stats_rx() might need a name change (perhaps to "veth_stats_xdp()") for less confusion, but let's leave it to another patch to keep the fix minimal. Signed-off-by: Peilin Ye <peilin.ye@bytedance.com> Co-developed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20231114004220.6495-5-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-11-20	netkit: Add tstats per-CPU traffic counters	Daniel Borkmann
	Add dev->tstats traffic accounting to netkit. The latter contains per-CPU RX and TX counters. The dev's TX counters are bumped upon pass/unspec as well as redirect verdicts, in other words, on everything except for drops. The dev's RX counters are bumped upon successful __netif_rx(), as well as from skb_do_redirect() (not part of this commit here). Using dev->lstats with having just a single packets/bytes counter and inferring one another's RX counters from the peer dev's lstats is not possible given skb_do_redirect() can also bump the device's stats. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/r/20231114004220.6495-4-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-11-20	net: Move {l,t,d}stats allocation to core and convert veth & vrf	Daniel Borkmann
	Move {l,t,d}stats allocation to the core and let netdevs pick the stats type they need. That way the driver doesn't have to bother with error handling (allocation failure checking, making sure free happens in the right spot, etc) - all happening in the core. Co-developed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Cc: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20231114004220.6495-3-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-11-20	net, vrf: Move dstats structure to core	Daniel Borkmann
	Just move struct pcpu_dstats out of the vrf into the core, and streamline the field names slightly, so they better align with the {t,l}stats ones. No functional change otherwise. A conversion of the u64s to u64_stats_t could be done at a separate point in future. This move is needed as we are moving the {t,l,d}stats allocation/freeing to the core. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Cc: Jakub Kicinski <kuba@kernel.org> Cc: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20231114004220.6495-2-daniel@iogearbox.net Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-11-20	spi: axi-spi-engine improvements	Mark Brown
	Merge series from David Lechner <dlechner@baylibre.com>: We are working towards adding support for the offload feature[1] of the AXI SPI Engine IP core. Before we can do that, we want to make some general fixes and improvements to the driver. In order to avoid a giant series with 35+ patches, we are splitting this up into a few smaller series. This first series mostly doing some housekeeping: * Convert device tree bindings to yaml. * Add a MAINTAINERS entry. * Clean up probe and remove using devm. * Separate message state from driver state. * Add support for cs_off and variable word size. Once this series is applied, we will follow up with a second series of general improvements, and then finally a 3rd series that implements the offload support. The offload support will also involve the IIO subsystem (a new IIO driver will depend on the new SPI offload feature), so I'm mentioning this now in case we want to do anything ahead of time to prepare for that (e.g. putting all of these changes on a separate branch). [1]: https://wiki.analog.com/resources/fpga/peripherals/spi_engine/offload
2023-11-20	ieee802154: hwsim: Convert to platform remove callback returning void	Uwe Kleine-König
	The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-wpan/20231117095922.876489-11-u.kleine-koenig@pengutronix.de
2023-11-20	ieee802154: fakelb: Convert to platform remove callback returning void	Uwe Kleine-König
	The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is ignored (apart from emitting a warning) and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new(), which already returns void. Eventually after all drivers are converted, .remove_new() will be renamed to .remove(). Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-wpan/20231117095922.876489-10-u.kleine-koenig@pengutronix.de
2023-11-19	octeontx2-pf: Fix memory leak during interface down	Suman Ghosh
	During 'ifconfig <netdev> down' one RSS memory was not getting freed. This patch fixes the same. Fixes: 81a4362016e7 ("octeontx2-pf: Add RSS multi group support") Signed-off-by: Suman Ghosh <sumang@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-19	net: ethernet: mtk_wed: rely on __dev_alloc_page in mtk_wed_tx_buffer_alloc	Lorenzo Bianconi
	Simplify the code and use __dev_alloc_page() instead of __dev_alloc_pages() with order 0 in mtk_wed_tx_buffer_alloc routine Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-19	wireguard: use DEV_STATS_INC()	Eric Dumazet
	wg_xmit() can be called concurrently, KCSAN reported [1] some device stats updates can be lost. Use DEV_STATS_INC() for this unlikely case. [1] BUG: KCSAN: data-race in wg_xmit / wg_xmit read-write to 0xffff888104239160 of 8 bytes by task 1375 on cpu 0: wg_xmit+0x60f/0x680 drivers/net/wireguard/device.c:231 __netdev_start_xmit include/linux/netdevice.h:4918 [inline] netdev_start_xmit include/linux/netdevice.h:4932 [inline] xmit_one net/core/dev.c:3543 [inline] dev_hard_start_xmit+0x11b/0x3f0 net/core/dev.c:3559 ... read-write to 0xffff888104239160 of 8 bytes by task 1378 on cpu 1: wg_xmit+0x60f/0x680 drivers/net/wireguard/device.c:231 __netdev_start_xmit include/linux/netdevice.h:4918 [inline] netdev_start_xmit include/linux/netdevice.h:4932 [inline] xmit_one net/core/dev.c:3543 [inline] dev_hard_start_xmit+0x11b/0x3f0 net/core/dev.c:3559 ... v2: also change wg_packet_consume_data_done() (Hangbin Liu) and wg_packet_purge_staged_packets() Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-19	net: ethernet: ti: am65-cpsw: Fix error handling in am65_cpsw_nuss_common_open()	Roger Quadros
	k3_udma_glue_enable_rx/tx_chn returns error code on failure. Bail out on error while enabling TX/RX channel. In the error path, clean up the RX descriptors and SKBs. Get rid of kmemleak_not_leak() as it seems unnecessary now. Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver") Signed-off-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-19	net: ethernet: am65-cpsw: Set default TX channels to maximum	Roger Quadros
	am65-cpsw supports 8 TX hardware queues. Set this as default. The rationale is that some am65-cpsw devices can have up to 4 ethernet ports. If the number of TX channels have to be changed then all interfaces have to be brought down and up as the old default of 1 TX channel is too restrictive for any mqprio/taprio usage. Another reason for this change is to allow testing using kselftest:net/forwarding:ethtool_mm.sh out of the box. Signed-off-by: Roger Quadros <rogerq@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-19	net: ethernet: ti: am65-cpsw: Re-arrange functions to avoid forward declaration	Roger Quadros
	Re-arrange am65_cpsw_nuss_rx_cleanup(), am65_cpsw_nuss_xmit_free() and am65_cpsw_nuss_tx_cleanup() to avoid forward declaration. No functional change. Signed-off-by: Roger Quadros <rogerq@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-19	net: ethernet: am65-cpsw: Add standard Ethernet MAC stats to ethtool	Roger Quadros
	Gets 'ethtool -S eth0 --groups eth-mac' command to work. Signed-off-by: Roger Quadros <rogerq@kernel.org> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-19	net: wangxun: fix kernel panic due to null pointer	Jiawen Wu
	When the device uses a custom subsystem vendor ID, the function wx_sw_init() returns before the memory of 'wx->mac_table' is allocated. The null pointer will causes the kernel panic. Fixes: 79625f45ca73 ("net: wangxun: Move MAC address handling to libwx") Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-18	Merge branch '100GbE' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== ice: one by one port representors creation Michal Swiatkowski says: Currently ice supports creating port representors only for VFs. For that use case they can be created and removed in one step. This patchset is refactoring current flow to support port representor creation also for subfunctions and SIOV. In this case port representors need to be created and removed one by one. Also, they can be added and removed while other port representors are running. To achieve that we need to change the switchdev configuration flow. Three first patches are only cosmetic (renaming, removing not used code). Next few ones are preparation for new flow. The most important one is "add VF representor one by one". It fully implements new flow. New type of port representor (for subfunction) will be introduced in follow up patchset. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: reserve number of CP queues ice: adjust switchdev rebuild path ice: add VF representors one by one ice: realloc VSI stats arrays ice: set Tx topology every time new repr is added ice: allow changing SWITCHDEV_CTRL VSI queues ice: return pointer to representor ice: make representor code generic ice: remove VF pointer reference in eswitch code ice: track port representors in xarray ice: use repr instead of vf->repr ice: track q_id in representor ice: remove unused control VSI parameter ice: remove redundant max_vsi_num variable ice: rename switchdev to eswitch ==================== Link: https://lore.kernel.org/r/20231114181449.1290117-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-18	Merge branch '1GbE' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== igc: Add support for physical + free-running timers Vinicius Costa Gomes says: The objective is to allow having functionality that depends on the physical timer (taprio and ETF offloads, for example) and vclocks operating together. The "big" missing piece is the implementation of the .getcyclesx64() function in igc, as i225/i226 have multiple timers, we use one of those timers (timer 1) as a free-running (non adjustable) timer. The complication is that only implementing .getcyclesx64() and nothing else will break synchronization when using vclocks, as reading the clock will retrieve the free-running value but timnestamps will come from the adjustable timer. The solution is to modify "in one go" the timestamping code to be able to retrieve the timestamp from the correct timer (if a socket is "phc_bound" to a vclock the timestamp will come from the free-running timer). I was debating whether or not to do the adjustments for the internal latencies for the free-running timestamps, decided to do the adjustments so the path delay when using vclocks is similar to the one when using the physical clock. One future improvement is to implement the .getcrosscycles() function. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: igc: Add support for PTP .getcyclesx64() igc: Simplify setting flags in the TX data descriptor ==================== Link: https://lore.kernel.org/r/20231114183640.1303163-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-18	net: partial revert of the "Make timestamping selectable: series	Jakub Kicinski
	Revert following commits: commit acec05fb78ab ("net_tstamp: Add TIMESTAMPING SOFTWARE and HARDWARE mask") commit 11d55be06df0 ("net: ethtool: Add a command to expose current time stamping layer") commit bb8645b00ced ("netlink: specs: Introduce new netlink command to get current timestamp") commit d905f9c75329 ("net: ethtool: Add a command to list available time stamping layers") commit aed5004ee7a0 ("netlink: specs: Introduce new netlink command to list available time stamping layers") commit 51bdf3165f01 ("net: Replace hwtstamp_source by timestamping layer") commit 0f7f463d4821 ("net: Change the API of PHY default timestamp to MAC") commit 091fab122869 ("net: ethtool: ts: Update GET_TS to reply the current selected timestamp") commit 152c75e1d002 ("net: ethtool: ts: Let the active time stamping layer be selectable") commit ee60ea6be0d3 ("netlink: specs: Introduce time stamping set command") They need more time for reviews. Link: https://lore.kernel.org/all/20231118183529.6e67100c@kernel.org/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-18	r8169: improve RTL8411b phy-down fixup	Heiner Kallweit
	Mirsad proposed a patch to reduce the number of spinlock lock/unlock operations and the function code size. This can be further improved because the function sets a consecutive register block. Suggested-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Mirsad Todorovac <mirsad.todorovac@alu.unizg.hr> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-18	Merge tag 'mlx5-updates-2023-11-13' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux mlx5-updates-2023-11-13 1) Cleanup patches, leftovers from previous cycle 2) Allow sync reset flow when BF MGT interface device is present 3) Trivial ptp refactorings and improvements 4) Add local loopback counter to vport rep stats Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-18	mlxsw: pci: Implement PCI reset handlers	Ido Schimmel
	Implement reset_prepare() and reset_done() handlers that are invoked by the PCI core before and after issuing a PCI reset, respectively. Specifically, implement reset_prepare() by calling mlxsw_core_bus_device_unregister() and reset_done() by calling mlxsw_core_bus_device_register(). This is the same implementation as the reload_{down,up}() devlink operations with the following differences: 1. The devlink instance is unregistered and then registered again after the reset. 2. A reset via the device's command interface (using MRSR register) is not issued during reset_done() as PCI core already issued a PCI reset. Tested: # for i in $(seq 1 10); do echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/reset; done Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-18	mlxsw: pci: Add support for new reset flow	Ido Schimmel
	The driver resets the device during probe and during a devlink reload. The current reset method reloads the current firmware version or a pending one, if one was previously flashed using devlink. However, the current reset method does not result in a PCI hot reset, preventing the PCI firmware from being upgraded, unless the system is rebooted. To solve this problem, a new reset command (6) was implemented in the firmware. Unlike the current command (1), after issuing the new command the device will not start the reset immediately, but only after a PCI hot reset. Implement the new reset method by first verifying that it is supported by the current firmware version by querying the Management Capabilities Mask (MCAM) register. If supported, issue the new reset command (6) via MRSR register followed by a PCI reset by calling __pci_reset_function_locked(). Once the PCI firmware is operational, go back to the regular reset flow and wait for the entire device to become ready. That is, repeatedly read the "system_status" register from the BAR until a value of "FW_READY" (0x5E) appears. Tested: # for i in $(seq 1 10); do devlink dev reload pci/0000:01:00.0; done Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-18	mlxsw: pci: Move software reset code to a separate function	Amit Cohen
	In general, the existing flow of software reset in the driver is: 1. Wait for system ready status. 2. Send MRSR command, to start the reset. 3. Wait for system ready status. This flow will be extended once a new reset command is supported. As a preparation, move step #2 to a separate function. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-18	mlxsw: pci: Rename mlxsw_pci_sw_reset()	Amit Cohen
	In the next patches, mlxsw_pci_sw_reset() will be extended to support more reset types and will not necessarily issue a software reset. Rename the function to reflect that. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-18	mlxsw: Extend MRSR pack() function to support new commands	Amit Cohen
	Currently mlxsw_reg_mrsr_pack() always sets 'command=1'. As preparation for support of new reset flow, pass the command as an argument to the function and add an enum for this field. For now, always pass 'command=1' to the pack() function. Signed-off-by: Amit Cohen <amcohen@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-18	net: Change the API of PHY default timestamp to MAC	Kory Maincent
	Change the API to select MAC default time stamping instead of the PHY. Indeed the PHY is closer to the wire therefore theoretically it has less delay than the MAC timestamping but the reality is different. Due to lower time stamping clock frequency, latency in the MDIO bus and no PHC hardware synchronization between different PHY, the PHY PTP is often less precise than the MAC. The exception is for PHY designed specially for PTP case but these devices are not very widespread. For not breaking the compatibility I introduce a default_timestamp flag in phy_device that is set by the phy driver to know we are using the old API behavior. The phy_set_timestamp function is called at each call of phy_attach_direct. In case of MAC driver using phylink this function is called when the interface is turned up. Then if the interface goes down and up again the last choice of timestamp will be overwritten by the default choice. A solution could be to cache the timestamp status but it can bring other issues. In case of SFP, if we change the module, it doesn't make sense to blindly re-set the timestamp back to PHY, if the new module has a PHY with mediocre timestamping capabilities. Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-18	net: Replace hwtstamp_source by timestamping layer	Kory Maincent
	Replace hwtstamp_source which is only used by the kernel_hwtstamp_config structure by the more widely use timestamp_layer structure. This is done to prepare the support of selectable timestamping source. Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-18	net: phy: micrel: fix ts_info value in case of no phc	Kory Maincent
	In case of no phc we should not return SOFTWARE TIMESTAMPING flags as we do not know whether the netdev supports of timestamping. Remove it from the lan8841_ts_info and simply return 0. Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-18	net: macb: Convert to ndo_hwtstamp_get() and ndo_hwtstamp_set()	Kory Maincent
	The hardware timestamping through ndo_eth_ioctl() is going away. Convert the macb driver to the new API before that can be removed. Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-18	net: ethtool: Refactor identical get_ts_info implementations.	Richard Cochran
	The vlan, macvlan and the bonding drivers call their "real" device driver in order to report the time stamping capabilities. Provide a core ethtool helper function to avoid copy/paste in the stack. Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-18	net: phy: Remove the call to phy_mii_ioctl in phy_hwstamp_get/set	Kory Maincent
	__phy_hwtstamp_set function were calling the phy_mii_ioctl function which will then use the ifreq pointer to call the hwtstamp callback. Now that ifreq has been removed from the hwstamp callback parameters it seems more logical to not go through the phy_mii_ioctl function and pass directly kernel_hwtstamp_config parameter to the hwtstamp callback. Lets do the same for __phy_hwtstamp_get function and return directly EOPNOTSUPP as SIOCGHWTSTAMP is not supported for now for the PHYs. Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-18	net: Convert PHYs hwtstamp callback to use kernel_hwtstamp_config	Kory Maincent
	The PHYs hwtstamp callback are still getting the timestamp config from ifreq and using copy_from/to_user. Get rid of these functions by using timestamp configuration in parameter. This also allow to move on to kernel_hwtstamp_config and be similar to net devices using the new ndo_hwstamp_get/set. This adds the possibility to manipulate the timestamp configuration from the kernel which was not possible with the copy_from/to_user. Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-17	gve: add gve_features_check()	Eric Dumazet
	It is suboptimal to attempt skb linearization from ndo_start_xmit() if a gso skb has pathological layout, or if host stack does not have access to the payload (TCP direct). Linearization of large skbs can also fail under memory pressure. We should instead have an ndo_features_check() so that we can fallback to GSO, which is supported even for TCP direct, and generally much more efficient (no payload copy). Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Bailey Forrest <bcf@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Jeroen de Borst <jeroendb@google.com> Cc: Praveen Kaligineedi <pkaligineedi@google.com> Cc: Shailend Chand <shailend@google.com> Cc: Ziwei Xiao <ziweixiao@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-17	net: phy: broadcom: Wire suspend/resume for BCM54612E	Marco von Rosenberg
	The BCM54612E ethernet PHY supports IDDQ-SR. Therefore wire-up the suspend and resume callbacks to point to bcm54xx_suspend() and bcm54xx_resume(). Signed-off-by: Marco von Rosenberg <marcovr@selfnet.de> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-16	stmmac: dwmac-loongson: Add architecture dependency	Jean Delvare
	Only present the DWMAC_LOONGSON option on architectures where it can actually be used. This follows the same logic as the DWMAC_INTEL option. Signed-off-by: Jean Delvare <jdelvare@suse.de> Cc: Keguang Zhang <keguang.zhang@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-16	net: sfp: use linkmode_*() rather than open coding	Russell King (Oracle)
	Use the linkmode_*() helpers rather than open coding the calls to the bitmap operators. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-16	net: phylink: use linkmode_fill()	Russell King (Oracle)
	Use linkmode_fill() rather than open coding the bitmap operation. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-16	usb: aqc111: check packet for fixup for true limit	Oliver Neukum
	If a device sends a packet that is inbetween 0 and sizeof(u64) the value passed to skb_trim() as length will wrap around ending up as some very large value. The driver will then proceed to parse the header located at that position, which will either oops or process some random value. The fix is to check against sizeof(u64) rather than 0, which the driver currently does. The issue exists since the introduction of the driver. Signed-off-by: Oliver Neukum <oneukum@suse.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-16	vxlan: add support for flowlabel inherit	Alce Lafranque
	By default, VXLAN encapsulation over IPv6 sets the flow label to 0, with an option for a fixed value. This commits add the ability to inherit the flow label from the inner packet, like for other tunnel implementations. This enables devices using only L3 headers for ECMP to correctly balance VXLAN-encapsulated IPv6 packets. ``` $ ./ip/ip link add dummy1 type dummy $ ./ip/ip addr add 2001:db8::2/64 dev dummy1 $ ./ip/ip link set up dev dummy1 $ ./ip/ip link add vxlan1 type vxlan id 100 flowlabel inherit remote 2001:db8::1 local 2001:db8::2 $ ./ip/ip link set up dev vxlan1 $ ./ip/ip addr add 2001:db8:1::2/64 dev vxlan1 $ ./ip/ip link set arp off dev vxlan1 $ ping -q 2001:db8:1::1 & $ tshark -d udp.port==8472,vxlan -Vpni dummy1 -c1 [...] Internet Protocol Version 6, Src: 2001:db8::2, Dst: 2001:db8::1 0110 .... = Version: 6 .... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT) .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0) .... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0) .... 1011 0001 1010 1111 1011 = Flow Label: 0xb1afb [...] Virtual eXtensible Local Area Network Flags: 0x0800, VXLAN Network ID (VNI) Group Policy ID: 0 VXLAN Network Identifier (VNI): 100 [...] Internet Protocol Version 6, Src: 2001:db8:1::2, Dst: 2001:db8:1::1 0110 .... = Version: 6 .... 0000 0000 .... .... .... .... .... = Traffic Class: 0x00 (DSCP: CS0, ECN: Not-ECT) .... 0000 00.. .... .... .... .... .... = Differentiated Services Codepoint: Default (0) .... .... ..00 .... .... .... .... .... = Explicit Congestion Notification: Not ECN-Capable Transport (0) .... 1011 0001 1010 1111 1011 = Flow Label: 0xb1afb ``` Signed-off-by: Alce Lafranque <alce@lafranque.net> Co-developed-by: Vincent Bernat <vincent@bernat.ch> Signed-off-by: Vincent Bernat <vincent@bernat.ch> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-16	net: phy: aquantia: add firmware load support	Robert Marko
	Aquantia PHY-s require firmware to be loaded before they start operating. It can be automatically loaded in case when there is a SPI-NOR connected to Aquantia PHY-s or can be loaded from the host via MDIO. This patch adds support for loading the firmware via MDIO as in most cases there is no SPI-NOR being used to save on cost. Firmware loading code itself is ported from mainline U-boot with cleanups. The firmware has mixed values both in big and little endian. PHY core itself is big-endian but it expects values to be in little-endian. The firmware is little-endian but CRC-16 value for it is stored at the end of firmware in big-endian. It seems the PHY does the conversion internally from firmware that is little-endian to the PHY that is big-endian on using the mailbox but mailbox returns a big-endian CRC-16 to verify the written data integrity. Co-developed-by: Christian Marangi <ansuelsmth@gmail.com> Signed-off-by: Robert Marko <robimarko@gmail.com> Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-16	net: phy: aquantia: move MMD_VEND define to header	Christian Marangi
	Move MMD_VEND define to header to clean things up and in preparation for firmware loading support that require some define placed in aquantia_main. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-16	net: phy: aquantia: move to separate directory	Christian Marangi
	Move aquantia PHY driver to separate driectory in preparation for firmware loading support to keep things tidy. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-16	octeon_ep: remove atomic variable usage in Tx data path	Shinas Rasheed
	Replace atomic variable "instr_pending" which represents number of posted tx instructions pending completion, with host_write_idx and flush_index variables in the xmit and completion processing respectively. Signed-off-by: Shinas Rasheed <srasheed@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-16	octeon_ep: implement xmit_more in transmit	Shinas Rasheed
	Add xmit_more handling in tx datapath for octeon_ep pf. Signed-off-by: Shinas Rasheed <srasheed@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-16	octeon_ep: remove dma sync in trasmit path	Shinas Rasheed
	Cleanup dma sync calls for scatter gather mappings, since they are coherent allocations and do not need explicit sync to be called. Signed-off-by: Shinas Rasheed <srasheed@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-11-16	octeon_ep: add padding for small packets	Shinas Rasheed
	Pad small packets to ETH_ZLEN before transmit, as hardware cannot pad and requires software padding to ensure minimum ethernet frame length. Signed-off-by: Shinas Rasheed <srasheed@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>