linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2022-12-20	net: fec: check the return value of build_skb()	Wei Fang
	The build_skb might return a null pointer but there is no check on the return value in the fec_enet_rx_queue(). So a null pointer dereference might occur. To avoid this, we check the return value of build_skb. If the return value is a null pointer, the driver will recycle the page and update the statistic of ndev. Then jump to rx_processing_done to clear the status flags of the BD so that the hardware can recycle the BD. Fixes: 95698ff6177b ("net: fec: using page pool to manage RX buffers") Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Shenwei Wang <Shenwei.wang@nxp.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://lore.kernel.org/r/20221219022755.1047573-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-14	net: enetc: avoid buffer leaks on xdp_do_redirect() failure	Vladimir Oltean
	Before enetc_clean_rx_ring_xdp() calls xdp_do_redirect(), each software BD in the RX ring between index orig_i and i can have one of 2 refcount values on its page. We are the owner of the current buffer that is being processed, so the refcount will be at least 1. If the current owner of the buffer at the diametrically opposed index in the RX ring (i.o.w, the other half of this page) has not yet called kfree(), this page's refcount could even be 2. enetc_page_reusable() in enetc_flip_rx_buff() tests for the page refcount against 1, and [ if it's 2 ] does not attempt to reuse it. But if enetc_flip_rx_buff() is put after the xdp_do_redirect() call, the page refcount can have one of 3 values. It can also be 0, if there is no owner of the other page half, and xdp_do_redirect() for this buffer ran so far that it triggered a flush of the devmap/cpumap bulk queue, and the consumers of those bulk queues also freed the buffer, all by the time xdp_do_redirect() returns the execution back to enetc. This is the reason why enetc_flip_rx_buff() is called before xdp_do_redirect(), but there is a big flaw with that reasoning: enetc_flip_rx_buff() will set rx_swbd->page = NULL on both sides of the enetc_page_reusable() branch, and if xdp_do_redirect() returns an error, we call enetc_xdp_free(), which does not deal gracefully with that. In fact, what happens is quite special. The page refcounts start as 1. enetc_flip_rx_buff() figures they're reusable, transfers these rx_swbd->page pointers to a different rx_swbd in enetc_reuse_page(), and bumps the refcount to 2. When xdp_do_redirect() later returns an error, we call the no-op enetc_xdp_free(), but we still haven't lost the reference to that page. A copy of it is still at rx_ring->next_to_alloc, but that has refcount 2 (and there are no concurrent owners of it in flight, to drop the refcount). What really kills the system is when we'll flip the rx_swbd->page the second time around. With an updated refcount of 2, the page will not be reusable and we'll really leak it. Then enetc_new_page() will have to allocate more pages, which will then eventually leak again on further errors from xdp_do_redirect(). The problem, summarized, is that we zeroize rx_swbd->page before we're completely done with it, and this makes it impossible for the error path to do something with it. Since the packet is potentially multi-buffer and therefore the rx_swbd->page is potentially an array, manual passing of the old pointers between enetc_flip_rx_buff() and enetc_xdp_free() is a bit difficult. For the sake of going with a simple solution, we accept the possibility of racing with xdp_do_redirect(), and we move the flip procedure to execute only on the redirect success path. By racing, I mean that the page may be deemed as not reusable by enetc (having a refcount of 0), but there will be no leak in that case, either. Once we accept that, we have something better to do with buffers on XDP_REDIRECT failure. Since we haven't performed half-page flipping yet, we won't, either (and this way, we can avoid enetc_xdp_free() completely, which gives the entire page to the slab allocator). Instead, we'll call enetc_xdp_drop(), which will recycle this half of the buffer back to the RX ring. Fixes: 9d2b68cc108d ("net: enetc: add support for XDP_REDIRECT") Suggested-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20221213001908.2347046-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-08	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-07	dpaa2-switch: Fix memory leak in dpaa2_switch_acl_entry_add() and ↵	Yuan Can
	dpaa2_switch_acl_entry_remove() The cmd_buff needs to be freed when error happened in dpaa2_switch_acl_entry_add() and dpaa2_switch_acl_entry_remove(). Fixes: 1110318d83e8 ("dpaa2-switch: add tc flower hardware offload on ingress traffic") Signed-off-by: Yuan Can <yuancan@huawei.com> Link: https://lore.kernel.org/r/20221205061515.115012-1-yuancan@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-06	net: fec: properly guard irq coalesce setup	Rasmus Villemoes
	Prior to the Fixes: commit, the initialization code went through the same fec_enet_set_coalesce() function as used by ethtool, and that function correctly checks whether the current variant has support for irq coalescing. Now that the initialization code instead calls fec_enet_itr_coal_set() directly, that call needs to be guarded by a check for the FEC_QUIRK_HAS_COALESCE bit. Fixes: df727d4547de (net: fec: don't reset irq coalesce settings to defaults on "ip link up") Reported-by: Greg Ungerer <gregungerer@westnet.com.au> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20221205204604.869853-1-linux@rasmusvillemoes.dk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-01	net: dpaa2-mac: move rtnl_lock() only around phylink_{,dis}connect_phy()	Vladimir Oltean
	After the introduction of a private mac_lock that serializes access to priv->mac (and port_priv->mac in the switch), the only remaining purpose of rtnl_lock() is to satisfy the locking requirements of phylink_fwnode_phy_connect() and phylink_disconnect_phy(). But the functions these live in, dpaa2_mac_connect() and dpaa2_mac_disconnect(), have contradictory locking requirements. While phylink_fwnode_phy_connect() wants rtnl_lock() to be held, phylink_create() wants it to not be held. Move the rtnl_lock() from top-level (in the dpaa2-eth and dpaa2-switch drivers) to only surround the phylink calls that require it, in the dpaa2-mac library code. This is possible because dpaa2_mac_connect() and dpaa2_mac_disconnect() run unlocked, and there isn't any danger of an AB/BA deadlock between the rtnl_mutex and other private locks. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01	net: dpaa2-switch: serialize changes to priv->mac with a mutex	Vladimir Oltean
	The dpaa2-switch driver uses a DPMAC in the same way as the dpaa2-eth driver, so we need to duplicate the locking solution established by the previous change to the switch driver as well. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01	net: dpaa2-eth: serialize changes to priv->mac with a mutex	Vladimir Oltean
	The dpaa2 architecture permits dynamic connections between objects on the fsl-mc bus, specifically between a DPNI object (represented by a struct net_device) and a DPMAC object (represented by a struct phylink). The DPNI driver is notified when those connections are created/broken through the dpni_irq0_handler_thread() method. To ensure that ethtool operations, as well as netdev up/down operations serialize with the connection/disconnection of the DPNI with a DPMAC, dpni_irq0_handler_thread() takes the rtnl_lock() to block those other operations from taking place. There is code called by dpaa2_mac_connect() which wants to acquire the rtnl_mutex once again, see phylink_create() -> phylink_register_sfp() -> sfp_bus_add_upstream() -> rtnl_lock(). So the strategy doesn't quite work out, even though it's fairly simple. Create a different strategy, where all code paths in the dpaa2-eth driver access priv->mac only while they are holding priv->mac_lock. The phylink instance is not created or connected to the PHY under the priv->mac_lock, but only assigned to priv->mac then. This will eliminate the reliance on the rtnl_mutex. Add lockdep annotations and put comments where holding the lock is not necessary, and priv->mac can be dereferenced freely. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01	net: dpaa2-eth: connect to MAC before requesting the "endpoint changed" IRQ	Vladimir Oltean
	dpaa2_eth_connect_mac() is called both from dpaa2_eth_probe() and from dpni_irq0_handler_thread(). It could happen that the DPNI gets connected to a DPMAC on the fsl-mc bus exactly during probe, as soon as the "endpoint change" interrupt is requested in dpaa2_eth_setup_irqs(). This will cause the dpni_irq0_handler_thread() to register a phylink instance for that DPMAC. Then, the probing function will also try to register a phylink instance for the same DPMAC, operation which should fail (and this will fail the probing of the driver). Reorder dpaa2_eth_setup_irqs() and dpaa2_eth_connect_mac(), such that dpni_irq0_handler_thread() never races with the DPMAC-related portion of the probing path. Also reorder dpaa2_eth_disconnect_mac() to be in the mirror position of dpaa2_eth_connect_mac() in the teardown path. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01	net: dpaa2-switch replace direct MAC access with dpaa2_switch_port_has_mac()	Vladimir Oltean
	The helper function will gain a lockdep annotation in a future patch. Make sure to benefit from it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01	net: dpaa2: publish MAC stringset to ethtool -S even if MAC is missing	Vladimir Oltean
	DPNIs and DPSW objects can connect and disconnect at runtime from DPMAC objects on the same fsl-mc bus. The DPMAC object also holds "ethtool -S" unstructured counters. Those counters are only shown for the entity owning the netdev (DPNI, DPSW) if it's connected to a DPMAC. The ethtool stringset code path is split into multiple callbacks, but currently, connecting and disconnecting the DPMAC takes the rtnl_lock(). This blocks the entire ethtool code path from running, see ethnl_default_doit() -> rtnl_lock() -> ops->prepare_data() -> strset_prepare_data(). This is going to be a problem if we are going to no longer require rtnl_lock() when connecting/disconnecting the DPMAC, because the DPMAC could appear between ops->get_sset_count() and ops->get_strings(). If it appears out of the blue, we will provide a stringset into an array that was dimensioned thinking the DPMAC wouldn't be there => array accessed out of bounds. There isn't really a good way to work around that, and I don't want to put too much pressure on the ethtool framework by playing locking games. Just make the DPMAC counters be always available. They'll be zeroes if the DPNI or DPSW isn't connected to a DPMAC. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01	net: dpaa2-switch: assign port_priv->mac after dpaa2_mac_connect() call	Vladimir Oltean
	The dpaa2-switch has the exact same locking requirements when connected to a DPMAC, so it needs port_priv->mac to always point either to NULL, or to a DPMAC with a fully initialized phylink instance. Make the same preparatory change in the dpaa2-switch driver as in the dpaa2-eth one. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01	net: dpaa2-eth: assign priv->mac after dpaa2_mac_connect() call	Vladimir Oltean
	There are 2 requirements for correct code: - Any time the driver accesses the priv->mac pointer at runtime, it either holds NULL to indicate a DPNI-DPNI connection (or unconnected DPNI), or a struct dpaa2_mac whose phylink instance was fully initialized (created and connected to the PHY). No changes are made to priv->mac while it is being used. Currently, rtnl_lock() watches over the call to dpaa2_eth_connect_mac(), so it serves the purpose of serializing this with all readers of priv->mac. - dpaa2_mac_connect() should run unlocked, because inside it are 2 phylink calls with incompatible locking requirements: phylink_create() requires that the rtnl_mutex isn't held, and phylink_fwnode_phy_connect() requires that the rtnl_mutex is held. The only way to solve those contradictory requirements is to let dpaa2_mac_connect() take rtnl_lock() when it needs to. To solve both requirements, we need to identify the writer side of the priv->mac pointer, which can be wrapped in a mutex private to the driver in a future patch. The dpaa2_mac_connect() cannot be part of the writer side critical section, because of an AB/BA deadlock with rtnl_lock(). So the strategy needs to be that where we prepare the DPMAC by calling dpaa2_mac_connect(), and only make priv->mac point to it once it's fully prepared. This ensures that the writer side critical section has the absolute minimum surface it can. The reverse strategy is adopted in the dpaa2_eth_disconnect_mac() code path. This makes sure that priv->mac is NULL when we start tearing down the DPMAC that we disconnected from, and concurrent code will simply not see it. No locking changes in this patch (concurrent code is still blocked by the rtnl_mutex). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01	net: dpaa2-mac: remove defensive check in dpaa2_mac_disconnect()	Vladimir Oltean
	dpaa2_mac_disconnect() will only be called with a NULL mac->phylink if dpaa2_mac_connect() failed, or was never called. The callers are these: dpaa2_eth_disconnect_mac(): if (dpaa2_eth_is_type_phy(priv)) dpaa2_mac_disconnect(priv->mac); dpaa2_switch_port_disconnect_mac(): if (dpaa2_switch_port_is_type_phy(port_priv)) dpaa2_mac_disconnect(port_priv->mac); priv->mac can be NULL, but in that case, dpaa2_eth_is_type_phy() returns false, and dpaa2_mac_disconnect() is never called. Similar for dpaa2-switch. When priv->mac is non-NULL, it means that dpaa2_mac_connect() returned zero (success), and therefore, priv->mac->phylink is also a valid pointer. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01	net: dpaa2-mac: absorb phylink_start() call into dpaa2_mac_start()	Vladimir Oltean
	The phylink handling is intended to be hidden inside the dpaa2_mac object. Move the phylink_start() call into dpaa2_mac_start(), and phylink_stop() into dpaa2_mac_stop(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01	net: dpaa2: replace dpaa2_mac_is_type_fixed() with dpaa2_mac_is_type_phy()	Vladimir Oltean
	dpaa2_mac_is_type_fixed() is a header with no implementation and no callers, which is referenced from the documentation though. It can be deleted. On the other hand, it would be useful to reuse the code between dpaa2_eth_is_type_phy() and dpaa2_switch_port_is_type_phy(). That common code should be called dpaa2_mac_is_type_phy(), so let's create that. The removal and the addition are merged into the same patch because, in fact, is_type_phy() is the logical opposite of is_type_fixed(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-12-01	net: dpaa2-eth: don't use -ENOTSUPP error code	Vladimir Oltean
	dpaa2_eth_setup_dpni() is called from the probe path and dpaa2_eth_set_link_ksettings() is propagated to user space. include/linux/errno.h says that ENOTSUPP is "Defined for the NFSv3 protocol". Conventional wisdom has it to not use it in networking drivers. Replace it with -EOPNOTSUPP. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Ioana Ciornei <ioana.ciornei@nxp.com> Tested-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-30	net: devlink: let the core report the driver name instead of the drivers	Vincent Mailhol
	The driver name is available in device_driver::name. Right now, drivers still have to report this piece of information themselves in their devlink_ops::info_get callback function. In order to factorize code, make devlink_nl_info_fill() add the driver name attribute. Now that the core sets the driver name attribute, drivers are not supposed to call devlink_info_driver_name_put() anymore. Remove devlink_info_driver_name_put() and clean-up all the drivers using this function in their callback. Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Tested-by: Ido Schimmel <idosch@nvidia.com> # mlxsw Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-29	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	tools/lib/bpf/ringbuf.c 927cbb478adf ("libbpf: Handle size overflow for ringbuf mmap") b486d19a0ab0 ("libbpf: checkpatch: Fixed code alignments in ringbuf.c") https://lore.kernel.org/all/20221121122707.44d1446a@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-25	net: fec: don't reset irq coalesce settings to defaults on "ip link up"	Rasmus Villemoes
	Currently, when a FEC device is brought up, the irq coalesce settings are reset to their default values (1000us, 200 frames). That's unexpected, and breaks for example use of an appropriate .link file to make systemd-udev apply the desired settings (https://www.freedesktop.org/software/systemd/man/systemd.link.html), or any other method that would do a one-time setup during early boot. Refactor the code so that fec_restart() instead uses fec_enet_itr_coal_set(), which simply applies the settings that are stored in the private data, and initialize that private data with the default values. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-23	net: enetc: preserve TX ring priority across reconfiguration	Vladimir Oltean
	In the blamed commit, a rudimentary reallocation procedure for RX buffer descriptors was implemented, for the situation when their format changes between normal (no PTP) and extended (PTP). enetc_hwtstamp_set() calls enetc_close() and enetc_open() in a sequence, and this sequence loses information which was previously configured in the TX BDR Mode Register, specifically via the enetc_set_bdr_prio() call. The TX ring priority is configured by tc-mqprio and tc-taprio, and affects important things for TSN such as the TX time of packets. The issue manifests itself most visibly by the fact that isochron --txtime reports premature packet transmissions when PTP is first enabled on an enetc interface. Save the TX ring priority in a new field in struct enetc_bdr (occupies a 2 byte hole on arm64) in order to make this survive a ring reconfiguration. Fixes: 434cebabd3a2 ("enetc: Add dynamic allocation of extended Rx BD rings") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Link: https://lore.kernel.org/r/20221122130936.1704151-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18	net: fman: remove reference to non-existing config PCS	Lukas Bulwahn
	Commit a7c2a32e7f22 ("net: fman: memac: Use lynx pcs driver") makes the Freescale Data-Path Acceleration Architecture Frame Manager use lynx pcs driver by selecting PCS_LYNX. It also selects the non-existing config PCS as well, which has no effect. Remove this select to a non-existing config. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Link: https://lore.kernel.org/r/20221116102450.13928-1-lukas.bulwahn@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-14	net: dpaa2: Remove linux/msi.h includes	Thomas Gleixner
	Nothing in these file needs anything from linux/msi.h Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-14	net: fec: add xdp and page pool statistics	Shenwei Wang
	Added xdp and page pool statistics. In order to make the implementation simple and compatible, the patch uses the 32bit integer to record the XDP statistics. Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-11	ptp: convert remaining drivers to adjfine interface	Jacob Keller
	Convert all remaining drivers that still use .adjfreq to the newer .adjfine implementation. These drivers are not straightforward, as they use non-standard methods of programming their hardware. They are all converted to use scaled_ppm_to_ppb to get the parts per billion value that their logic depends on. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Cc: Ariel Elior <aelior@marvell.com> Cc: Sudarsana Kalluru <skalluru@marvell.com> Cc: Manish Chopra <manishc@marvell.com> Cc: Derek Chickles <dchickles@marvell.com> Cc: Satanand Burla <sburla@marvell.com> Cc: Felix Manlunas <fmanlunas@marvell.com> Cc: Raju Rangoju <rajur@chelsio.com> Cc: Joakim Zhang <qiangqing.zhang@nxp.com> Cc: Edward Cree <ecree.xilinx@gmail.com> Cc: Martin Habets <habetsm.xilinx@gmail.com> Cc: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-10	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	drivers/net/can/pch_can.c ae64438be192 ("can: dev: fix skb drop check") 1dd1b521be85 ("can: remove obsolete PCH CAN driver") https://lore.kernel.org/all/20221110102509.1f7d63cc@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-07	net: remove explicit phylink_generic_validate() references	Russell King (Oracle)
	Virtually all conventional network drivers are now converted to use phylink_generic_validate() - only DSA drivers and fman_memac remain, so lets remove the necessity for network drivers to explicitly set this member, and default to phylink_generic_validate() when unset. This is possible as .validate must currently be set. Any remaining instances that have not been addressed by this patch can be fixed up later. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/E1or0FZ-001tRa-DI@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-07	net: fec: simplify the code logic of quirks	Shenwei Wang
	Simplify the code logic of handling the quirk of FEC_QUIRK_HAS_RACC. If a SoC has the RACC quirk, the driver will enable the 16bit shift by default in the probe function for a better performance. This patch handles the logic in one place to make the logic simple and clean. The patch optimizes the fec_enet_xdp_get_tx_queue function according to Paolo Abeni's comments, and it also exludes the SoCs that require to do frame swap from XDP support. Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-04	net: fman: Unregister ethernet device on removal	Sean Anderson
	When the mac device gets removed, it leaves behind the ethernet device. This will result in a segfault next time the ethernet device accesses mac_dev. Remove the ethernet device when we get removed to prevent this. This is not completely reversible, since some resources aren't cleaned up properly, but that can be addressed later. Fixes: 3933961682a3 ("fsl/fman: Add FMan MAC driver") Signed-off-by: Sean Anderson <sean.anderson@seco.com> Link: https://lore.kernel.org/r/20221103182831.2248833-1-sean.anderson@seco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03	net: make drivers to use SET_NETDEV_DEVLINK_PORT to set devlink_port	Jiri Pirko
	Benefit from the previously implemented tracking of netdev events in devlink code and instead of calling devlink_port_type_eth_set() and devlink_port_type_clear() to set devlink port type and link to related netdev, use SET_NETDEV_DEVLINK_PORT() macro to assign devlink_port pointer to netdevice which is about to be registered. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03	net: fec: add initial XDP support	Shenwei Wang
	This patch adds the initial XDP support to Freescale driver. It supports XDP_PASS, XDP_DROP and XDP_REDIRECT actions. Upcoming patches will add support for XDP_TX and Zero Copy features. As the patch is rather large, the part of codes to collect the statistics is separated and will prepare a dedicated patch for that part. I just tested with the application of xdpsock. -- Native here means running command of "xdpsock -i eth0" -- SKB-Mode means running command of "xdpsock -S -i eth0" The following are the testing result relating to XDP mode: root@imx8qxpc0mek:~/bpf# ./xdpsock -i eth0 sock0@eth0:0 rxdrop xdp-drv pps pkts 1.00 rx 371347 2717794 tx 0 0 root@imx8qxpc0mek:~/bpf# ./xdpsock -S -i eth0 sock0@eth0:0 rxdrop xdp-skb pps pkts 1.00 rx 202229 404528 tx 0 0 root@imx8qxpc0mek:~/bpf# ./xdp2 eth0 proto 0: 496708 pkt/s proto 0: 505469 pkt/s proto 0: 505283 pkt/s proto 0: 505443 pkt/s proto 0: 505465 pkt/s root@imx8qxpc0mek:~/bpf# ./xdp2 -S eth0 proto 0: 0 pkt/s proto 17: 118778 pkt/s proto 17: 118989 pkt/s proto 0: 1 pkt/s proto 17: 118987 pkt/s proto 0: 0 pkt/s proto 17: 118943 pkt/s proto 17: 118976 pkt/s proto 0: 1 pkt/s proto 17: 119006 pkt/s proto 0: 0 pkt/s proto 17: 119071 pkt/s proto 17: 119092 pkt/s Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com> Reported-by: kernel test robot <lkp@intel.com> Link: https://lore.kernel.org/r/20221031185350.2045675-1-shenwei.wang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-10-31	net: dpaa2: Add some debug prints on deferred probe	Sean Anderson
	When this device is deferred, there is often no way to determine what the cause was. Add some debug prints to make it easier to figure out what is blocking the probe. Signed-off-by: Sean Anderson <sean.anderson@seco.com> Link: https://lore.kernel.org/r/20221027190005.400839-1-sean.anderson@seco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-31	net: fec: fix improper use of NETDEV_TX_BUSY	Zhang Changzhong
	The ndo_start_xmit() method must not free skb when returning NETDEV_TX_BUSY, since caller is going to requeue freed skb. Fix it by returning NETDEV_TX_OK in case of dma_map_single() fails. Fixes: 79f339125ea3 ("net: fec: Add software TSO support") Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-31	drivers: net: convert to boolean for the mac_managed_pm flag	Denis Kirjanov
	Signed-off-by: Dennis Kirjanov <dkirjanov@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-27	net: dpaa2-eth: Simplify bool conversion	Yang Li
	./drivers/net/ethernet/freescale/dpaa2/dpaa2-xsk.c:453:42-47: WARNING: conversion to bool not needed here Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2577 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20221026051824.38730-1-yang.lee@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c 2871edb32f46 ("can: kvaser_usb: Fix possible completions during init_completion") abb8670938b2 ("can: kvaser_usb_leaf: Ignore stale bus-off after start") 8d21f5927ae6 ("can: kvaser_usb_leaf: Fix improved state not being reported") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27	net: enetc: survive memory pressure without crashing	Vladimir Oltean
	Under memory pressure, enetc_refill_rx_ring() may fail, and when called during the enetc_open() -> enetc_setup_rxbdr() procedure, this is not checked for. An extreme case of memory pressure will result in exactly zero buffers being allocated for the RX ring, and in such a case it is expected that hardware drops all RX packets due to lack of buffers. This does not happen, because the reset-default value of the consumer and produces index is 0, and this makes the ENETC think that all buffers have been initialized and that it owns them (when in reality none were). The hardware guide explains this best: \| Configure the receive ring producer index register RBaPIR with a value \| of 0. The producer index is initially configured by software but owned \| by hardware after the ring has been enabled. Hardware increments the \| index when a frame is received which may consume one or more BDs. \| Hardware is not allowed to increment the producer index to match the \| consumer index since it is used to indicate an empty condition. The ring \| can hold at most RBLENR[LENGTH]-1 received BDs. \| \| Configure the receive ring consumer index register RBaCIR. The \| consumer index is owned by software and updated during operation of the \| of the BD ring by software, to indicate that any receive data occupied \| in the BD has been processed and it has been prepared for new data. \| - If consumer index and producer index are initialized to the same \| value, it indicates that all BDs in the ring have been prepared and \| hardware owns all of the entries. \| - If consumer index is initialized to producer index plus N, it would \| indicate N BDs have been prepared. Note that hardware cannot start if \| only a single buffer is prepared due to the restrictions described in \| (2). \| - Software may write consumer index to match producer index anytime \| while the ring is operational to indicate all received BDs prior have \| been processed and new BDs prepared for hardware. Normally, the value of rx_ring->rcir (consumer index) is brought in sync with the rx_ring->next_to_use software index, but this only happens if page allocation ever succeeded. When PI==CI==0, the hardware appears to receive frames and write them to DMA address 0x0 (?!), then set the READY bit in the BD. The enetc_clean_rx_ring() function (and its XDP derivative) is naturally not prepared to handle such a condition. It will attempt to process those frames using the rx_swbd structure associated with index i of the RX ring, but that structure is not fully initialized (enetc_new_page() does all of that). So what happens next is undefined behavior. To operate using no buffer, we must initialize the CI to PI + 1, which will block the hardware from advancing the CI any further, and drop everything. The issue was seen while adding support for zero-copy AF_XDP sockets, where buffer memory comes from user space, which can even decide to supply no buffers at all (example: "xdpsock --txonly"). However, the bug is present also with the network stack code, even though it would take a very determined person to trigger a page allocation failure at the perfect time (a series of ifup/ifdown under memory pressure should eventually reproduce it given enough retries). Fixes: d4fd0404c1c9 ("enetc: Introduce basic PF and VF ENETC ethernet drivers") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com> Link: https://lore.kernel.org/r/20221027182925.3256653-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-26	net: fec: limit register access on i.MX6UL	Juergen Borleis
	Using 'ethtool -d […]' on an i.MX6UL leads to a kernel crash: Unhandled fault: external abort on non-linefetch (0x1008) at […] due to this SoC has less registers in its FEC implementation compared to other i.MX6 variants. Thus, a run-time decision is required to avoid access to non-existing registers. Fixes: a51d3ab50702 ("net: fec: use a more proper compatible string for i.MX6UL type device") Signed-off-by: Juergen Borleis <jbe@pengutronix.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20221024080552.21004-1-jbe@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-24	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	include/linux/net.h a5ef058dc4d9 ("net: introduce and use custom sockopt socket flag") e993ffe3da4b ("net: flag sockets supporting msghdr originated zerocopy") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-24	net: fec: Add support for periodic output signal of PPS	Wei Fang
	This patch adds the support for configuring periodic output signal of PPS. So the PPS can be output at a specified time and period. For developers or testers, they can use the command "echo <channel> <start.sec> <start.nsec> <period.sec> <period. nsec> > /sys/class/ptp/ptp0/period" to specify time and period to output PPS signal. Notice that, the channel can only be set to 0. In addtion, the start time must larger than the current PTP clock time. So users can use the command "phc_ctl /dev/ptp0 -- get" to get the current PTP clock time before. Signed-off-by: Wei Fang <wei.fang@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24	net: fman: Use physical address for userspace interfaces	Sean Anderson
	Before 262f2b782e25 ("net: fman: Map the base address once"), the physical address of the MAC was exposed to userspace in two places: via sysfs and via SIOCGIFMAP. While this is not best practice, it is an external ABI which is in use by userspace software. The aforementioned commit inadvertently modified these addresses and made them virtual. This constitutes and ABI break. Additionally, it leaks the kernel's memory layout to userspace. Partially revert that commit, reintroducing the resource back into struct mac_device, while keeping the intended changes (the rework of the address mapping). Fixes: 262f2b782e25 ("net: fman: Map the base address once") Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Sean Anderson <sean.anderson@seco.com> Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24	net: dpaa2-eth: add trace points on XSK events	Robert-Ionut Alexa
	Define the dpaa2_tx_xsk_fd and dpaa2_rx_xsk_fd trace events for the XSK zero-copy Rx and Tx path. Also, define the dpaa2_eth_buf as an event class so that both dpaa2_eth_buf_seed and dpaa2_xsk_buf_seed traces can derive from the same class. Signed-off-by: Robert-Ionut Alexa <robert-ionut.alexa@nxp.com> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24	net: dpaa2-eth: AF_XDP TX zero copy support	Robert-Ionut Alexa
	Add support in dpaa2-eth for packet processing on the Tx path using AF_XDP zero copy mode. The newly added dpaa2_xsk_tx() function will handle enqueuing AF_XDP Tx packets into the appropriate queue and update any necessary statistics. On a more detailed note, the dpaa2_xsk_tx_build_fd() function handles creating a Scatter-Gather frame descriptor with only one data buffer. This is needed because otherwise we would need to impose a headroom in the Tx buffer to store our software annotation structures. This tactic is already used on the normal data path of the dpaa2-eth driver, thus we are reusing the dpaa2_eth_sgt_get/dpaa2_eth_sgt_recycle functions in order to allocate and recycle the Scatter-Gather table buffers. In case we have reached the maximum number of Tx XSK packets to be sent in a NAPI cycle, we'll exit the dpaa2_eth_poll() and hope to be rescheduled again. On the XSK Tx confirmation path, we are just unmapping the SGT buffer and recycle it for further use. Signed-off-by: Robert-Ionut Alexa <robert-ionut.alexa@nxp.com> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24	net: dpaa2-eth: AF_XDP RX zero copy support	Robert-Ionut Alexa
	This patch adds the support for receiving packets via the AF_XDP zero-copy mechanism in the dpaa2-eth driver. The support is available only on the LX2160A SoC and variants because we are relying on the HW capability to associate a buffer pool to a specific queue (QDBIN), only available on newer WRIOP versions. On the control path, the dpaa2_xsk_enable_pool() function is responsible to allocate a buffer pool (BP), setup this new BP to be used only on the requested queue and change the consume function to point to the XSK ZC one. We are forced to call dev_close() in order to change the queue to buffer pool association (dpaa2_xsk_set_bp_per_qdbin) . This also works in our favor since at dev_close() the buffer pools will be drained and at the later dev_open() call they will be again seeded, this time with buffers allocated from the XSK pool if needed. On the data path, a new software annotation type is defined to be used only for the XSK scenarios. This will enable us to pass keep necessary information about a packet buffer between the moment in which it was seeded and when it's received by the driver. In the XSK case, we are keeping the associated xdp_buff. Depending on the action returned by the BPF program, we will do the following: - XDP_PASS: copy the contents of the packet into a brand new skb, recycle the initial buffer. - XDP_TX: just enqueue the same frame descriptor back into the Tx path, the buffer will get automatically released into the initial BP. - XDP_REDIRECT: call xdp_do_redirect() and exit. Signed-off-by: Robert-Ionut Alexa <robert-ionut.alexa@nxp.com> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24	net: dpaa2-eth: create and export the dpaa2_eth_receive_skb() function	Robert-Ionut Alexa
	Carve out code from the dpaa2_eth_rx() function in order to create and export the dpaa2_eth_receive_skb() function. Do this in order to reuse this code also from the XSK path which will be introduced in a later patch. Signed-off-by: Robert-Ionut Alexa <robert-ionut.alexa@nxp.com> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24	net: dpaa2-eth: create and export the dpaa2_eth_alloc_skb function	Robert-Ionut Alexa
	The dpaa2_eth_alloc_skb() function is added by moving code from the dpaa2_eth_copybreak() previously defined function. What the new API does is to allocate a new skb, copy the frame data from the passed FD to the new skb and then return the skb. Export this new function since we'll need the this functionality also from the XSK code path. Signed-off-by: Robert-Ionut Alexa <robert-ionut.alexa@nxp.com> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24	net: dpaa2-eth: use dev_close/open instead of the internal functions	Ioana Ciornei
	Instead of calling the internal functions which implement .ndo_stop and .ndo_open, we can simply call dev_close and dev_open, so that we keep the code cleaner. Also, in the next patches we'll use the same APIs from other files without needing to export the internal functions. Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24	net: dpaa2-eth: update the dpni_set_pools() API to support per QDBIN pools	Robert-Ionut Alexa
	Update the dpni_set_pool() firmware API so that in the next patches we can configure per Rx queue (per QDBIN) buffer pools. This is a hard requirement of the AF_XDP, thus we need the newer API version. Signed-off-by: Robert-Ionut Alexa <robert-ionut.alexa@nxp.com> Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-24	net: dpaa2-eth: export buffer pool info into a new debugfs file	Ioana Ciornei
	Export the allocated buffer pools, the number of buffers that they have currently and which channels are using which BP. The output looks like below: Buffer pool info for eth2: IDX BPID Buf count CH#0 CH#1 CH#2 CH#3 BP#0 1 5124 x x x x Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>