summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-10-03octeontx2-af: cn10k: mcs: Manage the MCS block hardware resourcesGeetha sowjanya
To establish a macsec connection association netdev driver needs hardware resources like SecY, TCAM flows, SCs and SAs. This patch manages allocating, freeing and configuring those resources. AF consumers can request resources and configure them via these mailbox messages. AF can allocate until it runs out of hardware resources. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03octeontx2-af: cn10k: mcs: Add mailboxes for port related operationsGeetha sowjanya
There are set of configurations to be done at MCS port level like bringing port out of reset, making port as operational or bypass. This patch adds all the port related mailbox message handlers so that AF consumers can use them. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03octeontx2-af: cn10k: Introduce driver for macsec block.Geetha sowjanya
CN10K-B and CNF10K-B has macsec block(MCS) to encrypt and decrypt packets at MAC level. This block is a global resource with hardware resources like SecYs, SCs and SAs and is in between NIX block and RPM LMAC. CN10K-B silicon has only one MCS block which receives packets from all LMACS whereas CNF10K-B has seven MCS blocks for seven LMACs. Both MCS blocks are similar in operation except for few register offsets and some configurations require writing to different registers. Those differences between IPs are handled using separate ops. This patch adds basic driver and does the initial hardware calibration and parser configuration. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: Vamsi Attunuru <vattunuru@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03eth: sp7021: fix use after free bug in spl2sw_nvmem_get_mac_addressZheng Wang
This frees "mac" and tries to display its address as part of the error message on the next line. Swap the order. Fixes: fd3040b9394c ("net: ethernet: Add driver for Sunplus SP7021") Signed-off-by: Zheng Wang <zyytlz.wz@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03Merge branch 'lan966x-police-mirroring'David S. Miller
Horatiu Vultur says: ==================== net: lan966x: Add police and mirror using tc-matchall Add tc-matchall classifier offload support both for ingress and egress. For this add support for the port police and port mirroring action support. Port police can happen only on ingress while port mirroring is supported both on ingress and egress ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: lan966x: Add port mirroring support using tc-matchallHoratiu Vultur
Add support for port mirroring. It is possible to mirror only one port at a time and it is possible to have both ingress and egress mirroring. Frames injected by the CPU don't get egress mirrored because they are bypassing the analyzer module. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: lan966x: Add port police support using tc-matchallHoratiu Vultur
Add support for port police. It is possible to police only on the ingress side. To be able to add police support also it was required to add tc-matchall classifier offload support. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: fec: using page pool to manage RX buffersShenwei Wang
This patch optimizes the RX buffer management by using the page pool. The purpose for this change is to prepare for the following XDP support. The current driver uses one frame per page for easy management. Added __maybe_unused attribute to the following functions to avoid the compiling warning. Those functions will be removed by a separate patch once this page pool solution is accepted. - fec_enet_new_rxbdp - fec_enet_copybreak The following are the comparing result between page pool implementation and the original implementation (non page pool). --- small packet (64 bytes) testing are almost the same --- no matter what the implementation is --- on both i.MX8 and i.MX6SX platforms. shenwei@5810:~/pktgen$ iperf -c 10.81.16.245 -w 2m -i 1 -l 64 ------------------------------------------------------------ Client connecting to 10.81.16.245, TCP port 5001 TCP window size: 416 KByte (WARNING: requested 1.91 MByte) ------------------------------------------------------------ [ 1] local 10.81.17.20 port 39728 connected with 10.81.16.245 port 5001 [ ID] Interval Transfer Bandwidth [ 1] 0.0000-1.0000 sec 37.0 MBytes 311 Mbits/sec [ 1] 1.0000-2.0000 sec 36.6 MBytes 307 Mbits/sec [ 1] 2.0000-3.0000 sec 37.2 MBytes 312 Mbits/sec [ 1] 3.0000-4.0000 sec 37.1 MBytes 312 Mbits/sec [ 1] 4.0000-5.0000 sec 37.2 MBytes 312 Mbits/sec [ 1] 5.0000-6.0000 sec 37.2 MBytes 312 Mbits/sec [ 1] 6.0000-7.0000 sec 37.2 MBytes 312 Mbits/sec [ 1] 7.0000-8.0000 sec 37.2 MBytes 312 Mbits/sec [ 1] 0.0000-8.0943 sec 299 MBytes 310 Mbits/sec --- Page Pool implementation on i.MX8 ---- shenwei@5810:~$ iperf -c 10.81.16.245 -w 2m -i 1 ------------------------------------------------------------ Client connecting to 10.81.16.245, TCP port 5001 TCP window size: 416 KByte (WARNING: requested 1.91 MByte) ------------------------------------------------------------ [ 1] local 10.81.17.20 port 43204 connected with 10.81.16.245 port 5001 [ ID] Interval Transfer Bandwidth [ 1] 0.0000-1.0000 sec 111 MBytes 933 Mbits/sec [ 1] 1.0000-2.0000 sec 111 MBytes 934 Mbits/sec [ 1] 2.0000-3.0000 sec 112 MBytes 935 Mbits/sec [ 1] 3.0000-4.0000 sec 111 MBytes 933 Mbits/sec [ 1] 4.0000-5.0000 sec 111 MBytes 934 Mbits/sec [ 1] 5.0000-6.0000 sec 111 MBytes 933 Mbits/sec [ 1] 6.0000-7.0000 sec 111 MBytes 931 Mbits/sec [ 1] 7.0000-8.0000 sec 112 MBytes 935 Mbits/sec [ 1] 8.0000-9.0000 sec 111 MBytes 933 Mbits/sec [ 1] 9.0000-10.0000 sec 112 MBytes 935 Mbits/sec [ 1] 0.0000-10.0077 sec 1.09 GBytes 933 Mbits/sec --- Non Page Pool implementation on i.MX8 ---- shenwei@5810:~$ iperf -c 10.81.16.245 -w 2m -i 1 ------------------------------------------------------------ Client connecting to 10.81.16.245, TCP port 5001 TCP window size: 416 KByte (WARNING: requested 1.91 MByte) ------------------------------------------------------------ [ 1] local 10.81.17.20 port 49154 connected with 10.81.16.245 port 5001 [ ID] Interval Transfer Bandwidth [ 1] 0.0000-1.0000 sec 104 MBytes 868 Mbits/sec [ 1] 1.0000-2.0000 sec 105 MBytes 878 Mbits/sec [ 1] 2.0000-3.0000 sec 105 MBytes 881 Mbits/sec [ 1] 3.0000-4.0000 sec 105 MBytes 879 Mbits/sec [ 1] 4.0000-5.0000 sec 105 MBytes 878 Mbits/sec [ 1] 5.0000-6.0000 sec 105 MBytes 878 Mbits/sec [ 1] 6.0000-7.0000 sec 104 MBytes 875 Mbits/sec [ 1] 7.0000-8.0000 sec 104 MBytes 875 Mbits/sec [ 1] 8.0000-9.0000 sec 104 MBytes 873 Mbits/sec [ 1] 9.0000-10.0000 sec 104 MBytes 875 Mbits/sec [ 1] 0.0000-10.0073 sec 1.02 GBytes 875 Mbits/sec --- Page Pool implementation on i.MX6SX ---- shenwei@5810:~/pktgen$ iperf -c 10.81.16.245 -w 2m -i 1 ------------------------------------------------------------ Client connecting to 10.81.16.245, TCP port 5001 TCP window size: 416 KByte (WARNING: requested 1.91 MByte) ------------------------------------------------------------ [ 1] local 10.81.17.20 port 57288 connected with 10.81.16.245 port 5001 [ ID] Interval Transfer Bandwidth [ 1] 0.0000-1.0000 sec 78.8 MBytes 661 Mbits/sec [ 1] 1.0000-2.0000 sec 82.5 MBytes 692 Mbits/sec [ 1] 2.0000-3.0000 sec 82.4 MBytes 691 Mbits/sec [ 1] 3.0000-4.0000 sec 82.4 MBytes 691 Mbits/sec [ 1] 4.0000-5.0000 sec 82.5 MBytes 692 Mbits/sec [ 1] 5.0000-6.0000 sec 82.4 MBytes 691 Mbits/sec [ 1] 6.0000-7.0000 sec 82.5 MBytes 692 Mbits/sec [ 1] 7.0000-8.0000 sec 82.4 MBytes 691 Mbits/sec [ 1] 8.0000-9.0000 sec 82.4 MBytes 691 Mbits/sec [ 1] 9.0000-9.5506 sec 45.0 MBytes 686 Mbits/sec [ 1] 0.0000-9.5506 sec 783 MBytes 688 Mbits/sec --- Non Page Pool implementation on i.MX6SX ---- shenwei@5810:~/pktgen$ iperf -c 10.81.16.245 -w 2m -i 1 ------------------------------------------------------------ Client connecting to 10.81.16.245, TCP port 5001 TCP window size: 416 KByte (WARNING: requested 1.91 MByte) ------------------------------------------------------------ [ 1] local 10.81.17.20 port 36486 connected with 10.81.16.245 port 5001 [ ID] Interval Transfer Bandwidth [ 1] 0.0000-1.0000 sec 70.5 MBytes 591 Mbits/sec [ 1] 1.0000-2.0000 sec 64.5 MBytes 541 Mbits/sec [ 1] 2.0000-3.0000 sec 73.6 MBytes 618 Mbits/sec [ 1] 3.0000-4.0000 sec 73.6 MBytes 618 Mbits/sec [ 1] 4.0000-5.0000 sec 72.9 MBytes 611 Mbits/sec [ 1] 5.0000-6.0000 sec 73.4 MBytes 616 Mbits/sec [ 1] 6.0000-7.0000 sec 73.5 MBytes 617 Mbits/sec [ 1] 7.0000-8.0000 sec 73.4 MBytes 616 Mbits/sec [ 1] 8.0000-9.0000 sec 73.4 MBytes 616 Mbits/sec [ 1] 9.0000-10.0000 sec 73.9 MBytes 620 Mbits/sec [ 1] 0.0000-10.0174 sec 723 MBytes 605 Mbits/sec Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: Remove DECnet leftovers from flow.h.Guillaume Nault
DECnet was removed by commit 1202cdd66531 ("Remove DECnet support from kernel"). Let's also revome its flow structure. Compile-tested only (allmodconfig). Signed-off-by: Guillaume Nault <gnault@redhat.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03bnx2x: fix potential memory leak in bnx2x_tpa_stop()Jianglei Nie
bnx2x_tpa_stop() allocates a memory chunk from new_data with bnx2x_frag_alloc(). The new_data should be freed when gets some error. But when "pad + len > fp->rx_buf_size" is true, bnx2x_tpa_stop() returns without releasing the new_data, which will lead to a memory leak. We should free the new_data with bnx2x_frag_free() when "pad + len > fp->rx_buf_size" is true. Fixes: 07b0f00964def8af9321cfd6c4a7e84f6362f728 ("bnx2x: fix possible panic under memory stress") Signed-off-by: Jianglei Nie <niejianglei2021@163.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03eth: lan743x: reject extts for non-pci11x1x devicesRaju Lakkaraju
Remove PTP_PF_EXTTS support for non-PCI11x1x devices since they do not support the PTP-IO Input event triggered timestamping mechanisms added Fixes: 60942c397af6 ("net: lan743x: Add support for PTP-IO Event Input External Timestamp (extts)") Signed-off-by: Raju Lakkaraju <Raju.Lakkaraju@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03gro: add support of (hw)gro packets to gro stackCoco Li
Current GRO stack only supports incoming packets containing one frame/MSS. This patch changes GRO to accept packets that are already GRO. HW-GRO (aka RSC for some vendors) is very often limited in presence of interleaved packets. Linux SW GRO stack can complete the job and provide larger GRO packets, thus reducing rate of ACK packets and cpu overhead. This also means BIG TCP can still be used, even if HW-GRO/RSC was able to cook ~64 KB GRO packets. v2: fix logic in tcp_gro_receive() Only support TCP for the moment (Paolo) Co-Developed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Coco Li <lixiaoyan@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: prestera: acl: Add check for kmemdupJiasheng Jiang
As the kemdup could return NULL, it should be better to check the return value and return error if fails. Moreover, the return value of prestera_acl_ruleset_keymask_set() should be checked by cascade. Fixes: 604ba230902d ("net: prestera: flower template support") Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn> Reviewed-by: Taras Chornyi<tchornyi@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03Merge branch 'mptcp-fastclose'David S. Miller
Mat Martineau says: ==================== mptcp: Fastclose edge cases and error handling MPTCP has existing code to use the MP_FASTCLOSE option header, which works like a RST for the MPTCP-level connection (regular RSTs only affect specific subflows in MPTCP). This series has some improvements for fastclose. Patch 1 aligns fastclose socket error handling with TCP RST behavior on TCP sockets. Patch 2 adds use of MP_FASTCLOSE in some more edge cases, like file descriptor close, FIN_WAIT timeout, and when the socket has unread data. Patch 3 updates the fastclose self tests. Patch 4 does not change any code, just fixes some outdated comments. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03mptcp: update misleading comments.Paolo Abeni
The MPTCP data path is quite complex and hard to understend even without some foggy comments referring to modified code and/or completely misleading from the beginning. Update a few of them to more accurately describing the current status. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03selftests: mptcp: update and extend fastclose test-casesPaolo Abeni
After the previous patches, the MPTCP protocol can generate fast-closes on both ends of the connection. Rework the relevant test-case to carefully trigger the fast-close code-path on a single end at the time, while ensuring than a predictable amount of data is spooled on both ends. Additionally add another test-cases for the passive socket fast-close. Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03mptcp: use fastclose on more edge scenariosPaolo Abeni
Daire reported a user-space application hang-up when the peer is forcibly closed before the data transfer completion. The relevant application expects the peer to either do an application-level clean shutdown or a transport-level connection reset. We can accommodate a such user by extending the fastclose usage: at fd close time, if the msk socket has some unread data, and at FIN_WAIT timeout. Note that at MPTCP close time we must ensure that the TCP subflows will reset: set the linger socket option to a suitable value. Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03mptcp: propagate fastclose errorPaolo Abeni
When an mptcp socket is closed due to an incoming FASTCLOSE option, so specific sk_err is set and later syscall will fail usually with EPIPE. Align the current fastclose error handling with TCP reset, properly setting the socket error according to the current msk state and propagating such error. Additionally sendmsg() is currently not handling properly the sk_err, always returning EPIPE. Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03Merge branch 'RollBall-Hilink-Turris-10G-copper-SFP-support'David S. Miller
Marek Behún says: ==================== RollBall / Hilink / Turris 10G copper SFP support I am resurrecting my attempt to add support for RollBall / Hilink / Turris 10G copper SFPs modules. The modules contain Marvell 88X3310 PHY, which can communicate with the system via sgmii, 2500base-x, 5gbase-r, 10gbase-r or usxgmii mode. Some of the patches I've taken from Russell King's net-queue [1] (with some rebasing). The important change from my previous attempts are: - I am including the changes needed to phylink and marvell10g driver, so that the 88X3310 PHY is configured to use PHY modes supported by the host (the PHY defaults to use 10gbase-r only on host's side) - I have changed the patch that informs phylib about the interfaces supported by the host (patch 5 of this series): it now fills in the phydev->host_interfaces member only when connecting a PHY that is inside a SFP module. This may change in the future. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: sfp: add support for multigig RollBall transceiversMarek Behún
This adds support for multigig copper SFP modules from RollBall/Hilink. These modules have a specific way to access clause 45 registers of the internal PHY. We also need to wait at least 22 seconds after deasserting TX disable before accessing the PHY. The code waits for 25 seconds just to be sure. Signed-off-by: Marek Behún <kabel@kernel.org> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: phy: mdio-i2c: support I2C MDIO protocol for RollBall SFP modulesMarek Behún
Some multigig SFPs from RollBall and Hilink do not expose functional MDIO access to the internal PHY of the SFP via I2C address 0x56 (although there seems to be read-only clause 22 access on this address). Instead these SFPs PHY can be accessed via I2C via the SFP Enhanced Digital Diagnostic Interface - I2C address 0x51. The SFP_PAGE has to be selected to 3 and the password must be filled with 0xff bytes for this PHY communication to work. This extends the mdio-i2c driver to support this protocol by adding a special parameter to mdio_i2c_alloc function via which this RollBall protocol can be selected. Signed-off-by: Marek Behún <kabel@kernel.org> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: sfp: create/destroy I2C mdiobus before PHY probe/after PHY releaseMarek Behún
Instead of configuring the I2C mdiobus when SFP driver is probed, create/destroy the mdiobus before the PHY is probed for/after it is released. This way we can tell the mdio-i2c code which protocol to use for each SFP transceiver. Move the code that determines MDIO I2C protocol from sfp_sm_probe_for_phy() to sfp_sm_mod_probe(), where most of the SFP ID parsing is done. Don't allocate I2C bus if no PHY is expected. Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: sfp: Add and use macros for SFP quirks definitionsMarek Behún
Add macros SFP_QUIRK(), SFP_QUIRK_M() and SFP_QUIRK_F() for defining SFP quirk table entries. Use them to deduplicate the code a little bit. Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: phylink: allow attaching phy for SFP modules on 802.3z modeMarek Behún
Some SFPs may contain an internal PHY which may in some cases want to connect with the host interface in 1000base-x/2500base-x mode. Do not fail if such PHY is being attached in one of these PHY interface modes. Signed-off-by: Marek Behún <kabel@kernel.org> Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Pali Rohár <pali@kernel.org> Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: phy: marvell10g: select host interface configurationRussell King
Select the host interface configuration according to the capabilities of the host if the host provided them. This is currently provided only when connecting PHY that is inside a SFP. The PHY supports several configurations of host communication: - always communicate with host in 10gbase-r, even if copper speed is lower (rate matching mode), - the same as above but use xaui/rxaui instead of 10gbase-r, - switch host SerDes mode between 10gbase-r, 5gbase-r, 2500base-x and sgmii according to copper speed, - the same as above but use xaui/rxaui instead of 10gbase-r. This mode of host communication, called MACTYPE, is by default selected by strapping pins, but it can be changed in software. This adds support for selecting this mode according to which modes are supported by the host. This allows the kernel to: - support SFP modules with 88X33X0 or 88E21X0 inside them Note: we use mv3310_select_mactype() for both 88X3310 and 88X3340, although 88X3340 does not support XAUI. This is not a problem because 88X3340 does not declare XAUI in it's supported_interfaces, and so this function will never choose that MACTYPE. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> [ rebase, updated, also added support for 88E21X0 ] Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: phy: marvell10g: Use tabs instead of spaces for indentationMarek Behún
Some register definitions were defined with spaces used for indentation. Change them to tabs. Signed-off-by: Marek Behún <kabel@kernel.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: phylink: pass supported host PHY interface modes to phylib for SFP's PHYsMarek Behún
Pass the supported PHY interface types to phylib if the PHY we are connecting is inside a SFP, so that the PHY driver can select an appropriate host configuration mode for their interface according to the host capabilities. For example the Marvell 88X3310 PHY inside RollBall SFP modules defaults to 10gbase-r mode on host's side, and the marvell10g driver currently does not change this setting. But a host may not support 10gbase-r. For example Turris Omnia only supports sgmii, 1000base-x and 2500base-x modes. The PHY can be configured to use those modes, but in order for the PHY driver to do that, it needs to know which modes are supported. Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: phylink: rename phylink_sfp_config()Russell King (Oracle)
phylink_sfp_config() now only deals with configuring the MAC for a SFP containing a PHY. Rename it to be specific. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: phylink: use phy_interface_t bitmaps for optical modulesRussell King
Where a MAC provides a phy_interface_t bitmap, use these bitmaps to select the operating interface mode for optical SFP modules, rather than using the linkmode bitmaps. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: sfp: augment SFP parsing with phy_interface_t bitmapRussell King
We currently parse the SFP EEPROM to a bitmap of ethtool link modes, and then attempt to convert the link modes to a PHY interface mode. While this works at present, there are cases where this is sub-optimal. For example, where a module can operate with several different PHY interface modes. To start addressing this, arrange for the SFP EEPROM parsing to also provide a bitmap of the possible PHY interface modes. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: phylink: add ability to validate a set of interface modesRussell King (Oracle)
Rather than having the ability to validate all supported interface modes or a single interface mode, introduce the ability to validate a subset of supported modes. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [ rebased on current net-next ] Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03x86/hyperv: Replace kmap() with kmap_local_page()Zhao Liu
kmap() is being deprecated in favor of kmap_local_page()[1]. There are two main problems with kmap(): (1) It comes with an overhead as mapping space is restricted and protected by a global lock for synchronization and (2) it also requires global TLB invalidation when the kmap's pool wraps and it might block when the mapping space is fully utilized until a slot becomes available. With kmap_local_page() the mappings are per thread, CPU local, can take page faults, and can be called from any context (including interrupts). It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore, the tasks can be preempted and, when they are scheduled to run again, the kernel virtual addresses are restored and are still valid. In the fuction hyperv_init() of hyperv/hv_init.c, the mapping is used in a single thread and is short live. So, in this case, it's safe to simply use kmap_local_page() to create mapping, and this avoids the wasted cost of kmap() for global synchronization. In addtion, the fuction hyperv_init() checks if kmap() fails by BUG_ON(). From the original discussion[2], the BUG_ON() here is just used to explicitly panic NULL pointer. So still keep the BUG_ON() in place to check if kmap_local_page() fails. Based on this consideration, memcpy_to_page() is not selected here but only kmap_local_page() is used. Therefore, replace kmap() with kmap_local_page() in hyperv/hv_init.c. [1]: https://lore.kernel.org/all/20220813220034.806698-1-ira.weiny@intel.com [2]: https://lore.kernel.org/lkml/20200915103710.cqmdvzh5lys4wsqo@liuwe-devbox-debian-v2/ Suggested-by: Dave Hansen <dave.hansen@intel.com> Suggested-by: Ira Weiny <ira.weiny@intel.com> Suggested-by: Fabio M. De Francesco <fmdefrancesco@gmail.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/r/20220928095640.626350-1-zhao1.liu@linux.intel.com Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-10-03platform/x86: use PLATFORM_DEVID_NONE instead of -1Barnabás Pőcze
Use the `PLATFORM_DEVID_NONE` constant instead of hard-coding -1 when creating a platform device. No functional changes are intended. Signed-off-by: Barnabás Pőcze <pobrn@protonmail.com> Link: https://lore.kernel.org/r/20220930104857.2796923-1-pobrn@protonmail.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-10-03platform/x86/amd: pmc: Dump idle mask during "check" stage insteadMario Limonciello
The idle mask is dumped during the "prepare" and "restore" stage right now, which helps to demonstrate issues only related to the first s2idle entry. If the system has entered s2idle once, but was woken up never breaking the s2idle loop but also never went back to sleep we might still have another issue to deal with however. Move the dynamic debugging message here so that we'll catch it on each iteration. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216516 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20220929215042.745-1-mario.limonciello@amd.com Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-10-03net: sparx5: Fix return type of sparx5_port_xmit_implNathan Huckleberry
The ndo_start_xmit field in net_device_ops is expected to be of type netdev_tx_t (*ndo_start_xmit)(struct sk_buff *skb, struct net_device *dev). The mismatched return type breaks forward edge kCFI since the underlying function definition does not match the function hook definition. The return type of sparx5_port_xmit_impl should be changed from int to netdev_tx_t. Reported-by: Dan Carpenter <error27@gmail.com> Link: https://github.com/ClangBuiltLinux/linux/issues/1703 Cc: llvm@lists.linux.dev Signed-off-by: Nathan Huckleberry <nhuck@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03af_unix: Fix memory leaks of the whole sk due to OOB skb.Kuniyuki Iwashima
syzbot reported a sequence of memory leaks, and one of them indicated we failed to free a whole sk: unreferenced object 0xffff8880126e0000 (size 1088): comm "syz-executor419", pid 326, jiffies 4294773607 (age 12.609s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 7d 00 00 00 00 00 00 00 ........}....... 01 00 07 40 00 00 00 00 00 00 00 00 00 00 00 00 ...@............ backtrace: [<000000006fefe750>] sk_prot_alloc+0x64/0x2a0 net/core/sock.c:1970 [<0000000074006db5>] sk_alloc+0x3b/0x800 net/core/sock.c:2029 [<00000000728cd434>] unix_create1+0xaf/0x920 net/unix/af_unix.c:928 [<00000000a279a139>] unix_create+0x113/0x1d0 net/unix/af_unix.c:997 [<0000000068259812>] __sock_create+0x2ab/0x550 net/socket.c:1516 [<00000000da1521e1>] sock_create net/socket.c:1566 [inline] [<00000000da1521e1>] __sys_socketpair+0x1a8/0x550 net/socket.c:1698 [<000000007ab259e1>] __do_sys_socketpair net/socket.c:1751 [inline] [<000000007ab259e1>] __se_sys_socketpair net/socket.c:1748 [inline] [<000000007ab259e1>] __x64_sys_socketpair+0x97/0x100 net/socket.c:1748 [<000000007dedddc1>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<000000007dedddc1>] do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80 [<000000009456679f>] entry_SYSCALL_64_after_hwframe+0x63/0xcd We can reproduce this issue by creating two AF_UNIX SOCK_STREAM sockets, send()ing an OOB skb to each other, and close()ing them without consuming the OOB skbs. int skpair[2]; socketpair(AF_UNIX, SOCK_STREAM, 0, skpair); send(skpair[0], "x", 1, MSG_OOB); send(skpair[1], "x", 1, MSG_OOB); close(skpair[0]); close(skpair[1]); Currently, we free an OOB skb in unix_sock_destructor() which is called via __sk_free(), but it's too late because the receiver's unix_sk(sk)->oob_skb is accounted against the sender's sk->sk_wmem_alloc and __sk_free() is called only when sk->sk_wmem_alloc is 0. In the repro sequences, we do not consume the OOB skb, so both two sk's sock_put() never reach __sk_free() due to the positive sk->sk_wmem_alloc. Then, no one can consume the OOB skb nor call __sk_free(), and we finally leak the two whole sk. Thus, we must free the unconsumed OOB skb earlier when close()ing the socket. Fixes: 314001f0bf92 ("af_unix: Add OOB support") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03Merge branch 'ip_tunnel-netlink-parms'David S. Miller
Liu Jian says: ==================== Add helper functions to parse netlink msg of ip_tunnel v1->v2: Move the implementation of the helper function to ip_tunnel_core.c v2->v3: Change EXPORT_SYMBOL to EXPORT_SYMBOL_GPL ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: Add helper function to parse netlink msg of ip_tunnel_parmLiu Jian
Add ip_tunnel_netlink_parms to parse netlink msg of ip_tunnel_parm. Reduces duplicate code, no actual functional changes. Signed-off-by: Liu Jian <liujian56@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: Add helper function to parse netlink msg of ip_tunnel_encapLiu Jian
Add ip_tunnel_netlink_encap_parms to parse netlink msg of ip_tunnel_encap. Reduces duplicate code, no actual functional changes. Signed-off-by: Liu Jian <liujian56@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03net: rds: don't hold sock lock when cancelling work from ↵Tetsuo Handa
rds_tcp_reset_callbacks() syzbot is reporting lockdep warning at rds_tcp_reset_callbacks() [1], for commit ac3615e7f3cffe2a ("RDS: TCP: Reduce code duplication in rds_tcp_reset_callbacks()") added cancel_delayed_work_sync() into a section protected by lock_sock() without realizing that rds_send_xmit() might call lock_sock(). We don't need to protect cancel_delayed_work_sync() using lock_sock(), for even if rds_{send,recv}_worker() re-queued this work while __flush_work() from cancel_delayed_work_sync() was waiting for this work to complete, retried rds_{send,recv}_worker() is no-op due to the absence of RDS_CONN_UP bit. Link: https://syzkaller.appspot.com/bug?extid=78c55c7bc6f66e53dce2 [1] Reported-by: syzbot <syzbot+78c55c7bc6f66e53dce2@syzkaller.appspotmail.com> Co-developed-by: Hillf Danton <hdanton@sina.com> Signed-off-by: Hillf Danton <hdanton@sina.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Tested-by: syzbot <syzbot+78c55c7bc6f66e53dce2@syzkaller.appspotmail.com> Fixes: ac3615e7f3cffe2a ("RDS: TCP: Reduce code duplication in rds_tcp_reset_callbacks()") Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== 1) Refactor selftests to use an array of structs in xfrm_fill_key(). From Gautam Menghani. 2) Drop an unused argument from xfrm_policy_match. From Hongbin Wang. 3) Support collect metadata mode for xfrm interfaces. From Eyal Birger. 4) Add netlink extack support to xfrm. From Sabrina Dubroca. Please note, there is a merge conflict in: include/net/dst_metadata.h between commit: 0a28bfd4971f ("net/macsec: Add MACsec skb_metadata_dst Tx Data path support") from the net-next tree and commit: 5182a5d48c3d ("net: allow storing xfrm interface metadata in metadata_dst") from the ipsec-next tree. Can be solved as done in linux-next. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-03Merge branch 'for-next' into for-linusTakashi Iwai
2022-10-02hwmon: (corsair-psu) add USB id of new revision of the HX1000i psuWilken Gottwalt
Also updates the documentation accordingly. Signed-off-by: Wilken Gottwalt <wilken.gottwalt@posteo.net> Link: https://lore.kernel.org/r/YznOUQ7Pijedu0NW@monster.localdomain Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2022-10-02Linux 6.0v6.0Linus Torvalds
2022-10-03ia64: simplify esi object addition in MakefileMasahiro Yamada
CONFIG_IA64_ESI is a bool option. I do not know why the Makefile was written like this, but this should not have any functional change. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
2022-10-03Revert "kbuild: Check if linker supports the -X option"Masahiro Yamada
This reverts commit d79a27195a33f4b5e591de5536799ad874ea6cf5. According to the commit description, this ld-option test was added for the gold linker at that time. Commit 75959d44f9dc ("kbuild: Fail if gold linker is detected") gave up the gold linker support after all. I tested the BFD linker from binutils 2.23 and LLD from LLVM 11.0.0. Both of them support the -X option. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-10-03kbuild: rebuild .vmlinux.export.o when its prerequisite is updatedMasahiro Yamada
When include/linux/export-internal.h is updated, .vmlinux.export.o must be rebuilt, but it does not happen because its rule is hidden behind scripts/link-vmlinux.sh. Move it out of the shell script, so that Make can see the dependency between vmlinux and .vmlinux.export.o. Move the vmlinux rule to scripts/Makefile.vmlinux. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-10-03kbuild: move modules.builtin(.modinfo) rules to Makefile.vmlinux_oMasahiro Yamada
Do not build modules.builtin(.modinfo) as a side-effect of vmlinux. There are no good reason to rebuild them just because any of vmlinux's prerequistes (vmlinux.lds, .vmlinux.export.c, etc.) has been updated. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-10-03zstd: Fixing mixed module-builtin objectsAlexey Kardashevskiy
With CONFIG_ZSTD_COMPRESS=m and CONFIG_ZSTD_DECOMPRESS=y we end up in a situation when files from lib/zstd/common/ are compiled once to be linked later for ZSTD_DECOMPRESS (build-in) and ZSTD_COMPRESS (module) even though CFLAGS are different for builtins and modules. So far somehow this was not a problem but enabling LLVM LTO exposes the problem as: ld.lld: error: linking module flags 'Code Model': IDs have conflicting values in 'lib/built-in.a(zstd_common.o at 5868)' and 'ld-temp.o' This particular conflict is caused by KBUILD_CFLAGS=-mcmodel=medium vs. KBUILD_CFLAGS_MODULE=-mcmodel=large , modules use the large model on POWERPC as explained at https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/powerpc/Makefile?h=v5.18-rc4#n127 but the current use of common files is wrong anyway. This works around the issue by introducing a zstd_common module with shared code. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-10-03kallsyms: ignore __kstrtab_* and __kstrtabns_* symbolsMasahiro Yamada
Every EXPORT_SYMBOL creates __kstrtab_* and __kstrtabns_*, which consumes 15-20% of the kallsyms entries. For example, on the system built from the x86_64 defconfig, $ cat /proc/kallsyms | wc 129527 388581 5685465 $ cat /proc/kallsyms | grep __kstrtab | wc 23489 70467 1187932 We already ignore __crc_* symbols populated by EXPORT_SYMBOL, so it should be fine to ignore __kstrtab_* and __kstrtabns_* as well. This makes vmlinux a bit smaller. $ size vmlinux.before vmlinux.after text data bss dec hex filename 22785374 8559694 1413328 32758396 1f3da7c vmlinux.before 22785374 8137806 1413328 32336508 1ed6a7c vmlinux.after Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>