summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2022-02-03ptp: unregister virtual clocks when unregistering physical clock.Miroslav Lichvar
When unregistering a physical clock which has some virtual clocks, unregister the virtual clocks with it. This fixes the following oops, which can be triggered by unloading a driver providing a PTP clock when it has enabled virtual clocks: BUG: unable to handle page fault for address: ffffffffc04fc4d8 Oops: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:ptp_vclock_read+0x31/0xb0 Call Trace: timecounter_read+0xf/0x50 ptp_vclock_refresh+0x2c/0x50 ? ptp_clock_release+0x40/0x40 ptp_aux_kworker+0x17/0x30 kthread_worker_fn+0x9b/0x240 ? kthread_should_park+0x30/0x30 kthread+0xe2/0x110 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x22/0x30 Fixes: 73f37068d540 ("ptp: support ptp physical/virtual clocks conversion") Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Cc: Yangbo Lu <yangbo.lu@nxp.com> Cc: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03net: dsa: xrs700x: convert to phylink_generic_validate()Russell King (Oracle)
Populate the supported interfaces and MAC capabilities for the xrs700x family of DSA switches and remove the old validate implementation to allow DSA to use phylink_generic_validate() for this switch driver. According to commit ee00b24f32eb ("net: dsa: add Arrow SpeedChips XRS700x driver") the switch supports one RMII port and up to three RGMII ports. This commit assumes that port 0 is the RMII port and the remainder are RGMII. This commit also results in the Autoneg bit being set in the ethtool link modes, which wasn't in the original; if this switch supports RGMII to a 10/100/1G PHY, then surely we want to allow Autoneg on the PHY. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03net: dsa: qca8k: convert to phylink_generic_validate()Russell King (Oracle)
Populate the supported interfaces and MAC capabilities for the QCA8K DSA switch and remove the old validate implementation to allow DSA to use phylink_generic_validate() for this switch driver. In making this change, we bring consistency to the ethtool linkmodes that phylink's validate step produces, thereby following the expected behaviour as the phylink documentation has explained. Specifically, the ethtool 1000baseX_Full capability is now permitted for all interface modes, as it is a property of the PHY driver whether 1000baseX fiber connections can be supported. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03net: dsa: ksz8795: convert to phylink_generic_validate()Russell King (Oracle)
Populate the supported interfaces and MAC capabilities for the Microchip KSZ8795 DSA switch and remove the old validate implementation to allow DSA to use phylink_generic_validate() for this switch driver. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03net: dsa: bcm_sf2: convert to phylink_generic_validate()Russell King (Oracle)
Populate the supported interfaces and MAC capabilities for the bcm_sf2 DSA switch and remove the old validate implementation to allow DSA to use phylink_generic_validate() for this switch driver. The exclusion of Gigabit linkmodes for MII and Reverse MII links is handled within phylink_generic_validate() in phylink, so there is no need to make them conditional on the interface mode in the driver. Thanks to Florian Fainelli for suggesting how to populate the supported interfaces. Link: https://lore.kernel.org/r/3b3fed98-0c82-99e9-dc72-09fe01c2bcf3@gmail.com Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03net: dsa: ar9331: convert to phylink_generic_validate()Russell King (Oracle)
Populate the supported interfaces and MAC capabilities for the AR9331 DSA switch and remove the old validate implementation to allow DSA to use phylink_generic_validate() for this switch driver. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02net: ipa: set IPA v4.11 AP<-modem RX buffer size to 32KBAlex Elder
Increase the receive buffer size used for data received from the modem to 32KB, to improve download performance by allowing much greater aggregation. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-02net: ipa: define per-endpoint receive buffer sizeAlex Elder
Allow RX endpoints to have differing receive buffer sizes. Define the receive buffer size in the configuration data, and use that rather than IPA_RX_BUFFER_SIZE when configuring the endpoint. Add verification in ipa_endpoint_data_valid_one() that the receive buffer specified for AP RX endpoints is both big enough to handle at least one full packet, and not so big in an aggregating endpoint that its size can't be represented when programming the hardware. Move aggr_byte_limit_max() up in "ipa_endpoint.c" so it can be used earlier in the file without a forward-reference. Initially we'll just keep the 8KB receive buffer size already in use for all AP RX endpoints.. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-02drivers: net: Replace acpi_bus_get_device()Rafael J. Wysocki
Replace acpi_bus_get_device() that is going to be dropped with acpi_fetch_acpi_dev(). No intentional functional impact. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/11918902.O9o76ZdvQC@kreacher Link: https://lore.kernel.org/r/11920660.O9o76ZdvQC@kreacher Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-02net: dsa: qca8k: introduce qca8k_bulk_read/write functionAnsuel Smith
Introduce qca8k_bulk_read/write() function to use mgmt Ethernet way to read/write packet in bulk. Make use of this new function in the fdb function and while at it reduce the reg for fdb_read from 4 to 3 as the max bit for the ARL(fdb) table is 83 bits. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02net: dsa: qca8k: add support for larger read/write size with mgmt EthernetAnsuel Smith
mgmt Ethernet packet can read/write up to 16byte at times. The len reg is limited to 15 (0xf). The switch actually sends and accepts data in 4 different steps of len values. Len steps: - 0: nothing - 1-4: first 4 byte - 5-6: first 12 byte - 7-15: all 16 byte In the alloc skb function we check if the len is 16 and we fix it to a len of 15. It the read/write function interest to extract the real asked data. The tagger handler will always copy the fully 16byte with a READ command. This is useful for some big regs like the fdb reg that are more than 4byte of data. This permits to introduce a bulk function that will send and request the entire entry in one go. Write function is changed and it does now require to pass the pointer to val to also handle array val. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02net: dsa: qca8k: cache lo and hi for mdio writeAnsuel Smith
From Documentation, we can cache lo and hi the same way we do with the page. This massively reduce the mdio write as 3/4 of the time as we only require to write the lo or hi part for a mdio write. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02net: dsa: qca8k: move page cache to driver privAnsuel Smith
There can be multiple qca8k switch on the same system. Move the static qca8k_current_page to qca8k_priv and make it specific for each switch. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02net: dsa: qca8k: add support for phy read/write with mgmt EthernetAnsuel Smith
Use mgmt Ethernet also for phy read/write if availabale. Use a different seq number to make sure we receive the correct packet. On any error, we fallback to the legacy mdio read/write. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02net: dsa: qca8k: add support for mib autocast in Ethernet packetAnsuel Smith
The switch can autocast MIB counter using Ethernet packet. Add support for this and provide a handler for the tagger. The switch will send packet with MIB counter for each port, the switch will use completion API to wait for the correct packet to be received and will complete the task only when each packet is received. Although the handler will drop all the other packet, we still have to consume each MIB packet to complete the request. This is done to prevent mixed data with concurrent ethtool request. connect_tag_protocol() is used to add the handler to the tag_qca tagger, master_state_change() use the MIB lock to make sure no MIB Ethernet is in progress. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02net: dsa: qca8k: add support for mgmt read/write in Ethernet packetAnsuel Smith
Add qca8k side support for mgmt read/write in Ethernet packet. qca8k supports some specially crafted Ethernet packet that can be used for mgmt read/write instead of the legacy method uart/internal mdio. This add support for the qca8k side to craft the packet and enqueue it. Each port and the qca8k_priv have a special struct to put data in it. The completion API is used to wait for the packet to be received back with the requested data. The various steps are: 1. Craft the special packet with the qca hdr set to mgmt read/write mode. 2. Set the lock in the dedicated mgmt struct. 3. Increment the seq number and set it in the mgmt pkt 4. Reinit the completion. 5. Enqueue the packet. 6. Wait the packet to be received. 7. Use the data set by the tagger to complete the mdio operation. If the completion timeouts or the ack value is not true, the legacy mdio way is used. It has to be considered that in the initial setup mdio is still used and mdio is still used until DSA is ready to accept and tag packet. tag_proto_connect() is used to fill the required handler for the tagger to correctly parse and elaborate the special Ethernet mdio packet. Locking is added to qca8k_master_change() to make sure no mgmt Ethernet are in progress. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02net: dsa: qca8k: add tracking state of master portAnsuel Smith
MDIO/MIB Ethernet require the master port and the tagger availabale to correctly work. Use the new api master_state_change to track when master is operational or not and set a bool in qca8k_priv. We cache the first cached master available and we check if other cpu port are operational when the cached one goes down. This cached master will later be used by mdio read/write and mib request to correctly use the working function. qca8k implementation for MDIO/MIB Ethernet is bad. CPU port0 is the only one that answers with the ack packet or sends MIB Ethernet packets. For this reason the master_state_change ignore CPU port6 and only checks CPU port0 if it's operational and enables this mode. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-01sfc: The size of the RX recycle ring should be more flexibleMartin Habets
Ideally the size would depend on the link speed, but the recycle ring is created when the interface is brought up before the driver knows the link speed. So size it for the maximum speed of a given NIC. PowerPC is only supported on SFN7xxx and SFN8xxx NICs. With this patch on a 40G NIC the number of calls to alloc_pages and friends went down from about 18% to under 2%. On a 10G NIC the number of calls to alloc_pages and friends went down from about 15% to 0 (perf did not capture any calls during the 60 second test). On a 100G NIC the number of calls to alloc_pages and friends went down from about 23% to 4%. Reported-by: Íñigo Huguet <ihuguet@redhat.com> Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com> Link: https://lore.kernel.org/r/20220131111054.cp4f6foyinaarwbn@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-01r8169: support L1.2 control on RTL8168hHeiner Kallweit
According to Realtek RTL8168h supports the same L1.2 control as RTL8125. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/4784d5ce-38ac-046a-cbfa-5fdd9773f820@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-01net: lan966x: Implement get_ts_infoHoratiu Vultur
Implement the function get_ts_info in ethtool_ops which is needed to get the HW capabilities for timestamping. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-01net: lan966x: Add support for ptp interruptsHoratiu Vultur
When doing 2-step timestamping the HW will generate an interrupt when it managed to timestamp a frame. It is the SW responsibility to read it from the FIFO. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-01net: lan966x: Update extraction/injection for timestampingHoratiu Vultur
Update both the extraction and injection to do timestamping of the frames. The extraction is always doing the timestamping while for injection is doing the timestamping only if it is configured. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-01net: lan966x: Implement SIOCSHWTSTAMP and SIOCGHWTSTAMPHoratiu Vultur
Implement the ioctl callbacks SIOCSHWTSTAMP and SIOCGHWTSTAMP to allow to configure the ports to enable/disable timestamping for TX. The RX timestamping is always enabled. The HW is capable to run both 1-step timestamping and 2-step timestamping. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-01net: lan966x: Add support for ptp clocksHoratiu Vultur
The lan966x has 3 PHC. Enable each of them, for now all the timestamping is happening on the first PHC. Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-01net: lan966x: Add registers that are use for ptp functionalityHoratiu Vultur
Add the registers that will be used to configure the PHC in the HW. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-01Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/nextDavid S. Miller
-queue Tony Nguyen says: ==================== 10GbE Intel Wired LAN Driver Updates 2022-01-31 Alexander Lobakin says: This is an interpolation of [0] to other Intel Ethernet drivers (and is (re)based on its code). The main aim is to keep XDP metadata not only in case with build_skb(), but also when we do napi_alloc_skb() + memcpy(). All Intel drivers suffers from the same here: - metadata gets lost on XDP_PASS in legacy-rx; - excessive headroom allocation on XSK Rx to skbs; - metadata gets lost on XSK Rx to skbs. Those get especially actual in XDP Hints upcoming. I couldn't have addressed the first one for all Intel drivers due to that they don't reserve any headroom for now in legacy-rx mode even with XDP enabled. This is hugely wrong, but requires quite a bunch of work and a separate series. Luckily, ice doesn't suffer from that. igc has 1 and 3 already fixed in [0]. [0] https://lore.kernel.org/netdev/163700856423.565980.10162564921347693758.stgit@firesoul ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-31sh_eth: kill useless initializers in sh_eth_{suspend|resume}()Sergey Shtylyov
sh_eth_{suspend|resume}() initialize their local variable 'ret' to 0 but this value is never really used, thus we can kill those intializers... Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/f09d7c64-4a2b-6973-09a4-10d759ed0df4@omp.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-31net: ena: Do not waste napi skb cacheHyeonggon Yoo
By profiling, discovered that ena device driver allocates skb by build_skb() and frees by napi_skb_cache_put(). Because the driver does not use napi skb cache in allocation path, napi skb cache is periodically filled and flushed. This is waste of napi skb cache. As ena_alloc_skb() is called only in napi, Use napi_build_skb() and napi_alloc_skb() when allocating skb. This patch was tested on aws a1.metal instance. [ jwiedmann.dev@gmail.com: Use napi_alloc_skb() instead of netdev_alloc_skb_ip_align() to keep things consistent. ] Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Acked-by: Shay Agroskin <shayagr@amazon.com> Link: https://lore.kernel.org/r/YfUAkA9BhyOJRT4B@ip-172-31-19-208.ap-northeast-1.compute.internal Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-31qed: use msleep() in qed_mcp_cmd() and add qed_mcp_cmd_nosleep() for udelay.Venkata Sudheer Kumar Bhavaraju
Change qed_mcp_cmd() to use msleep() (by setting QED_MB_FLAG_CAN_SLEEP flag) and add new nosleep() version of the api. These api are used to issue cmds to management fw and the change affects how driver behaves while waiting for a response/resource. All sleepable callers of the existing api now use msleep() version. For non-sleepable callers, the new nosleep() version is explicitly used. Signed-off-by: Venkata Sudheer Kumar Bhavaraju <vbhavaraju@marvell.com> Signed-off-by: Alok Prasad <palok@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Link: https://lore.kernel.org/r/20220131005235.1647881-1-vbhavaraju@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-31ixgbe: respect metadata on XSK Rx to skbAlexander Lobakin
For now, if the XDP prog returns XDP_PASS on XSK, the metadata will be lost as it doesn't get copied to the skb. Copy it along with the frame headers. Account its size on skb allocation, and when copying just treat it as a part of the frame and do a pull after to "move" it to the "reserved" zone. net_prefetch() xdp->data_meta and align the copy size to speed-up memcpy() a little and better match ixgbe_construct_skb(). Fixes: d0bcacd0a130 ("ixgbe: add AF_XDP zero-copy Rx support") Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com> Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31ixgbe: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skbAlexander Lobakin
{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD + NET_IP_ALIGN for any skb. OTOH, ixgbe_construct_skb_zc() currently allocates and reserves additional `xdp->data - xdp->data_hard_start`, which is XDP_PACKET_HEADROOM for XSK frames. There's no need for that at all as the frame is post-XDP and will go only to the networking stack core. Pass the size of the actual data only to __napi_alloc_skb() and don't reserve anything. This will give enough headroom for stack processing. Fixes: d0bcacd0a130 ("ixgbe: add AF_XDP zero-copy Rx support") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31ixgbe: pass bi->xdp to ixgbe_construct_skb_zc() directlyAlexander Lobakin
To not dereference bi->xdp each time in ixgbe_construct_skb_zc(), pass bi->xdp as an argument instead of bi. We can also call xsk_buff_free() outside of the function as well as assign bi->xdp to NULL, which seems to make it closer to its name. Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31igc: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skbAlexander Lobakin
{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD + NET_IP_ALIGN for any skb. OTOH, igc_construct_skb_zc() currently allocates and reserves additional `xdp->data_meta - xdp->data_hard_start`, which is about XDP_PACKET_HEADROOM for XSK frames. There's no need for that at all as the frame is post-XDP and will go only to the networking stack core. Pass the size of the actual data only (+ meta) to __napi_alloc_skb() and don't reserve anything. This will give enough headroom for stack processing. Also, net_prefetch() xdp->data_meta and align the copy size to speed-up memcpy() a little and better match igc_construct_skb(). Fixes: fc9df2a0b520 ("igc: Enable RX via AF_XDP zero-copy") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31ice: respect metadata on XSK Rx to skbAlexander Lobakin
For now, if the XDP prog returns XDP_PASS on XSK, the metadata will be lost as it doesn't get copied to the skb. Copy it along with the frame headers. Account its size on skb allocation, and when copying just treat it as a part of the frame and do a pull after to "move" it to the "reserved" zone. net_prefetch() xdp->data_meta and align the copy size to speed-up memcpy() a little and better match ice_construct_skb(). Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com> Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31ice: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skbAlexander Lobakin
{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD + NET_IP_ALIGN for any skb. OTOH, ice_construct_skb_zc() currently allocates and reserves additional `xdp->data - xdp->data_hard_start`, which is XDP_PACKET_HEADROOM for XSK frames. There's no need for that at all as the frame is post-XDP and will go only to the networking stack core. Pass the size of the actual data only to __napi_alloc_skb() and don't reserve anything. This will give enough headroom for stack processing. Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31ice: respect metadata in legacy-rx/ice_construct_skb()Alexander Lobakin
In "legacy-rx" mode represented by ice_construct_skb(), we can still use XDP (and XDP metadata), but after XDP_PASS the metadata will be lost as it doesn't get copied to the skb. Copy it along with the frame headers. Account its size on skb allocation, and when copying just treat it as a part of the frame and do a pull after to "move" it to the "reserved" zone. Point net_prefetch() to xdp->data_meta instead of data. This won't change anything when the meta is not here, but will save some cache misses otherwise. Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com> Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31i40e: respect metadata on XSK Rx to skbAlexander Lobakin
For now, if the XDP prog returns XDP_PASS on XSK, the metadata will be lost as it doesn't get copied to the skb. Copy it along with the frame headers. Account its size on skb allocation, and when copying just treat it as a part of the frame and do a pull after to "move" it to the "reserved" zone. net_prefetch() xdp->data_meta and align the copy size to speed-up memcpy() a little and better match i40e_construct_skb(). Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support") Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com> Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skbAlexander Lobakin
{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD + NET_IP_ALIGN for any skb. OTOH, i40e_construct_skb_zc() currently allocates and reserves additional `xdp->data - xdp->data_hard_start`, which is XDP_PACKET_HEADROOM for XSK frames. There's no need for that at all as the frame is post-XDP and will go only to the networking stack core. Pass the size of the actual data only to __napi_alloc_skb() and don't reserve anything. This will give enough headroom for stack processing. Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31net: mana: Reuse XDP dropped pageHaiyang Zhang
Reuse the dropped page in RX path to save page allocation overhead. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-31net: mana: Add counter for XDP_TXHaiyang Zhang
This counter will show up in ethtool stat. It is the number of packets received and forwarded by XDP program. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-31net: mana: Add counter for packet dropped by XDPHaiyang Zhang
This counter will show up in ethtool stat data. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-31sh_eth: sh_eth_close() always returns 0Sergey Shtylyov
sh_eth_close() always returns 0, hence the check in sh_eth_wol_restore() is pointless (however we cannot change the prototype of sh_eth_close() as it implements the driver's ndo_stop() method). Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-31ravb: ravb_close() always returns 0Sergey Shtylyov
ravb_close() always returns 0, hence the check in ravb_wol_restore() is pointless (however, we cannot change the prototype of ravb_close() as it implements the driver's ndo_stop() method). Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-31net/fsl: xgmac_mdio: fix return value check in xgmac_mdio_probe()Wei Yongjun
In case of error, the function devm_ioremap() returns NULL pointer not ERR_PTR(). The IS_ERR() test in the return value check should be replaced with NULL test. Fixes: 1d14eb15dc2c ("net/fsl: xgmac_mdio: Use managed device resources") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Reviewed-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-31r8169: add rtl_disable_exit_l1()Heiner Kallweit
Add rtl_disable_exit_l1() for ensuring that the chip doesn't inadvertently exit ASPM L1 when being in a low-power mode. The new function is called from rtl_prepare_power_down() which has to be moved in the code to avoid a forward declaration. According to Realtek OCP register 0xc0ac shadows ERI register 0xd4 on RTL8168 versions from RTL8168g. This allows to simplify the code a little. v2: - call rtl_disable_exit_l1() also if DASH or WoL are enabled Suggested-by: Chun-Hao Lin <hau@realtek.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-31phy: make phy_set_max_speed() *void*Sergey Shtylyov
After following the call tree of phy_set_max_speed(), it became clear that this function never returns anything but 0, so we can change its result type to *void* and drop the result checks from the three drivers that actually bothered to do it... Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-31net: dsa: mv88e6xxx: Improve indirect addressing performanceTobias Waldekranz
Before this change, both the read and write callback would start out by asserting that the chip's busy flag was cleared. However, both callbacks also made sure to wait for the clearing of the busy bit before returning - making the initial check superfluous. The only time that would ever have an effect was if the busy bit was initially set for some reason. With that in mind, make sure to perform an initial check of the busy bit, after which both read and write can rely the previous operation to have waited for the bit to clear. This cuts the number of operations on the underlying MDIO bus by 25% Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-31net: dsa: mv88e6xxx: Improve performance of busy bit pollingTobias Waldekranz
Avoid a long delay when a busy bit is still set and has to be polled again. Measurements on a system with 2 Opals (6097F) and one Agate (6352) show that even with this much tighter loop, we have about a 50% chance of the bit being cleared on the first poll, all other accesses see the bit being cleared on the second poll. On a standard MDIO bus running MDC at 2.5MHz, a single access with 32 bits of preamble plus 32 bits of data takes 64*(1/2.5MHz) = 25.6us. This means that mv88e6xxx_smi_direct_wait took 26us + CPU overhead in the fast scenario, but 26us + 1500us + 26us + CPU overhead in the slow case - bringing the average close to 1ms. With this change in place, the slow case is closer to 2*26us + CPU overhead, with the average well below 100us - a 10x improvement. This translates to real-world winnings. On a 3-chip 20-port system, the modprobe time drops by 88%: Before: root@coronet:~# time modprobe mv88e6xxx real 0m 15.99s user 0m 0.00s sys 0m 1.52s After: root@coronet:~# time modprobe mv88e6xxx real 0m 2.21s user 0m 0.00s sys 0m 1.54s Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-31net: bonding: Add support for IPV6 ns/na to balance-alb/balance-tlb modeSun Shouxin
Since ipv6 neighbor solicitation and advertisement messages isn't handled gracefully in bond6 driver, we can see packet drop due to inconsistency between mac address in the option message and source MAC . Another examples is ipv6 neighbor solicitation and advertisement messages from VM via tap attached to host bridge, the src mac might be changed through balance-alb mode, but it is not synced with Link-layer address in the option message. The patch implements bond6's tx handle for ipv6 neighbor solicitation and advertisement messages. Suggested-by: Hu Yadi <huyd12@chinatelecom.cn> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Signed-off-by: Sun Shouxin <sunshouxin@chinatelecom.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-29net: macb: Added ZynqMP-specific initializationRobert Hancock
The GEM controllers on ZynqMP were missing some initialization steps which are required in some cases when using SGMII mode, which uses the PS-GTR transceivers managed by the phy-zynqmp driver. The GEM core appears to need a hardware-level reset in order to work properly in SGMII mode in cases where the GT reference clock was not present at initial power-on. This can be done using a reset mapped to the zynqmp-reset driver in the device tree. Also, when in SGMII mode, the GEM driver needs to ensure the PHY is initialized and powered on. Signed-off-by: Robert Hancock <robert.hancock@calian.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>