summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2022-02-03net: dsa: mv88e6xxx: improve 88e6352 serdes statistics detectionRussell King (Oracle)
The decision whether to report serdes statistics currently depends on the cached C_Mode value for the port, read at probe time or updated by configuration. However, port 4 can be in "automedia" mode when it is used as a serdes port, meaning it switches between the internal PHY and the serdes, changing the read-only C_Mode value depending on which first gains link. Consequently, the C_Mode value read at probe does not accurately reflect whether the port has the serdes associated with it. In "net: dsa: mv88e6xxx: add mv88e6352_g2_scratch_port_has_serdes()", we added a way to read the hardware configuration to determine which port has the serdes associated with it. Use this to determine which port reports the serdes statistics. Reviewed-by: Marek Behún <kabel@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03net: dsa: mv88e6xxx: convert to phylink_generic_validate()Russell King (Oracle)
Now that the mv88e6xxx chip drivers are supplying the supported interfaces and MAC capabilities, switch the driver to use the generic phylink validation implementation by removing our own validation implementations. This causes DSA to call phylink_generic_validate() on our behalf. Reviewed-by: Marek Behún <kabel@kernel.org> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03net: dsa: mv88e6xxx: populate supported_interfaces and mac_capabilitiesRussell King (Oracle)
Populate the supported interfaces and MAC capabilities for the Marvell MV88E6xxx DSA switches in preparation to using these for the validation functionality. Patch co-authored by Marek. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Marek Behún <kabel@kernel.org> [ fixed 6341 and 6393x ] Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03net: dsa: mv88e6xxx: add mv88e6352_g2_scratch_port_has_serdes()Russell King (Oracle)
Read the hardware configuration to determine which port is attached to the serdes. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03net: dsa: mv88e6xxx: Improve multichip isolation of standalone portsTobias Waldekranz
Given that standalone ports are now configured to bypass the ATU and forward all frames towards the upstream port, extend the ATU bypass to multichip systems. Load VID 0 (standalone) into the VTU with the policy bit set. Since VID 4095 (bridged) is already loaded, we now know that all VIDs in use are always available in all VTUs. Therefore, we can safely enable 802.1Q on DSA ports. Setting the DSA ports' VTU policy to TRAP means that all incoming frames on VID 0 will be classified as MGMT - as a result, the ATU is bypassed on all subsequent switches. With this isolation in place, we are able to support configurations that are simultaneously very quirky and very useful. Quirky because it involves looping cables between local switchports like in this example: CPU | .------. .---0---. | .----0----. | sw0 | | | sw1 | '-1-2-3-' | '-1-2-3-4-' $ @ '---' $ @ % % We have three physically looped pairs ($, @, and %). This is very useful because it allows us to run the kernel's kselftests for the bridge on mv88e6xxx hardware. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03net: dsa: mv88e6xxx: Enable port policy support on 6097Tobias Waldekranz
This chip has support for the same per-port policy actions found in later versions of LinkStreet devices. Fixes: f3a2cd326e44 ("net: dsa: mv88e6xxx: introduce .port_set_policy") Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03net: dsa: mv88e6xxx: Support policy entries in the VTUTobias Waldekranz
A VTU entry with policy enabled is used in combination with a port's VTU policy setting to override normal switching behavior for frames assigned to the entry's VID. A typical example is to Treat all frames in a particular VLAN as control traffic, and trap them to the CPU. In which case the relevant user port's VTU policy would be set to TRAP. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03net: dsa: mv88e6xxx: Improve isolation of standalone portsTobias Waldekranz
Clear MapDA on standalone ports to bypass any ATU lookup that might point the packet in the wrong direction. This means that all packets are flooded using the PVT config. So make sure that standalone ports are only allowed to communicate with the local upstream port. Here is a scenario in which this is needed: CPU | .----. .---0---. | .--0--. | sw0 | | | sw1 | '-1-2-3-' | '-1-2-' '---' - sw0p1 and sw1p1 are bridged - sw0p2 and sw1p2 are in standalone mode - Learning must be enabled on sw0p3 in order for hardware forwarding to work properly between bridged ports 1. A packet with SA :aa comes in on sw1p2 1a. Egresses sw1p0 1b. Ingresses sw0p3, ATU adds an entry for :aa towards port 3 1c. Egresses sw0p0 2. A packet with DA :aa comes in on sw0p2 2a. If an ATU lookup is done at this point, the packet will be incorrectly forwarded towards sw0p3. With this change in place, the ATU is bypassed and the packet is forwarded in accordance with the PVT, which only contains the CPU port. Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03ptp: add getcrosststamp() to virtual clocks.Miroslav Lichvar
If the physical clock supports cross timestamping (it has the getcrosststamp() function), provide a wrapper in the virtual clock to enable cross timestamping. This adds support for the PTP_SYS_OFFSET_PRECISE ioctl. Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Cc: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03ptp: add gettimex64() to virtual clocks.Miroslav Lichvar
If the physical clock has the gettimex64() function, provide a gettimex64() wrapper in the virtual clock to enable more accurate and stable synchronization. This adds support for the PTP_SYS_OFFSET_EXTENDED ioctl. Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Cc: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03ptp: increase maximum adjustment of virtual clocks.Miroslav Lichvar
Increase the maximum frequency offset of virtual clocks to 50% to enable faster slewing corrections. This value cannot be represented as scaled ppm when long has 32 bits, but that is already the case for other drivers, even those that provide the adjfine() function, i.e. 32-bit applications are expected to check for the limit. Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Cc: Yangbo Lu <yangbo.lu@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03ptp: unregister virtual clocks when unregistering physical clock.Miroslav Lichvar
When unregistering a physical clock which has some virtual clocks, unregister the virtual clocks with it. This fixes the following oops, which can be triggered by unloading a driver providing a PTP clock when it has enabled virtual clocks: BUG: unable to handle page fault for address: ffffffffc04fc4d8 Oops: 0000 [#1] PREEMPT SMP NOPTI RIP: 0010:ptp_vclock_read+0x31/0xb0 Call Trace: timecounter_read+0xf/0x50 ptp_vclock_refresh+0x2c/0x50 ? ptp_clock_release+0x40/0x40 ptp_aux_kworker+0x17/0x30 kthread_worker_fn+0x9b/0x240 ? kthread_should_park+0x30/0x30 kthread+0xe2/0x110 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x22/0x30 Fixes: 73f37068d540 ("ptp: support ptp physical/virtual clocks conversion") Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Cc: Yangbo Lu <yangbo.lu@nxp.com> Cc: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03net: dsa: xrs700x: convert to phylink_generic_validate()Russell King (Oracle)
Populate the supported interfaces and MAC capabilities for the xrs700x family of DSA switches and remove the old validate implementation to allow DSA to use phylink_generic_validate() for this switch driver. According to commit ee00b24f32eb ("net: dsa: add Arrow SpeedChips XRS700x driver") the switch supports one RMII port and up to three RGMII ports. This commit assumes that port 0 is the RMII port and the remainder are RGMII. This commit also results in the Autoneg bit being set in the ethtool link modes, which wasn't in the original; if this switch supports RGMII to a 10/100/1G PHY, then surely we want to allow Autoneg on the PHY. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03net: dsa: qca8k: convert to phylink_generic_validate()Russell King (Oracle)
Populate the supported interfaces and MAC capabilities for the QCA8K DSA switch and remove the old validate implementation to allow DSA to use phylink_generic_validate() for this switch driver. In making this change, we bring consistency to the ethtool linkmodes that phylink's validate step produces, thereby following the expected behaviour as the phylink documentation has explained. Specifically, the ethtool 1000baseX_Full capability is now permitted for all interface modes, as it is a property of the PHY driver whether 1000baseX fiber connections can be supported. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03net: dsa: ksz8795: convert to phylink_generic_validate()Russell King (Oracle)
Populate the supported interfaces and MAC capabilities for the Microchip KSZ8795 DSA switch and remove the old validate implementation to allow DSA to use phylink_generic_validate() for this switch driver. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03net: dsa: bcm_sf2: convert to phylink_generic_validate()Russell King (Oracle)
Populate the supported interfaces and MAC capabilities for the bcm_sf2 DSA switch and remove the old validate implementation to allow DSA to use phylink_generic_validate() for this switch driver. The exclusion of Gigabit linkmodes for MII and Reverse MII links is handled within phylink_generic_validate() in phylink, so there is no need to make them conditional on the interface mode in the driver. Thanks to Florian Fainelli for suggesting how to populate the supported interfaces. Link: https://lore.kernel.org/r/3b3fed98-0c82-99e9-dc72-09fe01c2bcf3@gmail.com Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-03net: dsa: ar9331: convert to phylink_generic_validate()Russell King (Oracle)
Populate the supported interfaces and MAC capabilities for the AR9331 DSA switch and remove the old validate implementation to allow DSA to use phylink_generic_validate() for this switch driver. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02net: ipa: set IPA v4.11 AP<-modem RX buffer size to 32KBAlex Elder
Increase the receive buffer size used for data received from the modem to 32KB, to improve download performance by allowing much greater aggregation. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-02net: ipa: define per-endpoint receive buffer sizeAlex Elder
Allow RX endpoints to have differing receive buffer sizes. Define the receive buffer size in the configuration data, and use that rather than IPA_RX_BUFFER_SIZE when configuring the endpoint. Add verification in ipa_endpoint_data_valid_one() that the receive buffer specified for AP RX endpoints is both big enough to handle at least one full packet, and not so big in an aggregating endpoint that its size can't be represented when programming the hardware. Move aggr_byte_limit_max() up in "ipa_endpoint.c" so it can be used earlier in the file without a forward-reference. Initially we'll just keep the 8KB receive buffer size already in use for all AP RX endpoints.. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-02drivers: net: Replace acpi_bus_get_device()Rafael J. Wysocki
Replace acpi_bus_get_device() that is going to be dropped with acpi_fetch_acpi_dev(). No intentional functional impact. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/11918902.O9o76ZdvQC@kreacher Link: https://lore.kernel.org/r/11920660.O9o76ZdvQC@kreacher Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-02net: dsa: qca8k: introduce qca8k_bulk_read/write functionAnsuel Smith
Introduce qca8k_bulk_read/write() function to use mgmt Ethernet way to read/write packet in bulk. Make use of this new function in the fdb function and while at it reduce the reg for fdb_read from 4 to 3 as the max bit for the ARL(fdb) table is 83 bits. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02net: dsa: qca8k: add support for larger read/write size with mgmt EthernetAnsuel Smith
mgmt Ethernet packet can read/write up to 16byte at times. The len reg is limited to 15 (0xf). The switch actually sends and accepts data in 4 different steps of len values. Len steps: - 0: nothing - 1-4: first 4 byte - 5-6: first 12 byte - 7-15: all 16 byte In the alloc skb function we check if the len is 16 and we fix it to a len of 15. It the read/write function interest to extract the real asked data. The tagger handler will always copy the fully 16byte with a READ command. This is useful for some big regs like the fdb reg that are more than 4byte of data. This permits to introduce a bulk function that will send and request the entire entry in one go. Write function is changed and it does now require to pass the pointer to val to also handle array val. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02net: dsa: qca8k: cache lo and hi for mdio writeAnsuel Smith
From Documentation, we can cache lo and hi the same way we do with the page. This massively reduce the mdio write as 3/4 of the time as we only require to write the lo or hi part for a mdio write. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02net: dsa: qca8k: move page cache to driver privAnsuel Smith
There can be multiple qca8k switch on the same system. Move the static qca8k_current_page to qca8k_priv and make it specific for each switch. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02net: dsa: qca8k: add support for phy read/write with mgmt EthernetAnsuel Smith
Use mgmt Ethernet also for phy read/write if availabale. Use a different seq number to make sure we receive the correct packet. On any error, we fallback to the legacy mdio read/write. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02net: dsa: qca8k: add support for mib autocast in Ethernet packetAnsuel Smith
The switch can autocast MIB counter using Ethernet packet. Add support for this and provide a handler for the tagger. The switch will send packet with MIB counter for each port, the switch will use completion API to wait for the correct packet to be received and will complete the task only when each packet is received. Although the handler will drop all the other packet, we still have to consume each MIB packet to complete the request. This is done to prevent mixed data with concurrent ethtool request. connect_tag_protocol() is used to add the handler to the tag_qca tagger, master_state_change() use the MIB lock to make sure no MIB Ethernet is in progress. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02net: dsa: qca8k: add support for mgmt read/write in Ethernet packetAnsuel Smith
Add qca8k side support for mgmt read/write in Ethernet packet. qca8k supports some specially crafted Ethernet packet that can be used for mgmt read/write instead of the legacy method uart/internal mdio. This add support for the qca8k side to craft the packet and enqueue it. Each port and the qca8k_priv have a special struct to put data in it. The completion API is used to wait for the packet to be received back with the requested data. The various steps are: 1. Craft the special packet with the qca hdr set to mgmt read/write mode. 2. Set the lock in the dedicated mgmt struct. 3. Increment the seq number and set it in the mgmt pkt 4. Reinit the completion. 5. Enqueue the packet. 6. Wait the packet to be received. 7. Use the data set by the tagger to complete the mdio operation. If the completion timeouts or the ack value is not true, the legacy mdio way is used. It has to be considered that in the initial setup mdio is still used and mdio is still used until DSA is ready to accept and tag packet. tag_proto_connect() is used to fill the required handler for the tagger to correctly parse and elaborate the special Ethernet mdio packet. Locking is added to qca8k_master_change() to make sure no mgmt Ethernet are in progress. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-02net: dsa: qca8k: add tracking state of master portAnsuel Smith
MDIO/MIB Ethernet require the master port and the tagger availabale to correctly work. Use the new api master_state_change to track when master is operational or not and set a bool in qca8k_priv. We cache the first cached master available and we check if other cpu port are operational when the cached one goes down. This cached master will later be used by mdio read/write and mib request to correctly use the working function. qca8k implementation for MDIO/MIB Ethernet is bad. CPU port0 is the only one that answers with the ack packet or sends MIB Ethernet packets. For this reason the master_state_change ignore CPU port6 and only checks CPU port0 if it's operational and enables this mode. Signed-off-by: Ansuel Smith <ansuelsmth@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-01sfc: The size of the RX recycle ring should be more flexibleMartin Habets
Ideally the size would depend on the link speed, but the recycle ring is created when the interface is brought up before the driver knows the link speed. So size it for the maximum speed of a given NIC. PowerPC is only supported on SFN7xxx and SFN8xxx NICs. With this patch on a 40G NIC the number of calls to alloc_pages and friends went down from about 18% to under 2%. On a 10G NIC the number of calls to alloc_pages and friends went down from about 15% to 0 (perf did not capture any calls during the 60 second test). On a 100G NIC the number of calls to alloc_pages and friends went down from about 23% to 4%. Reported-by: Íñigo Huguet <ihuguet@redhat.com> Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com> Link: https://lore.kernel.org/r/20220131111054.cp4f6foyinaarwbn@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-01r8169: support L1.2 control on RTL8168hHeiner Kallweit
According to Realtek RTL8168h supports the same L1.2 control as RTL8125. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/4784d5ce-38ac-046a-cbfa-5fdd9773f820@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-01net: lan966x: Implement get_ts_infoHoratiu Vultur
Implement the function get_ts_info in ethtool_ops which is needed to get the HW capabilities for timestamping. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-01net: lan966x: Add support for ptp interruptsHoratiu Vultur
When doing 2-step timestamping the HW will generate an interrupt when it managed to timestamp a frame. It is the SW responsibility to read it from the FIFO. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-01net: lan966x: Update extraction/injection for timestampingHoratiu Vultur
Update both the extraction and injection to do timestamping of the frames. The extraction is always doing the timestamping while for injection is doing the timestamping only if it is configured. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-01net: lan966x: Implement SIOCSHWTSTAMP and SIOCGHWTSTAMPHoratiu Vultur
Implement the ioctl callbacks SIOCSHWTSTAMP and SIOCGHWTSTAMP to allow to configure the ports to enable/disable timestamping for TX. The RX timestamping is always enabled. The HW is capable to run both 1-step timestamping and 2-step timestamping. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-01net: lan966x: Add support for ptp clocksHoratiu Vultur
The lan966x has 3 PHC. Enable each of them, for now all the timestamping is happening on the first PHC. Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-01net: lan966x: Add registers that are use for ptp functionalityHoratiu Vultur
Add the registers that will be used to configure the PHC in the HW. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-01Merge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/nextDavid S. Miller
-queue Tony Nguyen says: ==================== 10GbE Intel Wired LAN Driver Updates 2022-01-31 Alexander Lobakin says: This is an interpolation of [0] to other Intel Ethernet drivers (and is (re)based on its code). The main aim is to keep XDP metadata not only in case with build_skb(), but also when we do napi_alloc_skb() + memcpy(). All Intel drivers suffers from the same here: - metadata gets lost on XDP_PASS in legacy-rx; - excessive headroom allocation on XSK Rx to skbs; - metadata gets lost on XSK Rx to skbs. Those get especially actual in XDP Hints upcoming. I couldn't have addressed the first one for all Intel drivers due to that they don't reserve any headroom for now in legacy-rx mode even with XDP enabled. This is hugely wrong, but requires quite a bunch of work and a separate series. Luckily, ice doesn't suffer from that. igc has 1 and 3 already fixed in [0]. [0] https://lore.kernel.org/netdev/163700856423.565980.10162564921347693758.stgit@firesoul ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-31sh_eth: kill useless initializers in sh_eth_{suspend|resume}()Sergey Shtylyov
sh_eth_{suspend|resume}() initialize their local variable 'ret' to 0 but this value is never really used, thus we can kill those intializers... Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/f09d7c64-4a2b-6973-09a4-10d759ed0df4@omp.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-31net: ena: Do not waste napi skb cacheHyeonggon Yoo
By profiling, discovered that ena device driver allocates skb by build_skb() and frees by napi_skb_cache_put(). Because the driver does not use napi skb cache in allocation path, napi skb cache is periodically filled and flushed. This is waste of napi skb cache. As ena_alloc_skb() is called only in napi, Use napi_build_skb() and napi_alloc_skb() when allocating skb. This patch was tested on aws a1.metal instance. [ jwiedmann.dev@gmail.com: Use napi_alloc_skb() instead of netdev_alloc_skb_ip_align() to keep things consistent. ] Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Acked-by: Shay Agroskin <shayagr@amazon.com> Link: https://lore.kernel.org/r/YfUAkA9BhyOJRT4B@ip-172-31-19-208.ap-northeast-1.compute.internal Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-31qed: use msleep() in qed_mcp_cmd() and add qed_mcp_cmd_nosleep() for udelay.Venkata Sudheer Kumar Bhavaraju
Change qed_mcp_cmd() to use msleep() (by setting QED_MB_FLAG_CAN_SLEEP flag) and add new nosleep() version of the api. These api are used to issue cmds to management fw and the change affects how driver behaves while waiting for a response/resource. All sleepable callers of the existing api now use msleep() version. For non-sleepable callers, the new nosleep() version is explicitly used. Signed-off-by: Venkata Sudheer Kumar Bhavaraju <vbhavaraju@marvell.com> Signed-off-by: Alok Prasad <palok@marvell.com> Signed-off-by: Ariel Elior <aelior@marvell.com> Link: https://lore.kernel.org/r/20220131005235.1647881-1-vbhavaraju@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-31ixgbe: respect metadata on XSK Rx to skbAlexander Lobakin
For now, if the XDP prog returns XDP_PASS on XSK, the metadata will be lost as it doesn't get copied to the skb. Copy it along with the frame headers. Account its size on skb allocation, and when copying just treat it as a part of the frame and do a pull after to "move" it to the "reserved" zone. net_prefetch() xdp->data_meta and align the copy size to speed-up memcpy() a little and better match ixgbe_construct_skb(). Fixes: d0bcacd0a130 ("ixgbe: add AF_XDP zero-copy Rx support") Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com> Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31ixgbe: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skbAlexander Lobakin
{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD + NET_IP_ALIGN for any skb. OTOH, ixgbe_construct_skb_zc() currently allocates and reserves additional `xdp->data - xdp->data_hard_start`, which is XDP_PACKET_HEADROOM for XSK frames. There's no need for that at all as the frame is post-XDP and will go only to the networking stack core. Pass the size of the actual data only to __napi_alloc_skb() and don't reserve anything. This will give enough headroom for stack processing. Fixes: d0bcacd0a130 ("ixgbe: add AF_XDP zero-copy Rx support") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31ixgbe: pass bi->xdp to ixgbe_construct_skb_zc() directlyAlexander Lobakin
To not dereference bi->xdp each time in ixgbe_construct_skb_zc(), pass bi->xdp as an argument instead of bi. We can also call xsk_buff_free() outside of the function as well as assign bi->xdp to NULL, which seems to make it closer to its name. Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31igc: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skbAlexander Lobakin
{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD + NET_IP_ALIGN for any skb. OTOH, igc_construct_skb_zc() currently allocates and reserves additional `xdp->data_meta - xdp->data_hard_start`, which is about XDP_PACKET_HEADROOM for XSK frames. There's no need for that at all as the frame is post-XDP and will go only to the networking stack core. Pass the size of the actual data only (+ meta) to __napi_alloc_skb() and don't reserve anything. This will give enough headroom for stack processing. Also, net_prefetch() xdp->data_meta and align the copy size to speed-up memcpy() a little and better match igc_construct_skb(). Fixes: fc9df2a0b520 ("igc: Enable RX via AF_XDP zero-copy") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31ice: respect metadata on XSK Rx to skbAlexander Lobakin
For now, if the XDP prog returns XDP_PASS on XSK, the metadata will be lost as it doesn't get copied to the skb. Copy it along with the frame headers. Account its size on skb allocation, and when copying just treat it as a part of the frame and do a pull after to "move" it to the "reserved" zone. net_prefetch() xdp->data_meta and align the copy size to speed-up memcpy() a little and better match ice_construct_skb(). Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com> Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31ice: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skbAlexander Lobakin
{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD + NET_IP_ALIGN for any skb. OTOH, ice_construct_skb_zc() currently allocates and reserves additional `xdp->data - xdp->data_hard_start`, which is XDP_PACKET_HEADROOM for XSK frames. There's no need for that at all as the frame is post-XDP and will go only to the networking stack core. Pass the size of the actual data only to __napi_alloc_skb() and don't reserve anything. This will give enough headroom for stack processing. Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31ice: respect metadata in legacy-rx/ice_construct_skb()Alexander Lobakin
In "legacy-rx" mode represented by ice_construct_skb(), we can still use XDP (and XDP metadata), but after XDP_PASS the metadata will be lost as it doesn't get copied to the skb. Copy it along with the frame headers. Account its size on skb allocation, and when copying just treat it as a part of the frame and do a pull after to "move" it to the "reserved" zone. Point net_prefetch() to xdp->data_meta instead of data. This won't change anything when the meta is not here, but will save some cache misses otherwise. Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com> Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31i40e: respect metadata on XSK Rx to skbAlexander Lobakin
For now, if the XDP prog returns XDP_PASS on XSK, the metadata will be lost as it doesn't get copied to the skb. Copy it along with the frame headers. Account its size on skb allocation, and when copying just treat it as a part of the frame and do a pull after to "move" it to the "reserved" zone. net_prefetch() xdp->data_meta and align the copy size to speed-up memcpy() a little and better match i40e_construct_skb(). Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support") Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com> Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31i40e: don't reserve excessive XDP_PACKET_HEADROOM on XSK Rx to skbAlexander Lobakin
{__,}napi_alloc_skb() allocates and reserves additional NET_SKB_PAD + NET_IP_ALIGN for any skb. OTOH, i40e_construct_skb_zc() currently allocates and reserves additional `xdp->data - xdp->data_hard_start`, which is XDP_PACKET_HEADROOM for XSK frames. There's no need for that at all as the frame is post-XDP and will go only to the networking stack core. Pass the size of the actual data only to __napi_alloc_skb() and don't reserve anything. This will give enough headroom for stack processing. Fixes: 0a714186d3c0 ("i40e: add AF_XDP zero-copy Rx support") Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31net: mana: Reuse XDP dropped pageHaiyang Zhang
Reuse the dropped page in RX path to save page allocation overhead. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>