summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet
AgeCommit message (Collapse)Author
2024-03-05igc: avoid returning frame twice in XDP_REDIRECTFlorian Kauer
When a frame can not be transmitted in XDP_REDIRECT (e.g. due to a full queue), it is necessary to free it by calling xdp_return_frame_rx_napi. However, this is the responsibility of the caller of the ndo_xdp_xmit (see for example bq_xmit_all in kernel/bpf/devmap.c) and thus calling it inside igc_xdp_xmit (which is the ndo_xdp_xmit of the igc driver) as well will lead to memory corruption. In fact, bq_xmit_all expects that it can return all frames after the last successfully transmitted one. Therefore, break for the first not transmitted frame, but do not call xdp_return_frame_rx_napi in igc_xdp_xmit. This is equally implemented in other Intel drivers such as the igb. There are two alternatives to this that were rejected: 1. Return num_frames as all the frames would have been transmitted and release them inside igc_xdp_xmit. While it might work technically, it is not what the return value is meant to represent (i.e. the number of SUCCESSFULLY transmitted packets). 2. Rework kernel/bpf/devmap.c and all drivers to support non-consecutively dropped packets. Besides being complex, it likely has a negative performance impact without a significant gain since it is anyway unlikely that the next frame can be transmitted if the previous one was dropped. The memory corruption can be reproduced with the following script which leads to a kernel panic after a few seconds. It basically generates more traffic than a i225 NIC can transmit and pushes it via XDP_REDIRECT from a virtual interface to the physical interface where frames get dropped. #!/bin/bash INTERFACE=enp4s0 INTERFACE_IDX=`cat /sys/class/net/$INTERFACE/ifindex` sudo ip link add dev veth1 type veth peer name veth2 sudo ip link set up $INTERFACE sudo ip link set up veth1 sudo ip link set up veth2 cat << EOF > redirect.bpf.c SEC("prog") int redirect(struct xdp_md *ctx) { return bpf_redirect($INTERFACE_IDX, 0); } char _license[] SEC("license") = "GPL"; EOF clang -O2 -g -Wall -target bpf -c redirect.bpf.c -o redirect.bpf.o sudo ip link set veth2 xdp obj redirect.bpf.o cat << EOF > pass.bpf.c SEC("prog") int pass(struct xdp_md *ctx) { return XDP_PASS; } char _license[] SEC("license") = "GPL"; EOF clang -O2 -g -Wall -target bpf -c pass.bpf.c -o pass.bpf.o sudo ip link set $INTERFACE xdp obj pass.bpf.o cat << EOF > trafgen.cfg { /* Ethernet Header */ 0xe8, 0x6a, 0x64, 0x41, 0xbf, 0x46, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, const16(ETH_P_IP), /* IPv4 Header */ 0b01000101, 0, # IPv4 version, IHL, TOS const16(1028), # IPv4 total length (UDP length + 20 bytes (IP header)) const16(2), # IPv4 ident 0b01000000, 0, # IPv4 flags, fragmentation off 64, # IPv4 TTL 17, # Protocol UDP csumip(14, 33), # IPv4 checksum /* UDP Header */ 10, 0, 1, 1, # IP Src - adapt as needed 10, 0, 1, 2, # IP Dest - adapt as needed const16(6666), # UDP Src Port const16(6666), # UDP Dest Port const16(1008), # UDP length (UDP header 8 bytes + payload length) csumudp(14, 34), # UDP checksum /* Payload */ fill('W', 1000), } EOF sudo trafgen -i trafgen.cfg -b3000MB -o veth1 --cpp Fixes: 4ff320361092 ("igc: Add support for XDP_REDIRECT action") Signed-off-by: Florian Kauer <florian.kauer@linutronix.de> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-05i40e: Fix firmware version comparison functionIvan Vecera
Helper i40e_is_fw_ver_eq() compares incorrectly given firmware version as it returns true when the major version of running firmware is greater than the given major version that is wrong and results in failure during getting of DCB configuration where this helper is used. Fix the check and return true only if the running FW version is exactly equals to the given version. Reproducer: 1. Load i40e driver 2. Check dmesg output [root@host ~]# modprobe i40e [root@host ~]# dmesg | grep 'i40e.*DCB' [ 74.750642] i40e 0000:02:00.0: Query for DCB configuration failed, err -EIO aq_err I40E_AQ_RC_EINVAL [ 74.759770] i40e 0000:02:00.0: DCB init failed -5, disabled [ 74.966550] i40e 0000:02:00.1: Query for DCB configuration failed, err -EIO aq_err I40E_AQ_RC_EINVAL [ 74.975683] i40e 0000:02:00.1: DCB init failed -5, disabled Fixes: cf488e13221f ("i40e: Add other helpers to check version of running firmware and AQ API") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-05ice: fix typo in assignmentJesse Brandeburg
Fix an obviously incorrect assignment, created with a typo or cut-n-paste error. Fixes: 5995ef88e3a8 ("ice: realloc VSI stats arrays") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-05ice: fix uninitialized dplls mutex usageMichal Schmidt
The pf->dplls.lock mutex is initialized too late, after its first use. Move it to the top of ice_dpll_init. Note that the "err_exit" error path destroys the mutex. And the mutex is the last thing destroyed in ice_dpll_deinit. This fixes the following warning with CONFIG_DEBUG_MUTEXES: ice 0000:10:00.0: The DDP package was successfully loaded: ICE OS Default Package version 1.3.36.0 ice 0000:10:00.0: 252.048 Gb/s available PCIe bandwidth (16.0 GT/s PCIe x16 link) ice 0000:10:00.0: PTP init successful ------------[ cut here ]------------ DEBUG_LOCKS_WARN_ON(lock->magic != lock) WARNING: CPU: 0 PID: 410 at kernel/locking/mutex.c:587 __mutex_lock+0x773/0xd40 Modules linked in: crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni polyval_generic ice(+) nvme nvme_c> CPU: 0 PID: 410 Comm: kworker/0:4 Not tainted 6.8.0-rc5+ #3 Hardware name: HPE ProLiant DL110 Gen10 Plus/ProLiant DL110 Gen10 Plus, BIOS U56 10/19/2023 Workqueue: events work_for_cpu_fn RIP: 0010:__mutex_lock+0x773/0xd40 Code: c0 0f 84 1d f9 ff ff 44 8b 35 0d 9c 69 01 45 85 f6 0f 85 0d f9 ff ff 48 c7 c6 12 a2 a9 85 48 c7 c7 12 f1 a> RSP: 0018:ff7eb1a3417a7ae0 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000000 RDX: 0000000000000002 RSI: ffffffff85ac2bff RDI: 00000000ffffffff RBP: ff7eb1a3417a7b80 R08: 0000000000000000 R09: 00000000ffffbfff R10: ff7eb1a3417a7978 R11: ff32b80f7fd2e568 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ff32b7f02c50e0d8 FS: 0000000000000000(0000) GS:ff32b80efe800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055b5852cc000 CR3: 000000003c43a004 CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ? __warn+0x84/0x170 ? __mutex_lock+0x773/0xd40 ? report_bug+0x1c7/0x1d0 ? prb_read_valid+0x1b/0x30 ? handle_bug+0x42/0x70 ? exc_invalid_op+0x18/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? __mutex_lock+0x773/0xd40 ? rcu_is_watching+0x11/0x50 ? __kmalloc_node_track_caller+0x346/0x490 ? ice_dpll_lock_status_get+0x28/0x50 [ice] ? __pfx_ice_dpll_lock_status_get+0x10/0x10 [ice] ? ice_dpll_lock_status_get+0x28/0x50 [ice] ice_dpll_lock_status_get+0x28/0x50 [ice] dpll_device_get_one+0x14f/0x2e0 dpll_device_event_send+0x7d/0x150 dpll_device_register+0x124/0x180 ice_dpll_init_dpll+0x7b/0xd0 [ice] ice_dpll_init+0x224/0xa40 [ice] ? _dev_info+0x70/0x90 ice_load+0x468/0x690 [ice] ice_probe+0x75b/0xa10 [ice] ? _raw_spin_unlock_irqrestore+0x4f/0x80 ? process_one_work+0x1a3/0x500 local_pci_probe+0x47/0xa0 work_for_cpu_fn+0x17/0x30 process_one_work+0x20d/0x500 worker_thread+0x1df/0x3e0 ? __pfx_worker_thread+0x10/0x10 kthread+0x103/0x140 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x31/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> irq event stamp: 125197 hardirqs last enabled at (125197): [<ffffffff8416409d>] finish_task_switch.isra.0+0x12d/0x3d0 hardirqs last disabled at (125196): [<ffffffff85134044>] __schedule+0xea4/0x19f0 softirqs last enabled at (105334): [<ffffffff84e1e65a>] napi_get_frags_check+0x1a/0x60 softirqs last disabled at (105332): [<ffffffff84e1e65a>] napi_get_frags_check+0x1a/0x60 ---[ end trace 0000000000000000 ]--- Fixes: d7999f5ea64b ("ice: implement dpll interface to control cgu") Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-05net: ice: Fix potential NULL pointer dereference in ice_bridge_setlink()Rand Deeb
The function ice_bridge_setlink() may encounter a NULL pointer dereference if nlmsg_find_attr() returns NULL and br_spec is dereferenced subsequently in nla_for_each_nested(). To address this issue, add a check to ensure that br_spec is not NULL before proceeding with the nested attribute iteration. Fixes: b1edc14a3fbf ("ice: Implement ice_bridge_getlink and ice_bridge_setlink") Signed-off-by: Rand Deeb <rand.sec96@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-05ice: virtchnl: stop pretending to support RSS over AQ or registersJacob Keller
The E800 series hardware uses the same iAVF driver as older devices, including the virtchnl negotiation scheme. This negotiation scheme includes a mechanism to determine what type of RSS should be supported, including RSS over PF virtchnl messages, RSS over firmware AdminQ messages, and RSS via direct register access. The PF driver will always prefer VIRTCHNL_VF_OFFLOAD_RSS_PF if its supported by the VF driver. However, if an older VF driver is loaded, it may request only VIRTCHNL_VF_OFFLOAD_RSS_REG or VIRTCHNL_VF_OFFLOAD_RSS_AQ. The ice driver happily agrees to support these methods. Unfortunately, the underlying hardware does not support these mechanisms. The E800 series VFs don't have the appropriate registers for RSS_REG. The mailbox queue used by VFs for VF to PF communication blocks messages which do not have the VF-to-PF opcode. Stop lying to the VF that it could support RSS over AdminQ or registers, as these interfaces do not work when the hardware is operating on an E800 series device. In practice this is unlikely to be hit by any normal user. The iAVF driver has supported RSS over PF virtchnl commands since 2016, and always defaults to using RSS_PF if possible. In principle, nothing actually stops the existing VF from attempting to access the registers or send an AQ command. However a properly coded VF will check the capability flags and will report a more useful error if it detects a case where the driver does not support the RSS offloads that it does. Fixes: 1071a8358a28 ("ice: Implement virtchnl commands for AVF support") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Alan Brady <alan.brady@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-05idpf: disable local BH when scheduling napi for marker packetsEmil Tantilov
Fix softirq's not being handled during napi_schedule() call when receiving marker packets for queue disable by disabling local bottom half. The issue can be seen on ifdown: NOHZ tick-stop error: Non-RCU local softirq work is pending, handler #08!!! Using ftrace to catch the failing scenario: ifconfig [003] d.... 22739.830624: softirq_raise: vec=3 [action=NET_RX] <idle>-0 [003] ..s.. 22739.831357: softirq_entry: vec=3 [action=NET_RX] No interrupt and CPU is idle. After the patch when disabling local BH before calling napi_schedule: ifconfig [003] d.... 22993.928336: softirq_raise: vec=3 [action=NET_RX] ifconfig [003] ..s1. 22993.928337: softirq_entry: vec=3 [action=NET_RX] Fixes: c2d548cad150 ("idpf: add TX splitq napi poll support") Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-05net: introduce page_frag_cache_drain()Yunsheng Lin
When draining a page_frag_cache, most user are doing the similar steps, so introduce an API to avoid code duplication. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-05net: txgbe: fix to clear interrupt status after handling IRQJiawen Wu
GPIO EOI is not set to clear interrupt status after handling the interrupt. It should be done in irq_chip->irq_ack, but this function is not called in handle_nested_irq(). So executing function txgbe_gpio_irq_ack() manually in txgbe_gpio_irq_handler(). Fixes: aefd013624a1 ("net: txgbe: use irq_domain for interrupt controller") Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://lore.kernel.org/r/20240301092956.18544-2-jiawenwu@trustnetic.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-05net: txgbe: fix GPIO interrupt blockingJiawen Wu
The register of GPIO interrupt status is masked before MAC IRQ is enabled. This is because of hardware deficiency. So manually clear the interrupt status before using them. Otherwise, GPIO interrupts will never be reported again. There is a workaround for clearing interrupts to set GPIO EOI in txgbe_up_complete(). Fixes: aefd013624a1 ("net: txgbe: use irq_domain for interrupt controller") Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://lore.kernel.org/r/20240301092956.18544-1-jiawenwu@trustnetic.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-04Merge tag 'mlx5-fixes-2024-03-01' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2024-03-01 This series provides bug fixes to mlx5 driver. Please pull and let me know if there is any problem. * tag 'mlx5-fixes-2024-03-01' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux: net/mlx5e: Switch to using _bh variant of of spinlock API in port timestamping NAPI poll context net/mlx5e: Use a memory barrier to enforce PTP WQ xmit submission tracking occurs after populating the metadata_map net/mlx5e: Fix MACsec state loss upon state update in offload path net/mlx5e: Change the warning when ignore_flow_level is not supported net/mlx5: Check capability for fw_reset net/mlx5: Fix fw reporter diagnose output net/mlx5: E-switch, Change flow rule destination checking Revert "net/mlx5e: Check the number of elements before walk TC rhashtable" Revert "net/mlx5: Block entering switchdev mode with ns inconsistency" ==================== Link: https://lore.kernel.org/r/20240302070318.62997-1-saeed@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-04Merge branch '10GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-03-01 (ixgbe, i40e, ice) This series contains updates to ixgbe, i40e, and ice drivers. Maciej corrects disable flow for ixgbe, i40e, and ice drivers which could cause non-functional interface with AF_XDP. Michal restores host configuration when changing MSI-X count for VFs on ice driver. * '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: reconfig host after changing MSI-X on VF ice: reorder disabling IRQ and NAPI in ice_qp_dis i40e: disable NAPI right after disabling irqs when handling xsk_pool ixgbe: {dis, en}able irqs in ixgbe_txrx_ring_{dis, en}able ==================== Link: https://lore.kernel.org/r/20240301192549.2993798-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-04e1000e: Minor flow correction in e1000_shutdown functionVitaly Lifshits
Add curly braces to avoid entering to an if statement where it is not always required in e1000_shutdown function. This improves code readability and might prevent non-deterministic behaviour in the future. Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240301184806.2634508-5-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-04igc: fix LEDS_CLASS dependencyArnd Bergmann
When IGC is built-in but LEDS_CLASS is a loadable module, there is a link failure: x86_64-linux-ld: drivers/net/ethernet/intel/igc/igc_leds.o: in function `igc_led_setup': igc_leds.c:(.text+0x75c): undefined reference to `devm_led_classdev_register_ext' Add another dependency that prevents this combination. Fixes: ea578703b03d ("igc: Add support for LEDs on i225/i226") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240301184806.2634508-4-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-04ixgbe: Add 1000BASE-BX supportErnesto Castellotti
Added support for 1000BASE-BX, i.e. Gigabit Ethernet over single strand of single-mode fiber. The initialization of a 1000BASE-BX SFP is the same as 1000BASE-SX/LX with the only difference that the Bit Rate Nominal Value must be checked to make sure it is a Gigabit Ethernet transceiver, as described by the SFF-8472 specification. This was tested with the FS.com SFP-GE-BX 1310/1490nm 10km transceiver: $ ethtool -m eth4 Identifier : 0x03 (SFP) Extended identifier : 0x04 (GBIC/SFP defined by 2-wire interface ID) Connector : 0x07 (LC) Transceiver codes : 0x00 0x00 0x00 0x40 0x00 0x00 0x00 0x00 0x00 Transceiver type : Ethernet: BASE-BX10 Encoding : 0x01 (8B/10B) BR, Nominal : 1300MBd Rate identifier : 0x00 (unspecified) Length (SMF,km) : 10km Length (SMF) : 10000m Length (50um) : 0m Length (62.5um) : 0m Length (Copper) : 0m Length (OM3) : 0m Laser wavelength : 1310nm Vendor name : FS Vendor OUI : 64:9d:99 Vendor PN : SFP-GE-BX Vendor rev : Option values : 0x20 0x0a Option : RX_LOS implemented Option : TX_FAULT implemented Option : Power level 3 requirement BR margin, max : 0% BR margin, min : 0% Vendor SN : S2202359108 Date code : 220307 Optical diagnostics support : Yes Laser bias current : 17.650 mA Laser output power : 0.2132 mW / -6.71 dBm Receiver signal average optical power : 0.2740 mW / -5.62 dBm Module temperature : 47.30 degrees C / 117.13 degrees F Module voltage : 3.2576 V Alarm/warning flags implemented : Yes Laser bias current high alarm : Off Laser bias current low alarm : Off Laser bias current high warning : Off Laser bias current low warning : Off Laser output power high alarm : Off Laser output power low alarm : Off Laser output power high warning : Off Laser output power low warning : Off Module temperature high alarm : Off Module temperature low alarm : Off Module temperature high warning : Off Module temperature low warning : Off Module voltage high alarm : Off Module voltage low alarm : Off Module voltage high warning : Off Module voltage low warning : Off Laser rx power high alarm : Off Laser rx power low alarm : Off Laser rx power high warning : Off Laser rx power low warning : Off Laser bias current high alarm threshold : 110.000 mA Laser bias current low alarm threshold : 1.000 mA Laser bias current high warning threshold : 100.000 mA Laser bias current low warning threshold : 1.000 mA Laser output power high alarm threshold : 0.7079 mW / -1.50 dBm Laser output power low alarm threshold : 0.0891 mW / -10.50 dBm Laser output power high warning threshold : 0.6310 mW / -2.00 dBm Laser output power low warning threshold : 0.1000 mW / -10.00 dBm Module temperature high alarm threshold : 90.00 degrees C / 194.00 degrees F Module temperature low alarm threshold : -45.00 degrees C / -49.00 degrees F Module temperature high warning threshold : 85.00 degrees C / 185.00 degrees F Module temperature low warning threshold : -40.00 degrees C / -40.00 degrees F Module voltage high alarm threshold : 3.7950 V Module voltage low alarm threshold : 2.8050 V Module voltage high warning threshold : 3.4650 V Module voltage low warning threshold : 3.1350 V Laser rx power high alarm threshold : 0.7079 mW / -1.50 dBm Laser rx power low alarm threshold : 0.0028 mW / -25.53 dBm Laser rx power high warning threshold : 0.6310 mW / -2.00 dBm Laser rx power low warning threshold : 0.0032 mW / -24.95 dBm Signed-off-by: Ernesto Castellotti <ernesto@castellotti.net> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240301184806.2634508-3-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-04intel: make module parameters readable in sys filesystemJon Maxwell
Linux users sometimes need an easy way to check current values of module parameters. For example the module may be manually reloaded with different parameters. Make these visible and readable in the /sys filesystem to allow that. But don't make the "debug" module parameter visible as debugging is enabled via ethtool msglvl. Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240301184806.2634508-2-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-04net: sparx5: Fix use after free inside sparx5_del_mact_entryHoratiu Vultur
Based on the static analyzis of the code it looks like when an entry from the MAC table was removed, the entry was still used after being freed. More precise the vid of the mac_entry was used after calling devm_kfree on the mac_entry. The fix consists in first using the vid of the mac_entry to delete the entry from the HW and after that to free it. Fixes: b37a1bae742f ("net: sparx5: add mactable support") Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240301080608.3053468-1-horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-04ice: avoid unnecessary devm_ usageMaciej Fijalkowski
1. pcaps are free'd right after AQ routines are done, no need for devm_'s 2. a test frame for loopback test in ethtool -t is destroyed at the end of the test so we don't need devm_ here either. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04ice: do not disable Tx queues twice in ice_down()Maciej Fijalkowski
ice_down() clears QINT_TQCTL_CAUSE_ENA_M bit twice, which is not necessary. First clearing happens in ice_vsi_dis_irq() and second in ice_vsi_stop_tx_ring() - remove the first one. While at it, make ice_vsi_dis_irq() static as ice_down() is the only current caller of it. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04ice: cleanup line splitting for context set functionsJacob Keller
The indentation for ice_set_ctx and ice_write_rxq_ctx breaks the function name after the return type. This style of breaking is used a lot throughout the ice driver, even in cases where its not actually helpful for readability. We no longer prefer this style of line splitting in the driver, and new code is avoiding it. Normally, I would leave this alone unless the actual function contents or description needed updating. However, a future change is going to add inverse functions for converting packed context to unpacked context structures. To keep this code uniform with the existing set functions, fix up the style to the modern format of keeping the type on the same line. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04ice: use GENMASK instead of BIT(n) - 1 in pack functionsJacob Keller
The functions used to pack the Tx and Rx context into the hardware format rely on using BIT() and then subtracting 1 to get a bitmask. These functions even have a comment about how x86 machines can't use this method for certain widths because the SHL instructions will not work properly. The Linux kernel already provides the GENMASK macro for generating a suitable bitmask. Further, GENMASK is capable of generating the mask including the shift_width. Since width is the total field width, take care to subtract one to get the final bit position. Since we now include the shifted bits as part of the mask, shift the source value first before applying the mask. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04ice: rename ice_write_* functions to ice_pack_ctx_*Jacob Keller
In ice_common.c there are 4 functions used for converting the unpacked software Tx and Rx context structure data into the packed format used by hardware. These functions have extremely generic names: * ice_write_byte * ice_write_word * ice_write_dword * ice_write_qword When I saw these function names my first thought was "write what? to where?". Understanding what these functions do requires looking at the implementation details. The functions take bits from an unpacked structure and copy them into the packed layout used by hardware. As part of live migration, we will want functions which perform the inverse operation of reading bits from the packed layout and copying them into the unpacked format. Naming these as "ice_read_byte", etc would be very confusing since they appear to write data. In preparation for adding this new inverse operation, rename the existing functions to use the prefix "ice_pack_ctx_". This makes it clear that they perform the bit packing while copying from the unpacked software context structure to the packed hardware context. The inverse operations can then neatly be named ice_unpack_ctx_*, clearly indicating they perform the bit unpacking while copying from the packed hardware context to the unpacked software context structure. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04ice: remove vf->lan_vsi_num fieldJacob Keller
The lan_vsi_num field of the VF structure is no longer used for any purpose. Remove it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04ice: use relative VSI index for VFs instead of PF VSI numberJacob Keller
When initializing over virtchnl, the PF is required to pass a VSI ID to the VF as part of its capabilities exchange. The VF driver reports this value back to the PF in a variety of commands. The PF driver validates that this value matches the value it sent to the VF. Some hardware families such as the E700 series could use this value when reading RSS registers or communicating directly with firmware over the Admin Queue. However, E800 series hardware does not support any of these interfaces and the VF's only use for this value is to report it back to the PF. Thus, there is no requirement that this value be an actual VSI ID value of any kind. The PF driver already does not trust that the VF sends it a real VSI ID. The VSI structure is always looked up from the VF structure. The PF does validate that the VSI ID provided matches a VSI associated with the VF, but otherwise does not use the VSI ID for any purpose. Instead of reporting the VSI number relative to the PF space, report a fixed value of 1. When communicating with the VF over virtchnl, validate that the VSI number is returned appropriately. This avoids leaking information about the firmware of the PF state. Currently the ice driver only supplies a VF with a single VSI. However, it appears that virtchnl has some support for allowing multiple VSIs. I did not attempt to implement this. However, space is left open to allow further relative indexes if additional VSIs are provided in future feature development. For this reason, keep the ice_vc_isvalid_vsi_id function in place to allow extending it for multiple VSIs in the future. This change will also simplify handling of live migration in a future series. Since we no longer will provide a real VSI number to the VF, there will be no need to keep track of this number when migrating to a new host. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04ice: remove unnecessary duplicate checks for VF VSI IDJacob Keller
The ice_vc_fdir_param_check() function validates that the VSI ID of the virtchnl flow director command matches the VSI number of the VF. This is already checked by the call to ice_vc_isvalid_vsi_id() immediately following this. This check is unnecessary since ice_vc_isvalid_vsi_id() already confirms this by checking that the VSI ID can locate the VSI associated with the VF structure. Furthermore, a following change is going to refactor the ice driver to report VSI IDs using a relative index for each VF instead of reporting the PF VSI number. This additional check would break that logic since it enforces that the VSI ID matches the VSI number. Since this check duplicates the logic in ice_vc_isvalid_vsi_id() and gets in the way of refactoring that logic, remove it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04ice: pass VSI pointer into ice_vc_isvalid_q_idJacob Keller
The ice_vc_isvalid_q_id() function takes a VSI index and a queue ID. It looks up the VSI from its index, and then validates that the queue number is valid for that VSI. The VSI ID passed is typically a VSI index from the VF. This VSI number is validated by the PF to ensure that it matches the VSI associated with the VF already. In every flow where ice_vc_isvalid_q_id() is called, the PF driver already has a pointer to the VSI associated with the VF. This pointer is obtained using ice_get_vf_vsi(), rather than looking up the VSI using the index sent by the VF. Since we already know which VSI to operate on, we can modify ice_vc_isvalid_q_id() to take a VSI pointer instead of a VSI index. Pass the VSI we found from ice_get_vf_vsi() instead of re-doing the lookup. This removes some unnecessary computation and scanning of the VSI list. It also removes the last place where the driver directly used the VSI number from the VF. This will pave the way for refactoring to communicate relative VSI numbers to the VF instead of absolute numbers from the PF space. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04idpf: remove dealloc vector msg err in idpf_intr_relAlan Brady
This error message is at best not really helpful and at worst misleading. If we're here in idpf_intr_rel we're likely trying to do remove or reset. If we're in reset, this message will fail because we lose the virtchnl on reset and HW is going to clean up those resources regardless in that case. If we're in remove and we get an error here, we're going to reset the device at the end of remove anyway so not a big deal. Just remove this message it's not useful. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04idpf: fix minor controlq issuesAlan Brady
While we're here improving virtchnl we can include two minor fixes for the lower level ctrlq flow. This adds a memory barrier to idpf_post_rx_buffs before we update tail on the controlq. We should make sure our writes have had a chance to finish before we tell HW it can touch them. This also removes some defensive programming in idpf_ctrlq_recv. The caller should not be using a num_q_msg value of zero or more than the ring size and it's their responsibility to call functions sanely. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04idpf: prevent deinit uninitialized virtchnl coreAlan Brady
In idpf_remove we need to tear down the virtchnl core with idpf_vc_core_deinit so we can free up resources and leave things in a good state. However, in the case where we failed to establish VC communications we may not have ever actually successfully initialized the virtchnl core. This fixes it by setting a bit once we successfully init the virtchnl core. Then, in deinit, we'll check for it before going on further, otherwise we just return. Also clear the bit at the end of deinit so we know it's gone now. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04idpf: cleanup virtchnl cruftAlan Brady
We can now remove a bunch of gross code we don't need anymore like the vc state bits and vc_buf_lock since everything is using transaction API now. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04idpf: refactor idpf_recv_mb_msgAlan Brady
Now that all the messages are using the transaction API, we can rework idpf_recv_mb_msg quite a lot to simplify it. Due to this, we remove idpf_find_vport as no longer used and alter idpf_recv_event_msg slightly. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04idpf: add async_handler for MAC filter messagesAlan Brady
There are situations where the driver needs to add a MAC filter but we're explicitly not allowed to sleep so we can wait for a virtchnl message to complete. This adds an async_handler for asynchronously sent messages for MAC filters so that we can better handle if there's an error of some kind. If success we don't need to do anything else, but if we failed to program the new filter we really should remove it from our list of MAC filters. If we don't remove bad filters, what I expect to happen is after a reset of some kind we try to program the MAC filter again and it fails again. This is clearly wrong and I would expect to be confusing for the user. It could also be the failure is for a delete MAC filter message but those filters get deleted regardless. Not much we can do about a delete failure. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04idpf: refactor remaining virtchnl messagesAlan Brady
This takes care of RSS/SRIOV/MAC and other misc virtchnl messages. This again is mostly mechanical. In absence of an async_handler for MAC filters, this will simply generically report any errors from idpf_vc_xn_forward_async. This maintains the existing behavior. Follow up patch will add an async handler for MAC filters to remove bad filters from our list. While we're here we can also make the code much nicer by converting some variables to auto-variables where appropriate. This makes it cleaner and less prone to memory leaking. There's still a bit more cleanup we can do here to remove stuff that's not being used anymore now; follow-up patches will take care of loose ends. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04idpf: refactor queue related virtchnl messagesAlan Brady
This reworks queue specific virtchnl messages to use the added transaction API. It is fairly mechanical and generally makes the functions using it more simple. Functions using transaction API no longer need to take the vc_buf_lock since it's not using it anymore. After filling out an idpf_vc_xn_params struct, idpf_vc_xn_exec takes care of the send and recv handling. This also converts those functions where appropriate to use auto-variables instead of manually calling kfree. This greatly simplifies the memory alloc paths and makes them less prone memory leaks. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04idpf: refactor vport virtchnl messagesAlan Brady
This reworks the way vport related virtchnl messages work to take advantage of the added transaction API. It is fairly mechanical as, to use the transaction API, the function just needs to fill out an appropriate idpf_vc_xn_params struct to pass to idpf_vc_xn_exec which will take care of the actual send and recv. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com> Co-developed-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04idpf: implement virtchnl transaction managerAlan Brady
This starts refactoring how virtchnl messages are handled by adding a transaction manager (idpf_vc_xn_manager). There are two primary motivations here which are to enable handling of multiple messages at once and to make it more robust in general. As it is right now, the driver may only have one pending message at a time and there's no guarantee that the response we receive was actually intended for the message we sent prior. This works by utilizing a "cookie" field of the message descriptor. It is arbitrary what data we put in the cookie and the response is required to have the same cookie the original message was sent with. Then using a "transaction" abstraction that uses the completion API to pair responses to the message it belongs to. The cookie works such that the first half is the index to the transaction in our array, and the second half is a "salt" that gets incremented every message. This enables quick lookups into the array and also ensuring we have the correct message. The salt is necessary because after, for example, a message times out and we deem the response was lost for some reason, we could theoretically reuse the same index but using a different salt ensures that when we do actually get a response it's not the old message that timed out previously finally coming in. Since the number of transactions allocated is U8_MAX and the salt is 8 bits, we can never have a conflict because we can't roll over the salt without using more transactions than we have available. This starts by only converting the VIRTCHNL2_OP_VERSION message to use this new transaction API. Follow up patches will convert all virtchnl messages to use the API. Tested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Igor Bagnucki <igor.bagnucki@intel.com> Co-developed-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04idpf: add idpf_virtchnl.hAlan Brady
idpf.h is quite heavy. We can reduce the burden a fair bit by introducing an idpf_virtchnl.h file. This mostly just moves function declarations but there are many of them. This also makes an attempt to group those declarations in a way that makes some sense instead of mishmashed. Suggested-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Alan Brady <alan.brady@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04Octeontx2-af: Fix an issue in firmware shared data reserved spaceHariprasad Kelam
The last patch which added support to extend the firmware shared data to add channel data information has introduced a bug due to the reserved space not adjusted accordingly. This patch fixes the issue and also adds BUILD_BUG to avoid this regression error. Fixes: 997814491cee ("Octeontx2-af: Fetch MAC channel info from firmware") Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04eth: igc: remove unused embedded struct net_deviceJakub Kicinski
struct net_device poll_dev in struct igc_q_vector was added in one of the initial commits, but never used. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04gve: Add header split ethtool statsJeroen de Borst
To record the stats of header split packets, three stats are added in the driver's ethtool stats. - rx_hsplit_pkt is the split packets count with header split - rx_hsplit_bytes is the received header bytes count with header split - rx_hsplit_unsplit_pkt is the unsplit packet count due to header buffer overflow or zero header length when header split is enabled Currently, it's entering the stats_update critical section more than once per packet. We have plans to avoid that in the future change to let all the stats_update happen in one place at the end of `gve_rx_poll_dqo`. Co-developed-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04gve: Add header split data pathJeroen de Borst
Add header buffers and ethtool support to enable header split via the tcp-data-split flag in ethtool's ringparam config. A coherent dma memory is allocated for the header buffers. There is one header buffer per ring entry by calculating the offset to the header-buffers starting address. The header buffer is always copied directly into the skb and payload is always added as frags. When there is a header buffer overflow or the header length is 0, the driver places the whole unsplit packet in frags. When toggling header split, the driver will call gve_adjust_config to set its queues appropriately. If header split is enabled by the user and the max packet buffer size is no less than 4KB, driver will set the packet buffer size as 4KB to support TCP_ZEROCOPY_RECEIVE. Otherwise the driver will use the default 2KB as the packet buffer size. `ethtool -G <dev> tcp-data-split on/off` is the command to toggle header split. `ethtool -g <dev>` will show the status of header split with the field of `tcp-data-split`. Co-developed-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04gve: Add header split device optionJeroen de Borst
To enable header split via ethtool, we first need to query the device to get the max rx buffer size and header buffer size. Add a device option to get these values and store them in the driver. If the header buffer size received from the device is non-zero, it means header split is supported in the device. Currently the max rx buffer size will only be used when header split is enabled which will set the data_buffer_size_dqo to be the max rx buffer size. Also change the data_buffer_size_dqo from int to u16 since we are modifying it and making it to be consistent with max_rx_buffer_size. Co-developed-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04ionic: change MODULE_AUTHOR to person nameShannon Nelson
The MODULE_AUTHOR macro is supposed to be a person not a company. Reviewed-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04ionic: Clean RCT ordering issuesBrett Creeley
Clean up complaints from an xmastree.py scan. Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04ionic: Use CQE profile for dimBrett Creeley
Use the kernel's CQE dim table to align better with the driver's use of completion queues, and use the tx moderation when using Tx interrupts. Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04ionic: change the hwstamp likely checkBrett Creeley
An earlier change moved the hwstamp queue check into a helper function with an unlikely(). However, it makes more sense for the caller to decide if it's likely() or unlikely(), so make the change to support that. Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04ionic: reduce the use of netdevShannon Nelson
To help make sure we're only accessing things we really need to access we can cut down on the q->lif->netdev references by using q->dev which is already in cache. Reviewed-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04ionic: Pass local netdev instead of referencing structBrett Creeley
Instead of using q->lif->netdev, just pass the netdev when it's locally defined. Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04ionic: Check stop no restartBrett Creeley
If there is a lot of transmit traffic the driver can get into a situation that the device is starved due to the doorbell never being rung. This can happen if xmit_more is set constantly and __netdev_tx_sent_queue() keeps returning false. Fix this by checking if the queue needs to be stopped right before calling __netdev_tx_sent_queue(). Use MAX_SKB_FRAGS + 1 as the stop condition because that's the maximum number of frags supported for non-TSO transmit. Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-04ionic: Clean up BQL logicBrett Creeley
The driver currently calls netdev_tx_completed_queue() for every Tx completion. However, this API is only meant to be called once per NAPI if any Tx work is done. Make the necessary changes to support calling netdev_tx_completed_queue() only once per NAPI. Also, use the __netdev_tx_sent_queue() API, which supports the xmit_more functionality. Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Signed-off-by: David S. Miller <davem@davemloft.net>