summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel/iavf/iavf.h
AgeCommit message (Collapse)Author
2024-04-24iavf: kill "legacy-rx" for goodAlexander Lobakin
Ever since build_skb() became stable, the old way with allocating an skb for storing the headers separately, which will be then copied manually, was slower, less flexible, and thus obsolete. * It had higher pressure on MM since it actually allocates new pages, which then get split and refcount-biased (NAPI page cache); * It implies memcpy() of packet headers (40+ bytes per each frame); * the actual header length was calculated via eth_get_headlen(), which invokes Flow Dissector and thus wastes a bunch of CPU cycles; * XDP makes it even more weird since it requires headroom for long and also tailroom for some time (since mbuf landed). Take a look at the ice driver, which is built around work-arounds to make XDP work with it. Even on some quite low-end hardware (not a common case for 100G NICs) it was performing worse. The only advantage "legacy-rx" had is that it didn't require any reserved headroom and tailroom. But iavf didn't use this, as it always splits pages into two halves of 2k, while that save would only be useful when striding. And again, XDP effectively removes that sole pro. There's a train of features to land in IAVF soon: Page Pool, XDP, XSk, multi-buffer etc. Each new would require adding more and more Danse Macabre for absolutely no reason, besides making hotpath less and less effective. Remove the "feature" with all the related code. This includes at least one very hot branch (typically hit on each new frame), which was either always-true or always-false at least for a complete NAPI bulk of 64 frames, the whole private flags cruft, and so on. Some stats: Function: add/remove: 0/4 grow/shrink: 0/7 up/down: 0/-721 (-721) RO Data: add/remove: 0/1 grow/shrink: 0/0 up/down: 0/-40 (-40) Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-12-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/intel/iavf/iavf_ethtool.c 3a0b5a2929fd ("iavf: Introduce new state machines for flow director") 95260816b489 ("iavf: use iavf_schedule_aq_request() helper") https://lore.kernel.org/all/84e12519-04dc-bd80-bc34-8cf50d7898ce@intel.com/ drivers/net/ethernet/broadcom/bnxt/bnxt.c c13e268c0768 ("bnxt_en: Fix HWTSTAMP_FILTER_ALL packet timestamp logic") c2f8063309da ("bnxt_en: Refactor RX VLAN acceleration logic.") a7445d69809f ("bnxt_en: Add support for new RX and TPA_START completion types for P7") 1c7fd6ee2fe4 ("bnxt_en: Rename some macros for the P5 chips") https://lore.kernel.org/all/20231211110022.27926ad9@canb.auug.org.au/ drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c bd6781c18cb5 ("bnxt_en: Fix wrong return value check in bnxt_close_nic()") 84793a499578 ("bnxt_en: Skip nic close/open when configuring tstamp filters") https://lore.kernel.org/all/20231214113041.3a0c003c@canb.auug.org.au/ drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c 3d7a3f2612d7 ("net/mlx5: Nack sync reset request when HotPlug is enabled") cecf44ea1a1f ("net/mlx5: Allow sync reset flow when BF MGT interface device is present") https://lore.kernel.org/all/20231211110328.76c925af@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-13iavf: enable symmetric-xor RSS for Toeplitz hash functionAhmed Zaki
Allow the user to set the symmetric Toeplitz hash function via: # ethtool -X eth0 hfunc toeplitz symmetric-xor The driver will reject any new RSS configuration if a field other than (IP src/dst and L4 src/dst ports) is requested for hashing. The symmetric RSS will not be supported on PFs not advertising the ADV RSS Offload flag (ADV_RSS_SUPPORT()), for example the E700 series (i40e). Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Link: https://lore.kernel.org/r/20231213003321.605376-9-ahmed.zaki@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-12-12iavf: Introduce new state machines for flow directorPiotr Gardocki
New states introduced: IAVF_FDIR_FLTR_DIS_REQUEST IAVF_FDIR_FLTR_DIS_PENDING IAVF_FDIR_FLTR_INACTIVE Current FDIR state machines (SM) are not adequate to handle a few scenarios in the link DOWN/UP event, reset event and ntuple-feature. For example, when VF link goes DOWN and comes back UP administratively, the expectation is that previously installed filters should also be restored. But with current SM, filters are not restored. So with new SM, during link DOWN filters are marked as INACTIVE in the iavf list but removed from PF. After link UP, SM will transition from INACTIVE to ADD_REQUEST to restore the filter. Similarly, with VF reset, filters will be removed from the PF, but marked as INACTIVE in the iavf list. Filters will be restored after reset completion. Steps to reproduce: ------------------- 1. Create a VF. Here VF is enp8s0. 2. Assign IP addresses to VF and link partner and ping continuously from remote. Here remote IP is 1.1.1.1. 3. Check default RX Queue of traffic. ethtool -S enp8s0 | grep -E "rx-[[:digit:]]+\.packets" 4. Add filter - change default RX Queue (to 15 here) ethtool -U ens8s0 flow-type ip4 src-ip 1.1.1.1 action 15 loc 5 5. Ensure filter gets added and traffic is received on RX queue 15 now. Link event testing: ------------------- 6. Bring VF link down and up. If traffic flows to configured queue 15, test is success, otherwise it is a failure. Reset event testing: -------------------- 7. Reset the VF. If traffic flows to configured queue 15, test is success, otherwise it is a failure. Fixes: 0dbfbabb840d ("iavf: Add framework to enable ethtool ntuple filters") Signed-off-by: Piotr Gardocki <piotrx.gardocki@intel.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Ranganatha Rao <ranganatha.rao@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-10-27iavf: delete the iavf client interfaceMichal Schmidt
The iavf client interface was added in 2017 by commit ed0e894de7c1 ("i40evf: add client interface"), but there have never been any in-tree callers. It's not useful for future development either. The Intel out-of-tree iavf and irdma drivers instead use an auxiliary bus, which is a better solution. Remove the iavf client interface code. Also gone are the client_task work and the client_lock mutex. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231027175941.1340255-9-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-27iavf: rely on netdev's own registered stateMichal Schmidt
The information whether a netdev has been registered is already present in the netdev itself. There's no need for a driver flag with the same meaning. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20231027175941.1340255-6-jacob.e.keller@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-09-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netPaolo Abeni
Cross-merge networking fixes after downstream PR. No conflicts. Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-09-15iavf: add iavf_schedule_aq_request() helperPetr Oros
Add helper for set iavf aq request AVF_FLAG_AQ_* and immediately schedule watchdog_task. Helper will be used in cases where it is necessary to run aq requests asap Signed-off-by: Petr Oros <poros@redhat.com> Co-developed-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Co-developed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-09-13iavf: Add ability to turn off CRC stripping for VFNorbert Zulinski
Previously CRC stripping was always enabled for VF. Now it is possible to turn off CRC stripping via ethtool: #ethtool -K <interface> rx-fcs on To turn off CRC stripping, first VLAN stripping must be disabled: #ethtool -K <interface> rx-vlan-offload off if any VLAN interfaces exists, otherwise VLAN stripping will be turned off by the driver. In iavf_configure_queues add check if CRC stripping is enabled for VF, if it's enabled then set crc_disabled to false on every VF's queue. In iavf_set_features add check if CRC stripping setting was changed then schedule reset. Signed-off-by: Norbert Zulinski <norbertx.zulinski@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-09-11iavf: Fix promiscuous mode configuration flow messagesBrett Creeley
Currently when configuring promiscuous mode on the AVF we detect a change in the netdev->flags. We use IFF_PROMISC and IFF_ALLMULTI to determine whether or not we need to request/release promiscuous mode and/or multicast promiscuous mode. The problem is that the AQ calls for setting/clearing promiscuous/multicast mode are treated separately. This leads to a case where we can trigger two promiscuous mode AQ calls in a row with the incorrect state. To fix this make a few changes. Use IAVF_FLAG_AQ_CONFIGURE_PROMISC_MODE instead of the previous IAVF_FLAG_AQ_[REQUEST|RELEASE]_[PROMISC|ALLMULTI] flags. In iavf_set_rx_mode() detect if there is a change in the netdev->flags in comparison with adapter->flags and set the IAVF_FLAG_AQ_CONFIGURE_PROMISC_MODE aq_required bit. Then in iavf_process_aq_command() only check for IAVF_FLAG_CONFIGURE_PROMISC_MODE and call iavf_set_promiscuous() if it's set. In iavf_set_promiscuous() check again to see which (if any) promiscuous mode bits have changed when comparing the netdev->flags with the adapter->flags. Use this to set the flags which get sent to the PF driver. Add a spinlock that is used for updating current_netdev_promisc_flags and only allows one promiscuous mode AQ at a time. [1] Fixes the fact that we will only have one AQ call in the aq_required queue at any one time. [2] Streamlines the change in promiscuous mode to only set one AQ required bit. [3] This allows us to keep track of the current state of the flags and also makes it so we can take the most recent netdev->flags promiscuous mode state. [4] This fixes the problem where a change in the netdev->flags can cause IAVF_FLAG_AQ_CONFIGURE_PROMISC_MODE to be set in iavf_set_rx_mode(), but cleared in iavf_set_promiscuous() before the change is ever made via AQ call. Fixes: 47d3483988f6 ("i40evf: Add driver support for promiscuous mode") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-16virtchnl: fix fake 1-elem arrays in structures allocated as `nents + 1`Alexander Lobakin
There are five virtchnl structures, which are allocated and checked in the code as `nents + 1`, meaning that they always have memory for one excessive element regardless of their actual number. This comes from that their sizeof() includes space for 1 element and then they get allocated via struct_size() or its open-coded equivalents, passing the actual number of elements. Expand virtchnl_struct_size() to handle such structures and replace those 1-elem arrays with proper flex ones. Also fix several places which open-code %IAVF_VIRTCHNL_VF_RESOURCE_SIZE. Finally, let the virtchnl_ether_addr_list size be computed automatically when there's no enough space for the whole list, otherwise we have to open-code reverse struct_size() logics. Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Kees Cook <keescook@chromium.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-17iavf: fix reset task race with iavf_remove()Ahmed Zaki
The reset task is currently scheduled from the watchdog or adminq tasks. First, all direct calls to schedule the reset task are replaced with the iavf_schedule_reset(), which is modified to accept the flag showing the type of reset. To prevent the reset task from starting once iavf_remove() starts, we need to check the __IAVF_IN_REMOVE_TASK bit before we schedule it. This is now easily added to iavf_schedule_reset(). Finally, remove the check for IAVF_FLAG_RESET_NEEDED in the watchdog task. It is redundant since all callers who set the flag immediately schedules the reset task. Fixes: 3ccd54ef44eb ("iavf: Fix init state closure on remove") Fixes: 14756b2ae265 ("iavf: Fix __IAVF_RESETTING state usage") Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-17iavf: fix a deadlock caused by rtnl and driver's lock circular dependenciesAhmed Zaki
A driver's lock (crit_lock) is used to serialize all the driver's tasks. Lockdep, however, shows a circular dependency between rtnl and crit_lock. This happens when an ndo that already holds the rtnl requests the driver to reset, since the reset task (in some paths) tries to grab rtnl to either change real number of queues of update netdev features. [566.241851] ====================================================== [566.241893] WARNING: possible circular locking dependency detected [566.241936] 6.2.14-100.fc36.x86_64+debug #1 Tainted: G OE [566.241984] ------------------------------------------------------ [566.242025] repro.sh/2604 is trying to acquire lock: [566.242061] ffff9280fc5ceee8 (&adapter->crit_lock){+.+.}-{3:3}, at: iavf_close+0x3c/0x240 [iavf] [566.242167] but task is already holding lock: [566.242209] ffffffff9976d350 (rtnl_mutex){+.+.}-{3:3}, at: iavf_remove+0x6b5/0x730 [iavf] [566.242300] which lock already depends on the new lock. [566.242353] the existing dependency chain (in reverse order) is: [566.242401] -> #1 (rtnl_mutex){+.+.}-{3:3}: [566.242451] __mutex_lock+0xc1/0xbb0 [566.242489] iavf_init_interrupt_scheme+0x179/0x440 [iavf] [566.242560] iavf_watchdog_task+0x80b/0x1400 [iavf] [566.242627] process_one_work+0x2b3/0x560 [566.242663] worker_thread+0x4f/0x3a0 [566.242696] kthread+0xf2/0x120 [566.242730] ret_from_fork+0x29/0x50 [566.242763] -> #0 (&adapter->crit_lock){+.+.}-{3:3}: [566.242815] __lock_acquire+0x15ff/0x22b0 [566.242869] lock_acquire+0xd2/0x2c0 [566.242901] __mutex_lock+0xc1/0xbb0 [566.242934] iavf_close+0x3c/0x240 [iavf] [566.242997] __dev_close_many+0xac/0x120 [566.243036] dev_close_many+0x8b/0x140 [566.243071] unregister_netdevice_many_notify+0x165/0x7c0 [566.243116] unregister_netdevice_queue+0xd3/0x110 [566.243157] iavf_remove+0x6c1/0x730 [iavf] [566.243217] pci_device_remove+0x33/0xa0 [566.243257] device_release_driver_internal+0x1bc/0x240 [566.243299] pci_stop_bus_device+0x6c/0x90 [566.243338] pci_stop_and_remove_bus_device+0xe/0x20 [566.243380] pci_iov_remove_virtfn+0xd1/0x130 [566.243417] sriov_disable+0x34/0xe0 [566.243448] ice_free_vfs+0x2da/0x330 [ice] [566.244383] ice_sriov_configure+0x88/0xad0 [ice] [566.245353] sriov_numvfs_store+0xde/0x1d0 [566.246156] kernfs_fop_write_iter+0x15e/0x210 [566.246921] vfs_write+0x288/0x530 [566.247671] ksys_write+0x74/0xf0 [566.248408] do_syscall_64+0x58/0x80 [566.249145] entry_SYSCALL_64_after_hwframe+0x72/0xdc [566.249886] other info that might help us debug this: [566.252014] Possible unsafe locking scenario: [566.253432] CPU0 CPU1 [566.254118] ---- ---- [566.254800] lock(rtnl_mutex); [566.255514] lock(&adapter->crit_lock); [566.256233] lock(rtnl_mutex); [566.256897] lock(&adapter->crit_lock); [566.257388] *** DEADLOCK *** The deadlock can be triggered by a script that is continuously resetting the VF adapter while doing other operations requiring RTNL, e.g: while :; do ip link set $VF up ethtool --set-channels $VF combined 2 ip link set $VF down ip link set $VF up ethtool --set-channels $VF combined 4 ip link set $VF down done Any operation that triggers a reset can substitute "ethtool --set-channles" As a fix, add a new task "finish_config" that do all the work which needs rtnl lock. With the exception of iavf_remove(), all work that require rtnl should be called from this task. As for iavf_remove(), at the point where we need to call unregister_netdevice() (and grab rtnl_lock), we make sure the finish_config task is not running (cancel_work_sync()) to safely grab rtnl. Subsequent finish_config work cannot restart after that since the task is guarded by the __IAVF_IN_REMOVE_TASK bit in iavf_schedule_finish_config(). Fixes: 5ac49f3c2702 ("iavf: use mutexes for locking of critical sections") Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-07-17iavf: Wait for reset in callbacks which trigger itMarcin Szycik
There was a fail when trying to add the interface to bonding right after changing the MTU on the interface. It was caused by bonding interface unable to open the interface due to interface being in __RESETTING state because of MTU change. Add new reset_waitqueue to indicate that reset has finished. Add waiting for reset to finish in callbacks which trigger hw reset: iavf_set_priv_flags(), iavf_change_mtu() and iavf_set_ringparam(). We use a 5000ms timeout period because on Hyper-V based systems, this operation takes around 3000-4000ms. In normal circumstances, it doesn't take more than 500ms to complete. Add a function iavf_wait_for_reset() to reuse waiting for reset code and use it also in iavf_set_channels(), which already waits for reset. We don't use error handling in iavf_set_channels() as this could cause the device to be in incorrect state if the reset was scheduled but hit timeout or the waitng function was interrupted by a signal. Fixes: 4e5e6b5d9d13 ("iavf: Fix return of set the new channel count") Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Co-developed-by: Dawid Wesierski <dawidx.wesierski@intel.com> Signed-off-by: Dawid Wesierski <dawidx.wesierski@intel.com> Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Kamil Maziarz <kamil.maziarz@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-06-22iavf: make functions static where possiblePrzemek Kitszel
Make all possible functions static. Move iavf_force_wb() up to avoid forward declaration. Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-06-10iavf: remove mask from iavf_irq_enable_queues()Ahmed Zaki
Enable more than 32 IRQs by removing the u32 bit mask in iavf_irq_enable_queues(). There is no need for the mask as there are no callers that select individual IRQs through the bitmask. Also, if the PF allocates more than 32 IRQs, this mask will prevent us from using all of them. Modify the comment in iavf_register.h to show that the maximum number allowed for the IRQ index is 63 as per the iAVF standard 1.0 [1]. link: [1] https://www.intel.com/content/dam/www/public/us/en/documents/product-specifications/ethernet-adaptive-virtual-function-hardware-spec.pdf Fixes: 5eae00c57f5e ("i40evf: main driver core") Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20230608200226.451861-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Conflicts: tools/testing/selftests/net/config 62199e3f1658 ("selftests: net: Add VXLAN MDB test") 3a0385be133e ("selftests: add the missing CONFIG_IP_SCTP in net config") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-07iavf: remove active_cvlans and active_svlans bitmapsAhmed Zaki
The VLAN filters info is currently being held in a list and 2 bitmaps (active_cvlans and active_svlans). We are experiencing some racing where data is not in sync in the list and bitmaps. For example, the VLAN is initially added to the list but only when the PF replies, it is added to the bitmap. If a user adds many V2 VLANS before the PF responds: while [ $((i++)) ] ip l add l eth0 name eth0.$i type vlan id $i we might end up with more VLAN list entries than the designated limit. Also, The "ip link show" will show more links added than the PF limit. On the other and, the bitmaps are only used to check the number of VLAN filters and to re-enable the filters when the interface goes from DOWN to UP. This patch gets rid of the bitmaps and uses the list only. To do that, the states of the VLAN filter are modified: 1 - IAVF_VLAN_REMOVE: the entry needs to be totally removed after informing the PF. This is the "ip link del eth0.$i" path. 2 - IAVF_VLAN_DISABLE: (new) the netdev went down. The filter needs to be removed from the PF and then marked INACTIVE. 3 - IAVF_VLAN_INACTIVE: (new) no PF filter exists, but the user did not delete the VLAN. Fixes: 48ccc43ecf10 ("iavf: Add support VIRTCHNL_VF_OFFLOAD_VLAN_V2 during netdev config") Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-04-07iavf: refactor VLAN filter statesAhmed Zaki
The VLAN filter states are currently being saved as individual bits. This is error prone as multiple bits might be mistakenly set. Fix by replacing the bits with a single state enum. Also, add an "ACTIVE" state for filters that are accepted by the PF. Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-03-08iavf: Remove unnecessary aer.h includeBjorn Helgaas
<linux/aer.h> is unused, so remove it. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-06net/sched: move struct tc_mqprio_qopt_offload from pkt_cls.h to pkt_sched.hVladimir Oltean
Since mqprio is a scheduler and not a classifier, move its offload structure to pkt_sched.h, where struct tc_taprio_qopt_offload also lies. Also update some header inclusions in drivers that access this structure, to the best of my abilities. Cc: Igor Russkikh <irusskikh@marvell.com> Cc: Yisen Zhuang <yisen.zhuang@huawei.com> Cc: Salil Mehta <salil.mehta@huawei.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Cc: Saeed Mahameed <saeedm@nvidia.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Horatiu Vultur <horatiu.vultur@microchip.com> Cc: Lars Povlsen <lars.povlsen@microchip.com> Cc: Steen Hegelund <Steen.Hegelund@microchip.com> Cc: Daniel Machon <daniel.machon@microchip.com> Cc: UNGLinuxDriver@microchip.com Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Conflicts: drivers/net/ethernet/intel/ice/ice_main.c 418e53401e47 ("ice: move devlink port creation/deletion") 643ef23bd9dd ("ice: Introduce local var for readability") https://lore.kernel.org/all/20230127124025.0dacef40@canb.auug.org.au/ https://lore.kernel.org/all/20230124005714.3996270-1-anthony.l.nguyen@intel.com/ drivers/net/ethernet/engleder/tsnep_main.c 3d53aaef4332 ("tsnep: Fix TX queue stop/wake for multiple queues") 25faa6a4c5ca ("tsnep: Replace TX spin_lock with __netif_tx_lock") https://lore.kernel.org/all/20230127123604.36bb3e99@canb.auug.org.au/ net/netfilter/nf_conntrack_proto_sctp.c 13bd9b31a969 ("Revert "netfilter: conntrack: add sctp DATA_SENT state"") a44b7651489f ("netfilter: conntrack: unify established states for SCTP paths") f71cb8f45d09 ("netfilter: conntrack: sctp: use nf log infrastructure for invalid packets") https://lore.kernel.org/all/20230127125052.674281f9@canb.auug.org.au/ https://lore.kernel.org/all/d36076f3-6add-a442-6d4b-ead9f7ffff86@tessares.net/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-25virtchnl: i40e/iavf: rename iwarp to rdmaJesse Brandeburg
Since the latest Intel hardware does both IWARP and ROCE, rename the term IWARP in the virtchnl header to be RDMA. Do this for both upper and lower case instances. Many of the non-virtchnl.h changes were done with regular expression replacements using perl like: perl -p -i -e 's/_IWARP/_RDMA/' <files> perl -p -i -e 's/_iwarp/_rdma/' <files> and I had to pick up a few instances manually. The virtchnl.h header has some comments and clarity added around when to use certain defines. note: had to fix a checkpatch warning for a long line by wrapping one of the lines I changed. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Jakub Andrysiak <jakub.andrysiak@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-01-20iavf: fix temporary deadlock and failure to set MAC addressMichal Schmidt
We are seeing an issue where setting the MAC address on iavf fails with EAGAIN after the 2.5s timeout expires in iavf_set_mac(). There is the following deadlock scenario: iavf_set_mac(), holding rtnl_lock, waits on: iavf_watchdog_task (within iavf_wq) to send a message to the PF, and iavf_adminq_task (within iavf_wq) to receive a response from the PF. In this adapter state (>=__IAVF_DOWN), these tasks do not need to take rtnl_lock, but iavf_wq is a global single-threaded workqueue, so they may get stuck waiting for another adapter's iavf_watchdog_task to run iavf_init_config_adapter(), which does take rtnl_lock. The deadlock resolves itself by the timeout in iavf_set_mac(), which results in EAGAIN returned to userspace. Let's break the deadlock loop by changing iavf_wq into a per-adapter workqueue, so that one adapter's tasks are not blocked by another's. Fixes: 35a2443d0910 ("iavf: Add waiting for response from PF in set mac") Co-developed-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-18iavf: remove INITIAL_MAC_SET to allow gARP to work properlyStefan Assmann
IAVF_FLAG_INITIAL_MAC_SET prevents waiting on iavf_is_mac_set_handled() the first time the MAC is set. This breaks gratuitous ARP because the MAC address has not been updated yet when the gARP packet is sent out. Current behaviour: $ echo 1 > /sys/class/net/ens4f0/device/sriov_numvfs iavf 0000:88:02.0: MAC address: ee:04:19:14:ec:ea $ ip addr add 192.168.1.1/24 dev ens4f0v0 $ ip link set dev ens4f0v0 up $ echo 1 > /proc/sys/net/ipv4/conf/ens4f0v0/arp_notify $ ip link set ens4f0v0 addr 00:11:22:33:44:55 07:23:41.676611 ee:04:19:14:ec:ea > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.1.1 tell 192.168.1.1, length 28 With IAVF_FLAG_INITIAL_MAC_SET removed: $ echo 1 > /sys/class/net/ens4f0/device/sriov_numvfs iavf 0000:88:02.0: MAC address: 3e:8a:16:a2:37:6d $ ip addr add 192.168.1.1/24 dev ens4f0v0 $ ip link set dev ens4f0v0 up $ echo 1 > /proc/sys/net/ipv4/conf/ens4f0v0/arp_notify $ ip link set ens4f0v0 addr 00:11:22:33:44:55 07:28:01.836608 00:11:22:33:44:55 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.1.1 tell 192.168.1.1, length 28 Fixes: 35a2443d0910 ("iavf: Add waiting for response from PF in set mac") Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netPaolo Abeni
Conflicts: net/ax25/af_ax25.c d7c4c9e075f8c ("ax25: fix incorrect dev_tracker usage") d62607c3fe459 ("net: rename reference+tracking helpers") drivers/net/netdevsim/fib.c 180a6a3ee60a ("netdevsim: fib: Fix reference count leak on route deletion failure") 012ec02ae441 ("netdevsim: convert driver to use unlocked devlink API during init/fini") Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-29iavf: Fix 'tc qdisc show' listing too many queuesPrzemyslaw Patynowski
Fix tc qdisc show dev <ethX> root displaying too many fq_codel qdiscs. tc_modify_qdisc, which is caller of ndo_setup_tc, expects driver to call netif_set_real_num_tx_queues, which prepares qdiscs. Without this patch, fq_codel qdiscs would not be adjusted to number of queues on VF. e.g.: tc qdisc show dev <ethX> qdisc mq 0: root qdisc fq_codel 0: parent :4 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 qdisc fq_codel 0: parent :3 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 qdisc fq_codel 0: parent :2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 qdisc fq_codel 0: parent :1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 tc qdisc add dev <ethX> root mqprio num_tc 2 map 1 0 0 0 0 0 0 0 queues 1@0 1@1 hw 1 mode channel shaper bw_rlimit max_rate 5000Mbit 150Mbit tc qdisc show dev <ethX> qdisc mqprio 8003: root tc 2 map 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 queues:(0:0) (1:1) mode:channel shaper:bw_rlimit max_rate:5Gbit 150Mbit qdisc fq_codel 0: parent 8003:4 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 qdisc fq_codel 0: parent 8003:3 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 qdisc fq_codel 0: parent 8003:2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 qdisc fq_codel 0: parent 8003:1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 While after fix: tc qdisc add dev <ethX> root mqprio num_tc 2 map 1 0 0 0 0 0 0 0 queues 1@0 1@1 hw 1 mode channel shaper bw_rlimit max_rate 5000Mbit 150Mbit tc qdisc show dev <ethX> #should show 2, shows 4 qdisc mqprio 8004: root tc 2 map 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 queues:(0:0) (1:1) mode:channel shaper:bw_rlimit max_rate:5Gbit 150Mbit qdisc fq_codel 0: parent 8004:2 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 qdisc fq_codel 0: parent 8004:1 limit 10240p flows 1024 quantum 1514 target 5ms interval 100ms memory_limit 32Mb ecn drop_batch 64 Fixes: d5b33d024496 ("i40evf: add ndo_setup_tc callback to i40evf") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Co-developed-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Co-developed-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-29iavf: Fix max_rate limitingPrzemyslaw Patynowski
Fix max_rate option in TC, check for proper quanta boundaries. Check for minimum value provided and if it fits expected 50Mbps quanta. Without this patch, iavf could send settings for max_rate limiting that would be accepted from by PF even the max_rate option is less than expected 50Mbps quanta. It results in no rate limiting on traffic as rate limiting will be floored to 0. Example: tc qdisc add dev $vf root mqprio num_tc 3 map 0 2 1 queues \ 2@0 2@2 2@4 hw 1 mode channel shaper bw_rlimit \ max_rate 50Mbps 500Mbps 500Mbps Should limit TC0 to circa 50 Mbps tc qdisc add dev $vf root mqprio num_tc 3 map 0 2 1 queues \ 2@0 2@2 2@4 hw 1 mode channel shaper bw_rlimit \ max_rate 0Mbps 100Kbit 500Mbps Should return error Fixes: d5b33d024496 ("i40evf: add ndo_setup_tc callback to i40evf") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Jun Zhang <xuejun.zhang@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18iavf: Fix missing state logsPrzemyslaw Patynowski
Fix debug prints, by adding missing state prints. Extend iavf_state_str by strings for __IAVF_INIT_EXTENDED_CAPS and __IAVF_INIT_CONFIG_ADAPTER. Without this patch, when enabling debug prints for iavf.h, user will see: iavf 0000:06:0e.0: state transition from:__IAVF_INIT_GET_RESOURCES to:__IAVF_UNKNOWN_STATE iavf 0000:06:0e.0: state transition from:__IAVF_UNKNOWN_STATE to:__IAVF_UNKNOWN_STATE Fixes: 605ca7c5c670 ("iavf: Fix kernel BUG in free_msi_irqs") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Jun Zhang <xuejun.zhang@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-18iavf: Disallow changing rx/tx-frames and rx/tx-frames-irqPrzemyslaw Patynowski
Remove from supported_coalesce_params ETHTOOL_COALESCE_MAX_FRAMES and ETHTOOL_COALESCE_MAX_FRAMES_IRQ. As tx-frames-irq allowed user to change budget for iavf_clean_tx_irq, remove work_limit and use define for budget. Without this patch there would be possibility to change rx/tx-frames and rx/tx-frames-irq, which for rx/tx-frames did nothing, while for rx/tx-frames-irq it changed rx/tx-frames and only changed budget for cleaning NAPI poll. Fixes: fbb7ddfef253 ("i40evf: core ethtool functionality") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Jun Zhang <xuejun.zhang@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-18iavf: Fix VLAN_V2 addition/rejectionPrzemyslaw Patynowski
Fix VLAN addition, so that PF driver does not reject whole VLAN batch. Add VLAN reject handling, so rejected VLANs, won't litter VLAN filter list. Fix handling of active_(c/s)vlans, so it will be possible to re-add VLAN filters for user. Without this patch, after changing trust to off, with VLAN filters saturated, no VLAN is added, due to PF rejecting addition. Fixes: 92fc50859872 ("iavf: Restrict maximum VLAN filters for VIRTCHNL_VF_OFFLOAD_VLAN_V2") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-07iavf: Add waiting for response from PF in set macMateusz Palczewski
Make iavf_set_mac synchronous by waiting for a response from a PF. Without this iavf_set_mac is always returning success even though set_mac can be rejected by a PF. This ensures that when set_mac exits netdev MAC is updated. This is needed for sending ARPs with correct MAC after changing VF's MAC. This is also needed by bonding module. Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
net/dsa/dsa2.c commit afb3cc1a397d ("net: dsa: unlock the rtnl_mutex when dsa_master_setup() fails") commit e83d56537859 ("net: dsa: replay master state events in dsa_tree_{setup,teardown}_master") https://lore.kernel.org/all/20220307101436.7ae87da0@canb.auug.org.au/ drivers/net/ethernet/intel/ice/ice.h commit 97b0129146b1 ("ice: Fix error with handling of bonding MTU") commit 43113ff73453 ("ice: add TTY for GNSS module for E810T device") https://lore.kernel.org/all/20220310112843.3233bcf1@canb.auug.org.au/ drivers/staging/gdm724x/gdm_lte.c commit fc7f750dc9d1 ("staging: gdm724x: fix use after free in gdm_lte_rx()") commit 4bcc4249b4cf ("staging: Use netif_rx().") https://lore.kernel.org/all/20220308111043.1018a59d@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-08iavf: Fix adopting new combined settingMichal Maloszewski
In some cases overloaded flag IAVF_FLAG_REINIT_ITR_NEEDED which should indicate that interrupts need to be completely reinitialized during reset leads to RTNL deadlocks using ethtool -C while a reset is in progress. To fix, it was added a new flag IAVF_FLAG_REINIT_MSIX_NEEDED used to trigger MSI-X reinit. New combined setting is fixed adopt after VF reset. This has been implemented by call reinit interrupt scheme during VF reset. Without this fix new combined setting has never been adopted. Fixes: 209f2f9c7181 ("iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 negotiation") Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Signed-off-by: Michal Maloszewski <michal.maloszewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
net/batman-adv/hard-interface.c commit 690bb6fb64f5 ("batman-adv: Request iflink once in batadv-on-batadv check") commit 6ee3c393eeb7 ("batman-adv: Demote batadv-on-batadv skip error message") https://lore.kernel.org/all/20220302163049.101957-1-sw@simonwunderlich.de/ net/smc/af_smc.c commit 4d08b7b57ece ("net/smc: Fix cleanup when register ULP fails") commit 462791bbfa35 ("net/smc: add sysctl interface for SMC") https://lore.kernel.org/all/20220302112209.355def40@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-01iavf: stop leaking iavf_status as "errno" valuesMateusz Palczewski
Several functions in the iAVF core files take status values of the enum iavf_status and convert them into integer values. This leads to confusion as functions return both Linux errno values and status codes intermixed. Reporting status codes as if they were "errno" values can lead to confusion when reviewing error logs. Additionally, it can lead to unexpected behavior if a return value is not interpreted properly. Fix this by introducing iavf_status_to_errno, a switch that explicitly converts from the status codes into an appropriate error value. Also introduce a virtchnl_status_to_errno function for the one case where we were returning both virtchnl status codes and iavf_status codes in the same function. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-03-01iavf: refactor processing of VLAN V2 capability messageMateusz Palczewski
In order to handle the capability exchange necessary for VIRTCHNL_VF_OFFLOAD_VLAN_V2, the driver must send a VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS message. This must occur prior to __IAVF_CONFIG_ADAPTER, and the driver must wait for the response from the PF. To handle this, the __IAVF_INIT_GET_OFFLOAD_VLAN_V2_CAPS state was introduced. This state is intended to process the response from the VLAN V2 caps message. This works ok, but is difficult to extend to adding more extended capability exchange. Existing (and future) AVF features are relying more and more on these sort of extended ops for processing additional capabilities. Just like VLAN V2, this exchange must happen prior to __IAVF_CONFIG_ADPATER. Since we only send one outstanding AQ message at a time during init, it is not clear where to place this state. Adding more capability specific states becomes a mess. Instead of having the "previous" state send a message and then transition into a capability-specific state, introduce __IAVF_EXTENDED_CAPS state. This state will use a list of extended_caps that determines what messages to send and receive. As long as there are extended_caps bits still set, the driver will remain in this state performing one send or one receive per state machine loop. Refactor the VLAN V2 negotiation to use this new state, and remove the capability-specific state. This makes it significantly easier to add a new similar capability exchange going forward. Extended capabilities are processed by having an associated SEND and RECV extended capability bit. During __IAVF_EXTENDED_CAPS, the driver checks these bits in order by feature, first the send bit for a feature, then the recv bit for a feature. Each send flag will call a function that sends the necessary response, while each receive flag will wait for the response from the PF. If a given feature can't be negotiated with the PF, the associated flags will be cleared in order to skip processing of that feature. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-25iavf: Fix locking for VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPSSlawomir Laba
iavf_virtchnl_completion is called under crit_lock but when the code for VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS is called, this lock is released in order to obtain rtnl_lock to avoid ABBA deadlock with unregister_netdev. Along with the new way iavf_remove behaves, there exist many risks related to the lock release and attmepts to regrab it. The driver faces crashes related to races between unregister_netdev and netdev_update_features. Yet another risk is that the driver could already obtain the crit_lock in order to destroy it and iavf_virtchnl_completion could crash or block forever. Make iavf_virtchnl_completion never relock crit_lock in it's call paths. Extract rtnl_lock locking logic to the driver for unregister_netdev in order to set the netdev_registered flag inside the lock. Introduce a new flag that will inform adminq_task to perform the code from VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS right after it finishes processing messages. Guard this code with remove flags so it's never called when the driver is in remove state. Fixes: 5951a2b9812d ("iavf: Fix VLAN feature flags after VFR") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-25iavf: Fix init state closure on removeSlawomir Laba
When init states of the adapter work, the errors like lack of communication with the PF might hop in. If such events occur the driver restores previous states in order to retry initialization in a proper way. When remove task kicks in, this situation could lead to races with unregistering the netdevice as well as resources cleanup. With the commit introducing the waiting in remove for init to complete, this problem turns into an endless waiting if init never recovers from errors. Introduce __IAVF_IN_REMOVE_TASK bit to indicate that the remove thread has started. Make __IAVF_COMM_FAILED adapter state respect the __IAVF_IN_REMOVE_TASK bit and set the __IAVF_INIT_FAILED state and return without any action instead of trying to recover. Make __IAVF_INIT_FAILED adapter state respect the __IAVF_IN_REMOVE_TASK bit and return without any further actions. Make the loop in the remove handler break when adapter has __IAVF_INIT_FAILED state set. Fixes: 898ef1cb1cb2 ("iavf: Combine init and watchdog state machines") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-25iavf: Rework mutexes for better synchronisationSlawomir Laba
The driver used to crash in multiple spots when put to stress testing of the init, reset and remove paths. The user would experience call traces or hangs when creating, resetting, removing VFs. Depending on the machines, the call traces are happening in random spots, like reset restoring resources racing with driver remove. Make adapter->crit_lock mutex a mandatory lock for guarding the operations performed on all workqueues and functions dealing with resource allocation and disposal. Make __IAVF_REMOVE a final state of the driver respected by workqueues that shall not requeue, when they fail to obtain the crit_lock. Make the IRQ handler not to queue the new work for adminq_task when the __IAVF_REMOVE state is set. Fixes: 5ac49f3c2702 ("iavf: use mutexes for locking of critical sections") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Phani Burra <phani.r.burra@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 offload enable/disableBrett Creeley
The new VIRTCHNL_VF_OFFLOAD_VLAN_V2 capability added support that allows the VF to support 802.1Q and 802.1ad VLAN insertion and stripping if successfully negotiated via VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS. Multiple changes were needed to support this new functionality. 1. Added new aq_required flags to support any kind of VLAN stripping and insertion offload requests via virtchnl. 2. Added the new method iavf_set_vlan_offload_features() that's used during VF initialization, VF reset, and iavf_set_features() to set the aq_required bits based on the current VLAN offload configuration of the VF's netdev. 3. Added virtchnl handling for VIRTCHNL_OP_ENABLE_STRIPPING_V2, VIRTCHNL_OP_DISABLE_STRIPPING_V2, VIRTCHNL_OP_ENABLE_INSERTION_V2, and VIRTCHNL_OP_ENABLE_INSERTION_V2. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 hotpathBrett Creeley
The new VIRTCHNL_VF_OFFLOAD_VLAN_V2 capability added support that allows the PF to set the location of the Tx and Rx VLAN tag for insertion and stripping offloads. In order to support this functionality a few changes are needed. 1. Add a new method to cache the VLAN tag location based on negotiated capabilities for the Tx and Rx ring flags. This needs to be called in the initialization and reset paths. 2. Refactor the transmit hotpath to account for the new Tx ring flags. When IAVF_TXR_FLAGS_VLAN_LOC_L2TAG2 is set, then the driver needs to insert the VLAN tag in the L2TAG2 field of the transmit descriptor. When the IAVF_TXRX_FLAGS_VLAN_LOC_L2TAG1 is set, then the driver needs to use the l2tag1 field of the data descriptor (same behavior as before). 3. Refactor the iavf_tx_prepare_vlan_flags() function to simplify transmit hardware VLAN offload functionality by only depending on the skb_vlan_tag_present() function. This can be done because the OS won't request transmit offload for a VLAN unless the driver told the OS it's supported and enabled. 4. Refactor the receive hotpath to account for the new Rx ring flags and VLAN ethertypes. This requires checking the Rx ring flags and descriptor status bits to determine the location of the VLAN tag. Also, since only a single ethertype can be supported at a time, check the enabled netdev features before specifying a VLAN ethertype in __vlan_hwaccel_put_tag(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17iavf: Add support VIRTCHNL_VF_OFFLOAD_VLAN_V2 during netdev configBrett Creeley
Based on VIRTCHNL_VF_OFFLOAD_VLAN_V2, the VF can now support more VLAN capabilities (i.e. 802.1AD offloads and filtering). In order to communicate these capabilities to the netdev layer, the VF needs to parse its VLAN capabilities based on whether it was able to negotiation VIRTCHNL_VF_OFFLOAD_VLAN or VIRTCHNL_VF_OFFLOAD_VLAN_V2 or neither of these. In order to support this, add the following functionality: iavf_get_netdev_vlan_hw_features() - This is used to determine the VLAN features that the underlying hardware supports and that can be toggled off/on based on the negotiated capabiltiies. For example, if VIRTCHNL_VF_OFFLOAD_VLAN_V2 was negotiated, then any capability marked with VIRTCHNL_VLAN_TOGGLE can be toggled on/off by the VF. If VIRTCHNL_VF_OFFLOAD_VLAN was negotiated, then only VLAN insertion and/or stripping can be toggled on/off. iavf_get_netdev_vlan_features() - This is used to determine the VLAN features that the underlying hardware supports and that should be enabled by default. For example, if VIRTHCNL_VF_OFFLOAD_VLAN_V2 was negotiated, then any supported capability that has its ethertype_init filed set should be enabled by default. If VIRTCHNL_VF_OFFLOAD_VLAN was negotiated, then filtering, stripping, and insertion should be enabled by default. Also, refactor iavf_fix_features() to take into account the new capabilities. To do this, query all the supported features (enabled by default and toggleable) and make sure the requested change is supported. If VIRTCHNL_VF_OFFLOAD_VLAN_V2 is successfully negotiated, there is no need to check VIRTCHNL_VLAN_TOGGLE here because the driver already told the netdev layer which features can be toggled via netdev->hw_features during iavf_process_config(), so only those features will be requested to change. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-17iavf: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2 negotiationBrett Creeley
In order to support the new VIRTCHNL_VF_OFFLOAD_VLAN_V2 capability the VF driver needs to rework it's initialization state machine and reset flow. This has to be done because successful negotiation of VIRTCHNL_VF_OFFLOAD_VLAN_V2 requires the VF driver to perform a second capability request via VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS before configuring the adapter and its netdev. Add the VIRTCHNL_VF_OFFLOAD_VLAN_V2 bit when sending the VIRTHCNL_OP_GET_VF_RESOURECES message. The underlying PF will either support VIRTCHNL_VF_OFFLOAD_VLAN or VIRTCHNL_VF_OFFLOAD_VLAN_V2 or neither. Both of these offloads should never be supported together. Based on this, add 2 new states to the initialization state machine: __IAVF_INIT_GET_OFFLOAD_VLAN_V2_CAPS __IAVF_INIT_CONFIG_ADAPTER The __IAVF_INIT_GET_OFFLOAD_VLAN_V2_CAPS state is used to request/store the new VLAN capabilities if and only if VIRTCHNL_VLAN_OFFLOAD_VLAN_V2 was successfully negotiated in the __IAVF_INIT_GET_RESOURCES state. The __IAVF_INIT_CONFIG_ADAPTER state is used to configure the adapter/netdev after the resource requests have finished. The VF will move into this state regardless of whether it successfully negotiated VIRTCHNL_VF_OFFLOAD_VLAN or VIRTCHNL_VF_OFFLOAD_VLAN_V2. Also, add a the new flag IAVF_FLAG_AQ_GET_OFFLOAD_VLAN_V2_CAPS and set it during VF reset. If VIRTCHNL_VF_OFFLOAD_VLAN_V2 was successfully negotiated then the VF will request its VLAN capabilities via VIRTCHNL_OP_GET_OFFLOAD_VLAN_V2_CAPS during the reset. This is needed because the PF may change/modify the VF's configuration during VF reset (i.e. modifying the VF's port VLAN configuration). This also, required the VF to call netdev_update_features() since its VLAN features may change during VF reset. Make sure to call this under rtnl_lock(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-30iavf: Refactor iavf_mac_filter struct memory usageJedrzej Jagielski
iavf_mac_filter struct contained couple boolean flags using up more memory than is necessary. Change the flags to be bitfields in an anonymous struct so all the flags now fit in one byte. Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-19iavf: Fix VLAN feature flags after VFRBrett Creeley
When a VF goes through a reset, it's possible for the VF's feature set to change. For example it may lose the VIRTCHNL_VF_OFFLOAD_VLAN capability after VF reset. Unfortunately, the driver doesn't correctly deal with this situation and errors are seen from downing/upping the interface and/or moving the interface in/out of a network namespace. When setting the interface down/up we see the following errors after the VIRTCHNL_VF_OFFLOAD_VLAN capability was taken away from the VF: ice 0000:51:00.1: VF 1 failed opcode 12, retval: -64 iavf 0000:51:09.1: Failed to add VLAN filter, error IAVF_NOT_SUPPORTED ice 0000:51:00.1: VF 1 failed opcode 13, retval: -64 iavf 0000:51:09.1: Failed to delete VLAN filter, error IAVF_NOT_SUPPORTED These add/delete errors are happening because the VLAN filters are tracked internally to the driver and regardless of the VLAN_ALLOWED() setting the driver tries to delete/re-add them over virtchnl. Fix the delete failure by making sure to delete any VLAN filter tracking in the driver when a removal request is made, while preventing the virtchnl request. This makes it so the driver's VLAN list is up to date and the errors are Fix the add failure by making sure the check for VLAN_ALLOWED() during reset is done after the VF receives its capability list from the PF via VIRTCHNL_OP_GET_VF_RESOURCES. If VLAN functionality is not allowed, then prevent requesting re-adding the filters over virtchnl. When moving the interface into a network namespace we see the following errors after the VIRTCHNL_VF_OFFLOAD_VLAN capability was taken away from the VF: iavf 0000:51:09.1 enp81s0f1v1: NIC Link is Up Speed is 25 Gbps Full Duplex iavf 0000:51:09.1 temp_27: renamed from enp81s0f1v1 iavf 0000:51:09.1 mgmt: renamed from temp_27 iavf 0000:51:09.1 dev27: set_features() failed (-22); wanted 0x020190001fd54833, left 0x020190001fd54bb3 These errors are happening because we aren't correctly updating the netdev capabilities and dealing with ndo_fix_features() and ndo_set_features() correctly. Fix this by only reporting errors in the driver's ndo_set_features() callback when VIRTCHNL_VF_OFFLOAD_VLAN is not allowed and any attempt to enable the VLAN features is made. Also, make sure to disable VLAN insertion, filtering, and stripping since the VIRTCHNL_VF_OFFLOAD_VLAN flag applies to all of them and not just VLAN stripping. Also, after we process the capabilities in the VF reset path, make sure to call netdev_update_features() in case the capabilities have changed in order to update the netdev's feature set to match the VF's actual capabilities. Lastly, make sure to always report success on VLAN filter delete when VIRTCHNL_VF_OFFLOAD_VLAN is not supported. The changed flow in iavf_del_vlans() allows the stack to delete previosly existing VLAN filters even if VLAN filtering is not allowed. This makes it so the VLAN filter list is up to date. Fixes: 8774370d268f ("i40e/i40evf: support for VF VLAN tag stripping control") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-19iavf: Fix refreshing iavf adapter stats on ethtool requestJedrzej Jagielski
Currently iavf adapter statistics are refreshed only in a watchdog task, triggered approximately every two seconds, which causes some ethtool requests to return outdated values. Add explicit statistics refresh when requested by ethtool -S. Fixes: b476b0030e61 ("iavf: Move commands processing to the separate function") Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-11-15iavf: Restore VLAN filters after link downAkeem G Abodunrin
Restore VLAN filters after the link is brought down, and up - since all filters are deleted from HW during the netdev link down routine. Fixes: ed1f5b58ea01 ("i40evf: remove VLAN filters on close") Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-29iavf: Fix kernel BUG in free_msi_irqsPrzemyslaw Patynowski
Fix driver not freeing VF's traffic irqs, prior to calling pci_disable_msix in iavf_remove. There were possible 2 erroneous states in which, iavf_close would not be called. One erroneous state is fixed by allowing netdev to register, when state is already running. It was possible for VF adapter to enter state loop from running to resetting, where iavf_open would subsequently fail. If user would then unload driver/remove VF pci, iavf_close would not be called, as the netdev was not registered, leaving traffic pcis still allocated. Fixed this by breaking loop, allowing netdev to open device when adapter state is __IAVF_RUNNING and it is not explicitily downed. Other possiblity is entering to iavf_remove from __IAVF_RESETTING state, where iavf_close would not free irqs, but just return 0. Fixed this by checking for last adapter state and then removing irqs. Kernel panic: [ 2773.628585] kernel BUG at drivers/pci/msi.c:375! ... [ 2773.631567] RIP: 0010:free_msi_irqs+0x180/0x1b0 ... [ 2773.640939] Call Trace: [ 2773.641572] pci_disable_msix+0xf7/0x120 [ 2773.642224] iavf_reset_interrupt_capability.part.41+0x15/0x30 [iavf] [ 2773.642897] iavf_remove+0x12e/0x500 [iavf] [ 2773.643578] pci_device_remove+0x3b/0xc0 [ 2773.644266] device_release_driver_internal+0x103/0x1f0 [ 2773.644948] pci_stop_bus_device+0x69/0x90 [ 2773.645576] pci_stop_and_remove_bus_device+0xe/0x20 [ 2773.646215] pci_iov_remove_virtfn+0xba/0x120 [ 2773.646862] sriov_disable+0x2f/0xe0 [ 2773.647531] ice_free_vfs+0x2f8/0x350 [ice] [ 2773.648207] ice_sriov_configure+0x94/0x960 [ice] [ 2773.648883] ? _kstrtoull+0x3b/0x90 [ 2773.649560] sriov_numvfs_store+0x10a/0x190 [ 2773.650249] kernfs_fop_write+0x116/0x190 [ 2773.650948] vfs_write+0xa5/0x1a0 [ 2773.651651] ksys_write+0x4f/0xb0 [ 2773.652358] do_syscall_64+0x5b/0x1a0 [ 2773.653075] entry_SYSCALL_64_after_hwframe+0x65/0xca Fixes: 22ead37f8af8 ("i40evf: Add longer wait after remove module") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>