summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel/ice/ice_vf_lib.c
AgeCommit message (Collapse)Author
2025-04-11ice: receive LLDP on trusted VFsMateusz Pacuszka
When a trusted VF tries to configure an LLDP multicast address, configure a rule that would mirror the traffic to this VF, untrusted VFs are not allowed to receive LLDP at all, so the request to add LLDP MAC address will always fail for them. Add a forwarding LLDP filter to a trusted VF when it tries to add an LLDP multicast MAC address. The MAC address has to be added after enabling trust (through restarting the LLDP service). Signed-off-by: Mateusz Pacuszka <mateuszx.pacuszka@intel.com> Co-developed-by: Larysa Zaremba <larysa.zaremba@intel.com> Signed-off-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-02-25ice: Fix deinitializing VF in error pathMarcin Szycik
If ice_ena_vfs() fails after calling ice_create_vf_entries(), it frees all VFs without removing them from snapshot PF-VF mailbox list, leading to list corruption. Reproducer: devlink dev eswitch set $PF1_PCI mode switchdev ip l s $PF1 up ip l s $PF1 promisc on sleep 1 echo 1 > /sys/class/net/$PF1/device/sriov_numvfs sleep 1 echo 1 > /sys/class/net/$PF1/device/sriov_numvfs Trace (minimized): list_add corruption. next->prev should be prev (ffff8882e241c6f0), but was 0000000000000000. (next=ffff888455da1330). kernel BUG at lib/list_debug.c:29! RIP: 0010:__list_add_valid_or_report+0xa6/0x100 ice_mbx_init_vf_info+0xa7/0x180 [ice] ice_initialize_vf_entry+0x1fa/0x250 [ice] ice_sriov_configure+0x8d7/0x1520 [ice] ? __percpu_ref_switch_mode+0x1b1/0x5d0 ? __pfx_ice_sriov_configure+0x10/0x10 [ice] Sometimes a KASAN report can be seen instead with a similar stack trace: BUG: KASAN: use-after-free in __list_add_valid_or_report+0xf1/0x100 VFs are added to this list in ice_mbx_init_vf_info(), but only removed in ice_free_vfs(). Move the removing to ice_free_vf_entries(), which is also being called in other places where VFs are being removed (including ice_free_vfs() itself). Fixes: 8cd8a6b17d27 ("ice: move VF overflow message count into struct ice_mbx_vf_info") Reported-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Closes: https://lore.kernel.org/intel-wired-lan/PH0PR11MB50138B635F2E5CEB7075325D961F2@PH0PR11MB5013.namprd11.prod.outlook.com Reviewed-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250224190647.3601930-2-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.12-rc3). No conflicts and no adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-08ice: add E830 HW VF mailbox message limit supportPaul Greenwalt
E830 adds hardware support to prevent the VF from overflowing the PF mailbox with VIRTCHNL messages. E830 will use the hardware feature (ICE_F_MBX_LIMIT) instead of the software solution ice_is_malicious_vf(). To prevent a VF from overflowing the PF, the PF sets the number of messages per VF that can be in the PF's mailbox queue (ICE_MBX_OVERFLOW_WATERMARK). When the PF processes a message from a VF, the PF decrements the per VF message count using the E830_MBX_VF_DEC_TRIG register. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-10-08ice: Fix increasing MSI-X on VFMarcin Szycik
Increasing MSI-X value on a VF leads to invalid memory operations. This is caused by not reallocating some arrays. Reproducer: modprobe ice echo 0 > /sys/bus/pci/devices/$PF_PCI/sriov_drivers_autoprobe echo 1 > /sys/bus/pci/devices/$PF_PCI/sriov_numvfs echo 17 > /sys/bus/pci/devices/$VF0_PCI/sriov_vf_msix_count Default MSI-X is 16, so 17 and above triggers this issue. KASAN reports: BUG: KASAN: slab-out-of-bounds in ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice] Read of size 8 at addr ffff8888b937d180 by task bash/28433 (...) Call Trace: (...) ? ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice] kasan_report+0xed/0x120 ? ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice] ice_vsi_alloc_ring_stats+0x38d/0x4b0 [ice] ice_vsi_cfg_def+0x3360/0x4770 [ice] ? mutex_unlock+0x83/0xd0 ? __pfx_ice_vsi_cfg_def+0x10/0x10 [ice] ? __pfx_ice_remove_vsi_lkup_fltr+0x10/0x10 [ice] ice_vsi_cfg+0x7f/0x3b0 [ice] ice_vf_reconfig_vsi+0x114/0x210 [ice] ice_sriov_set_msix_vec_count+0x3d0/0x960 [ice] sriov_vf_msix_count_store+0x21c/0x300 (...) Allocated by task 28201: (...) ice_vsi_cfg_def+0x1c8e/0x4770 [ice] ice_vsi_cfg+0x7f/0x3b0 [ice] ice_vsi_setup+0x179/0xa30 [ice] ice_sriov_configure+0xcaa/0x1520 [ice] sriov_numvfs_store+0x212/0x390 (...) To fix it, use ice_vsi_rebuild() instead of ice_vf_reconfig_vsi(). This causes the required arrays to be reallocated taking the new queue count into account (ice_vsi_realloc_stat_arrays()). Set req_txq and req_rxq before ice_vsi_rebuild(), so that realloc uses the newly set queue count. Additionally, ice_vsi_rebuild() does not remove VSI filters (ice_fltr_remove_all()), so ice_vf_init_host_cfg() is no longer necessary. Reported-by: Jacob Keller <jacob.e.keller@intel.com> Fixes: 2a2cb4c6c181 ("ice: replace ice_vf_recreate_vsi() with ice_vf_reconfig_vsi()") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-09-30ice: clear port vlan config during resetMichal Swiatkowski
Since commit 2a2cb4c6c181 ("ice: replace ice_vf_recreate_vsi() with ice_vf_reconfig_vsi()") VF VSI is only reconfigured instead of recreated. The context configuration from previous setting is still the same. If any of the config needs to be cleared it needs to be cleared explicitly. Previously there was assumption that port vlan will be cleared automatically. Now, when VSI is only reconfigured we have to do it in the code. Not clearing port vlan configuration leads to situation when the driver VSI config is different than the VSI config in HW. Traffic can't be passed after setting and clearing port vlan, because of invalid VSI config in HW. Example reproduction: > ip a a dev $(VF) $(VF_IP_ADDRESS) > ip l s dev $(VF) up > ping $(VF_IP_ADDRESS) ping is working fine here > ip link set eth5 vf 0 vlan 100 > ip link set eth5 vf 0 vlan 0 > ping $(VF_IP_ADDRESS) ping isn't working Fixes: 2a2cb4c6c181 ("ice: replace ice_vf_recreate_vsi() with ice_vf_reconfig_vsi()") Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Piotr Tyda <piotr.tyda@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-09-06ice: make representor code genericMichal Swiatkowski
Keep the same flow of port representor creation, but instead of general attach function create helpers for specific representor type. Store function pointer for add and remove representor. Type of port representor can be also known based on VSI type, but it is more clean to have it directly saved in port representor structure. Add devlink lock for whole port representor creation and destruction. Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-06-21ice: update representor when VSI is readyMichal Swiatkowski
In case of reset of VF VSI can be reallocated. To handle this case it should be properly updated. Reload representor as vsi->vsi_num can be different than the one stored when representor was created. Instead of only changing antispoof do whole VSI configuration for eswitch. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-05-06ice: refactor struct ice_vsi_cfg_params to be inside of struct ice_vsiMateusz Polchlopek
Refactor struct ice_vsi_cfg_params to be embedded into struct ice_vsi. Prior to that the members of the struct were scattered around ice_vsi, and were copy-pasted for purposes of reinit. Now we have struct handy, and it is easier to have something sticky in the flags field. Suggested-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Vaishnavi Tipireddy <vaishnavi.tipireddy@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-04-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/ti/icssg/icssg_prueth.c net/mac80211/chan.c 89884459a0b9 ("wifi: mac80211: fix idle calculation with multi-link") 87f5500285fb ("wifi: mac80211: simplify ieee80211_assign_link_chanctx()") https://lore.kernel.org/all/20240422105623.7b1fbda2@canb.auug.org.au/ net/unix/garbage.c 1971d13ffa84 ("af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc().") 4090fa373f0e ("af_unix: Replace garbage collection algorithm.") drivers/net/ethernet/ti/icssg/icssg_prueth.c drivers/net/ethernet/ti/icssg/icssg_common.c 4dcd0e83ea1d ("net: ti: icssg-prueth: Fix signedness bug in prueth_init_rx_chns()") e2dc7bfd677f ("net: ti: icssg-prueth: Move common functions into a separate file") No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-25ice: fix LAG and VF lock dependency in ice_reset_vf()Jacob Keller
9f74a3dfcf83 ("ice: Fix VF Reset paths when interface in a failed over aggregate"), the ice driver has acquired the LAG mutex in ice_reset_vf(). The commit placed this lock acquisition just prior to the acquisition of the VF configuration lock. If ice_reset_vf() acquires the configuration lock via the ICE_VF_RESET_LOCK flag, this could deadlock with ice_vc_cfg_qs_msg() because it always acquires the locks in the order of the VF configuration lock and then the LAG mutex. Lockdep reports this violation almost immediately on creating and then removing 2 VF: ====================================================== WARNING: possible circular locking dependency detected 6.8.0-rc6 #54 Tainted: G W O ------------------------------------------------------ kworker/60:3/6771 is trying to acquire lock: ff40d43e099380a0 (&vf->cfg_lock){+.+.}-{3:3}, at: ice_reset_vf+0x22f/0x4d0 [ice] but task is already holding lock: ff40d43ea1961210 (&pf->lag_mutex){+.+.}-{3:3}, at: ice_reset_vf+0xb7/0x4d0 [ice] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&pf->lag_mutex){+.+.}-{3:3}: __lock_acquire+0x4f8/0xb40 lock_acquire+0xd4/0x2d0 __mutex_lock+0x9b/0xbf0 ice_vc_cfg_qs_msg+0x45/0x690 [ice] ice_vc_process_vf_msg+0x4f5/0x870 [ice] __ice_clean_ctrlq+0x2b5/0x600 [ice] ice_service_task+0x2c9/0x480 [ice] process_one_work+0x1e9/0x4d0 worker_thread+0x1e1/0x3d0 kthread+0x104/0x140 ret_from_fork+0x31/0x50 ret_from_fork_asm+0x1b/0x30 -> #0 (&vf->cfg_lock){+.+.}-{3:3}: check_prev_add+0xe2/0xc50 validate_chain+0x558/0x800 __lock_acquire+0x4f8/0xb40 lock_acquire+0xd4/0x2d0 __mutex_lock+0x9b/0xbf0 ice_reset_vf+0x22f/0x4d0 [ice] ice_process_vflr_event+0x98/0xd0 [ice] ice_service_task+0x1cc/0x480 [ice] process_one_work+0x1e9/0x4d0 worker_thread+0x1e1/0x3d0 kthread+0x104/0x140 ret_from_fork+0x31/0x50 ret_from_fork_asm+0x1b/0x30 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&pf->lag_mutex); lock(&vf->cfg_lock); lock(&pf->lag_mutex); lock(&vf->cfg_lock); *** DEADLOCK *** 4 locks held by kworker/60:3/6771: #0: ff40d43e05428b38 ((wq_completion)ice){+.+.}-{0:0}, at: process_one_work+0x176/0x4d0 #1: ff50d06e05197e58 ((work_completion)(&pf->serv_task)){+.+.}-{0:0}, at: process_one_work+0x176/0x4d0 #2: ff40d43ea1960e50 (&pf->vfs.table_lock){+.+.}-{3:3}, at: ice_process_vflr_event+0x48/0xd0 [ice] #3: ff40d43ea1961210 (&pf->lag_mutex){+.+.}-{3:3}, at: ice_reset_vf+0xb7/0x4d0 [ice] stack backtrace: CPU: 60 PID: 6771 Comm: kworker/60:3 Tainted: G W O 6.8.0-rc6 #54 Hardware name: Workqueue: ice ice_service_task [ice] Call Trace: <TASK> dump_stack_lvl+0x4a/0x80 check_noncircular+0x12d/0x150 check_prev_add+0xe2/0xc50 ? save_trace+0x59/0x230 ? add_chain_cache+0x109/0x450 validate_chain+0x558/0x800 __lock_acquire+0x4f8/0xb40 ? lockdep_hardirqs_on+0x7d/0x100 lock_acquire+0xd4/0x2d0 ? ice_reset_vf+0x22f/0x4d0 [ice] ? lock_is_held_type+0xc7/0x120 __mutex_lock+0x9b/0xbf0 ? ice_reset_vf+0x22f/0x4d0 [ice] ? ice_reset_vf+0x22f/0x4d0 [ice] ? rcu_is_watching+0x11/0x50 ? ice_reset_vf+0x22f/0x4d0 [ice] ice_reset_vf+0x22f/0x4d0 [ice] ? process_one_work+0x176/0x4d0 ice_process_vflr_event+0x98/0xd0 [ice] ice_service_task+0x1cc/0x480 [ice] process_one_work+0x1e9/0x4d0 worker_thread+0x1e1/0x3d0 ? __pfx_worker_thread+0x10/0x10 kthread+0x104/0x140 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x31/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> To avoid deadlock, we must acquire the LAG mutex only after acquiring the VF configuration lock. Fix the ice_reset_vf() to acquire the LAG mutex only after we either acquire or check that the VF configuration lock is held. Fixes: 9f74a3dfcf83 ("ice: Fix VF Reset paths when interface in a failed over aggregate") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Dave Ertman <david.m.ertman@intel.com> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Tested-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240423182723.740401-5-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-12ice: set vf->num_msix in ice_initialize_vf_entry()Jacob Keller
Commit fe1c5ca2fe76 ("ice: implement num_msix field per VF") updated the driver to allow for per-VF MSI-X configuration. The initial defaults were set in ice_create_vf_entries(). This logic is better placed in ice_initialize_vf_entry(). Indeed, that function already sets vf->num_vf_qs, as well as initializes the allow list via calling ice_vc_set_default_allowlist(). Move this logic into ice_initialize_vf_entry(). This makes the code clear, and ensures that these VF fields will be initialized properly for both SR-IOV VFs and the upcoming Scalable IOV VFs. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-03-04ice: remove vf->lan_vsi_num fieldJacob Keller
The lan_vsi_num field of the VF structure is no longer used for any purpose. Remove it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-01-02ice: replace ice_vf_recreate_vsi() with ice_vf_reconfig_vsi()Jacob Keller
The ice_vf_create_vsi() function and its VF ops helper introduced by commit a4c785e8162e ("ice: convert vf_ops .vsi_rebuild to .create_vsi") are used during an individual VF reset to re-create the VSI. This was done in order to ensure that the VSI gets properly reconfigured within the hardware. This is somewhat heavy handed as we completely release the VSI memory and structure, and then create a new VSI. This can also potentially force a change of the VSI index as we will re-use the first open slot in the VSI array which may not be the same. As part of implementing devlink reload, commit 6624e780a577 ("ice: split ice_vsi_setup into smaller functions") split VSI setup into smaller functions, introducing both ice_vsi_cfg() and ice_vsi_decfg() which can be used to configure or deconfigure an existing software VSI structure. Rather than completely removing the VSI and adding a new one using the .create_vsi() VF operation, simply use ice_vsi_decfg() to remove the current configuration. Save the VSI type and then call ice_vsi_cfg() to reconfigure the VSI as the same type that it was before. The existing reset logic assumes that all hardware filters will be removed, so also call ice_fltr_remove_all() before re-configuring the VSI. This new operation does not re-create the VSI, so rename it to ice_vf_reconfig_vsi(). The new approach can safely share the exact same flow for both SR-IOV VFs as well as the Scalable IOV VFs being worked on. This uses less code and is a better abstraction over fully deleting the VSI and adding a new one. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Petr Oros <poros@redhat.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-11-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-29ice: Fix VF Reset paths when interface in a failed over aggregateDave Ertman
There is an error when an interface has the following conditions: - PF is in an aggregate (bond) - PF has VFs created on it - bond is in a state where it is failed-over to the secondary interface - A VF reset is issued on one or more of those VFs The issue is generated by the originating PF trying to rebuild or reconfigure the VF resources. Since the bond is failed over to the secondary interface the queue contexts are in a modified state. To fix this issue, have the originating interface reclaim its resources prior to the tear-down and rebuild or reconfigure. Then after the process is complete, move the resources back to the currently active interface. There are multiple paths that can be used depending on what triggered the event, so create a helper function to move the queues and use paired calls to the helper (back to origin, process, then move back to active interface) under the same lag_mutex lock. Fixes: 1e0f9881ef79 ("ice: Flesh out implementation of support for SRIOV on bonded interface") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://lore.kernel.org/r/20231127212340.1137657-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-11-13ice: adjust switchdev rebuild pathMichal Swiatkowski
There is no need to use specific functions for rebuilding path. Let's use current implementation by removing all representors and as the result remove switchdev environment. It will be added in devices rebuild path. For example during adding VFs, port representors for them also will be created. Rebuild control plane VSI before removing representors with INIT_VSI flag set to reinit VSI in hardware after reset. Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-11-13ice: make representor code genericMichal Swiatkowski
Representor code needs to be independent from specific device type, like in this case VF. Make generic add / remove representor function and specific add VF / rem VF function. New device types will follow this scheme. In bridge offload code there is a need to get representor pointer based on VSI. Implement helper function to achieve that. Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-11-13ice: remove VF pointer reference in eswitch codeMichal Swiatkowski
Make eswitch code generic by removing VF pointer reference in functions. It is needed to support eswitch mode for other type of devices. Previously queue id used for Rx was based on VF number. Use ::q_id saved in port representor instead. After adding or removing port representor ::q_id value can change. It isn't good idea to iterate over representors list using this value. Use xa_find starting from the first one instead to get next port representor to remap. The number of port representors has to be equal to ::num_rx/tx_q. Warn if it isn't true. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-10-20ice: store VF's pci_dev ptr in ice_vfPrzemek Kitszel
Extend struct ice_vf by vfdev. Calculation of vfdev falls more nicely into ice_create_vf_entries(). Caching of vfdev enables simplification of ice_restore_all_vfs_msi_state(). Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Co-developed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-08-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR. Conflicts: include/net/inet_sock.h f866fbc842de ("ipv4: fix data-races around inet->inet_id") c274af224269 ("inet: introduce inet->inet_flags") https://lore.kernel.org/all/679ddff6-db6e-4ff6-b177-574e90d0103d@tessares.net/ Adjacent changes: drivers/net/bonding/bond_alb.c e74216b8def3 ("bonding: fix macvlan over alb bond support") f11e5bd159b0 ("bonding: support balance-alb with openvswitch") drivers/net/ethernet/broadcom/bgmac.c d6499f0b7c7c ("net: bgmac: Return PTR_ERR() for fixed_phy_register()") 23a14488ea58 ("net: bgmac: Fix return value check for fixed_phy_register()") drivers/net/ethernet/broadcom/genet/bcmmii.c 32bbe64a1386 ("net: bcmgenet: Fix return value check for fixed_phy_register()") acf50d1adbf4 ("net: bcmgenet: Return PTR_ERR() for fixed_phy_register()") net/sctp/socket.c f866fbc842de ("ipv4: fix data-races around inet->inet_id") b09bde5c3554 ("inet: move inet->mc_loop to inet->inet_frags") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-08-21ice: Fix NULL pointer deref during VF resetPetr Oros
During stress test with attaching and detaching VF from KVM and simultaneously changing VFs spoofcheck and trust there was a NULL pointer dereference in ice_reset_vf that VF's VSI is null. More than one instance of ice_reset_vf() can be running at a given time. When we rebuild the VSI in ice_reset_vf, another reset can be triaged from ice_service_task. In this case we can access the currently uninitialized VSI and cause panic. The window for this racing condition has been around for a long time but it's much worse after commit 227bf4500aaa ("ice: move VSI delete outside deconfig") because the reset runs faster. ice_reset_vf() using vf->cfg_lock and when we move this lock before accessing to the VF VSI, we can fix BUG for all cases. Panic occurs sometimes in ice_vsi_is_rx_queue_active() and sometimes in ice_vsi_stop_all_rx_rings() With our reproducer, we can hit BUG: ~8h before commit 227bf4500aaa ("ice: move VSI delete outside deconfig"). ~20m after commit 227bf4500aaa ("ice: move VSI delete outside deconfig"). After this fix we are not able to reproduce it after ~48h There was commit cf90b74341ee ("ice: Fix call trace with null VSI during VF reset") which also tried to fix this issue, but it was only partially resolved and the bug still exists. [ 6420.658415] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 6420.665382] #PF: supervisor read access in kernel mode [ 6420.670521] #PF: error_code(0x0000) - not-present page [ 6420.675659] PGD 0 [ 6420.677679] Oops: 0000 [#1] PREEMPT SMP NOPTI [ 6420.682038] CPU: 53 PID: 326472 Comm: kworker/53:0 Kdump: loaded Not tainted 5.14.0-317.el9.x86_64 #1 [ 6420.691250] Hardware name: Dell Inc. PowerEdge R750/04V528, BIOS 1.6.5 04/15/2022 [ 6420.698729] Workqueue: ice ice_service_task [ice] [ 6420.703462] RIP: 0010:ice_vsi_is_rx_queue_active+0x2d/0x60 [ice] [ 6420.705860] ice 0000:ca:00.0: VF 0 is now untrusted [ 6420.709494] Code: 00 00 66 83 bf 76 04 00 00 00 48 8b 77 10 74 3e 31 c0 eb 0f 0f b7 97 76 04 00 00 48 83 c0 01 39 c2 7e 2b 48 8b 97 68 04 00 00 <0f> b7 0c 42 48 8b 96 20 13 00 00 48 8d 94 8a 00 00 12 00 8b 12 83 [ 6420.714426] ice 0000:ca:00.0 ens7f0: Setting MAC 22:22:22:22:22:00 on VF 0. VF driver will be reinitialized [ 6420.733120] RSP: 0018:ff778d2ff383fdd8 EFLAGS: 00010246 [ 6420.733123] RAX: 0000000000000000 RBX: ff2acf1916294000 RCX: 0000000000000000 [ 6420.733125] RDX: 0000000000000000 RSI: ff2acf1f2c6401a0 RDI: ff2acf1a27301828 [ 6420.762346] RBP: ff2acf1a27301828 R08: 0000000000000010 R09: 0000000000001000 [ 6420.769476] R10: ff2acf1916286000 R11: 00000000019eba3f R12: ff2acf19066460d0 [ 6420.776611] R13: ff2acf1f2c6401a0 R14: ff2acf1f2c6401a0 R15: 00000000ffffffff [ 6420.783742] FS: 0000000000000000(0000) GS:ff2acf28ffa80000(0000) knlGS:0000000000000000 [ 6420.791829] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6420.797575] CR2: 0000000000000000 CR3: 00000016ad410003 CR4: 0000000000773ee0 [ 6420.804708] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 6420.811034] vfio-pci 0000:ca:01.0: enabling device (0000 -> 0002) [ 6420.811840] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 6420.811841] PKRU: 55555554 [ 6420.811842] Call Trace: [ 6420.811843] <TASK> [ 6420.811844] ice_reset_vf+0x9a/0x450 [ice] [ 6420.811876] ice_process_vflr_event+0x8f/0xc0 [ice] [ 6420.841343] ice_service_task+0x23b/0x600 [ice] [ 6420.845884] ? __schedule+0x212/0x550 [ 6420.849550] process_one_work+0x1e2/0x3b0 [ 6420.853563] ? rescuer_thread+0x390/0x390 [ 6420.857577] worker_thread+0x50/0x3a0 [ 6420.861242] ? rescuer_thread+0x390/0x390 [ 6420.865253] kthread+0xdd/0x100 [ 6420.868400] ? kthread_complete_and_exit+0x20/0x20 [ 6420.873194] ret_from_fork+0x1f/0x30 [ 6420.876774] </TASK> [ 6420.878967] Modules linked in: vfio_pci vfio_pci_core vfio_iommu_type1 vfio iavf vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nft_counter nf_tables bridge stp llc sctp ip6_udp_tunnel udp_tunnel nfp tls nfnetlink bluetooth mlx4_en mlx4_core rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs rfkill sunrpc intel_rapl_msr intel_rapl_common i10nm_edac nfit libnvdimm ipmi_ssif x86_pkg_temp_thermal intel_powerclamp coretemp irdma kvm_intel i40e kvm iTCO_wdt dcdbas ib_uverbs irqbypass iTCO_vendor_support mgag200 mei_me ib_core dell_smbios isst_if_mmio isst_if_mbox_pci rapl i2c_algo_bit drm_shmem_helper intel_cstate drm_kms_helper syscopyarea sysfillrect isst_if_common sysimgblt intel_uncore fb_sys_fops dell_wmi_descriptor wmi_bmof intel_vsec mei i2c_i801 acpi_ipmi ipmi_si i2c_smbus ipmi_devintf intel_pch_thermal acpi_power_meter pcspk r Fixes: efe41860008e ("ice: Fix memory corruption in VF driver") Fixes: f23df5220d2b ("ice: Fix spurious interrupt during removal of trusted VF") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-21Revert "ice: Fix ice VF reset during iavf initialization"Petr Oros
This reverts commit 7255355a0636b4eff08d5e8139c77d98f151c4fc. After this commit we are not able to attach VF to VM: virsh attach-interface v0 hostdev --managed 0000:41:01.0 --mac 52:52:52:52:52:52 error: Failed to attach interface error: Cannot set interface MAC to 52:52:52:52:52:52 for ifname enp65s0f0np0 vf 0: Resource temporarily unavailable ice_check_vf_ready_for_cfg() already contain waiting for reset. New condition in ice_check_vf_ready_for_reset() causing only problems. Fixes: 7255355a0636 ("ice: Fix ice VF reset during iavf initialization") Signed-off-by: Petr Oros <poros@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-17ice: Utilize assign_bit() helperTony Nguyen
The if/else check for bit setting can be replaced by using the assign_bit() helper so do so. Suggested-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-08-17ice: refactor ice_vf_lib to make functions staticJan Sokolowski
As following methods are not used outside ice_vf_lib, they can be made static: ice_vf_rebuild_host_vlan_cfg ice_vf_rebuild_host_tx_rate_cfg ice_vf_set_host_trust_cfg ice_vf_rebuild_host_mac_cfg ice_vf_rebuild_aggregator_node_cfg ice_vf_rebuild_host_cfg ice_set_vf_state_qs_dis ice_vf_set_initialized In order to achieve that, the order in which these were defined was reorganized. Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-19ice: use src VSI instead of src MAC in slow-pathMichal Swiatkowski
The use of a source MAC to direct packets from the VF to the corresponding port representor is only ok if there is only one MAC on a VF. To support this functionality when the number of MACs on a VF is greater, it is necessary to match a source VSI instead of a source MAC. Let's use the new switch API that allows matching on metadata. If MAC isn't used in match criteria there is no need to handle adding rule after virtchnl command. Instead add new rule while port representor is being configured. Remove rule_added field, checking for sp_rule can be used instead. Remove also checking for switchdev running in deleting rule as it can be called from unroll context when running flag isn't set. Checking for sp_rule covers both context (with and without running flag). Rules are added in eswitch configuration flow, so there is no need to have replay function. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Piotr Raczynski <piotr.raczynski@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-18Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Conflicts: drivers/net/ethernet/freescale/fec_main.c 6ead9c98cafc ("net: fec: remove the xdp_return_frame when lack of tx BDs") 144470c88c5d ("net: fec: using the standard return codes when xdp xmit errors") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-16ice: Fix ice VF reset during iavf initializationDawid Wesierski
Fix the current implementation that causes ice_trigger_vf_reset() to start resetting the VF even when the VF-NIC is still initializing. When we reset NIC with ice driver it can interfere with iavf-vf initialization e.g. during consecutive resets induced by ice iavf ice | | |<-----------------| | ice resets vf iavf | reset | start | |<-----------------| | ice resets vf | causing iavf | initialization | error | | iavf reset end This leads to a series of -53 errors (failed to init adminq) from the IAVF. Change the state of the vf_state field to be not active when the IAVF is still initializing. Make sure to wait until receiving the message on the message box to ensure that the vf is ready and initializded. In simple terms we use the ACTIVE flag to make sure that the ice driver knows if the iavf is ready for another reset iavf ice | | | | |<------------- ice resets vf iavf vf_state != ACTIVE reset | start | | | | | iavf | reset-------> vf_state == ACTIVE end ice resets vf | | | | Fixes: c54d209c78b8 ("ice: Wait for VF to be reset/ready before configuration") Signed-off-by: Dawid Wesierski <dawidx.wesierski@intel.com> Signed-off-by: Kamil Maziarz <kamil.maziarz@intel.com> Acked-by: Jacob Keller <Jacob.e.keller@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-05-16ice: refactor VF control VSI interrupt handlingPiotr Raczynski
All VF control VSIs share the same interrupt vector. Currently, a helper function dedicated for that directly sets ice_vsi::base_vector. Use helper that returns pointer to first found VF control VSI instead. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Piotr Raczynski <piotr.raczynski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-03-13ice: move VF overflow message count into struct ice_mbx_vf_infoJacob Keller
The ice driver has some logic in ice_vf_mbx.c used to detect potentially malicious VF behavior with regards to overflowing the PF mailbox. This logic currently stores message counts in struct ice_mbx_vf_counter.vf_cntr as an array. This array is allocated during initialization with ice_mbx_init_snapshot. This logic makes sense for SR-IOV where all VFs are allocated at once up front. However, in the future with Scalable IOV this logic will not work. VFs can be added and removed dynamically. We could try to keep the vf_cntr array for the maximum possible number of VFs, but this is a waste of memory. Use the recently introduced struct ice_mbx_vf_info structure to store the message count. Pass a pointer to the mbx_info for a VF instead of using its VF ID. Replace the array of VF message counts with a linked list that tracks all currently active mailbox tracking info structures. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-03-13ice: track malicious VFs in new ice_mbx_vf_info structureJacob Keller
Currently the PF tracks malicious VFs in a malvfs bitmap which is used by the ice_mbx_clear_malvf and ice_mbx_report_malvf functions. This bitmap is used to ensure that we only report a VF as malicious once rather than continuously spamming the event log. This mechanism of storage for the malicious indication works well enough for SR-IOV. However, it will not work with Scalable IOV. This is because Scalable IOV VFs can be allocated dynamically and might change VF ID when their underlying VSI changes. To support this, the mailbox overflow logic will need to be refactored. First, introduce a new ice_mbx_vf_info structure which will be used to store data about a VF. Embed this structure in the struct ice_vf, and ensure it gets initialized when a new VF is created. For now this only stores the malicious indicator bit. Pass a pointer to the VF's mbx_info structure instead of using a bitmap to keep track of these bits. A future change will extend this structure and the rest of the logic associated with the overflow detection. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-03-13ice: convert ice_mbx_clear_malvf to void and use WARNJacob Keller
The ice_mbx_clear_malvf function checks for a few error conditions before clearing the appropriate data. These error conditions are really warnings that should never occur in a properly initialized driver. Every caller of ice_mbx_clear_malvf just prints a dev_dbg message on failure which will generally be ignored. Convert this function to void and switch the error return values to WARN_ON. This will make any potentially misconfiguration more visible and makes future refactors that involve changing how we store the malicious VF data easier. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-02-06ice: remove unnecessary virtchnl_ether_addr struct useJacob Keller
The dev_lan_addr and hw_lan_addr members of ice_vf are used only to store the MAC address for the VF. They are defined using virtchnl_ether_addr, but only the .addr sub-member is actually used. Drop the use of virtchnl_ether_addr and just use a u8 array of length [ETH_ALEN]. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-02-06ice: introduce .irq_close VF operationJacob Keller
The Scalable IOV implementation will require notifying the VDCM driver when an IRQ must be closed. This allows the VDCM to handle releasing stale IRQ context values and properly reconfigure. To handle this, introduce a new optional .irq_close callback to the VF operations structure. This will be implemented by Scalable IOV to handle the shutdown of the IRQ context. Since the SR-IOV implementation does not need this, we must check that its non-NULL before calling it. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-02-06ice: introduce clear_reset_state operationJacob Keller
When hardware is reset, the VF relies on the VFGEN_RSTAT register to detect when the VF is finished resetting. This is a tri-state register where 0 indicates a reset is in progress, 1 indicates the hardware is done resetting, and 2 indicates that the software is done resetting. Currently the PF driver relies on the device hardware resetting VFGEN_RSTAT when a global reset occurs. This works ok, but it does mean that the VF might not immediately notice a reset when the driver first detects that the global reset is occurring. This is also problematic for Scalable IOV, because there is no read/write equivalent VFGEN_RSTAT register for the Scalable VSI type. Instead, the Scalable IOV VFs will need to emulate this register. To support this, introduce a new VF operation, clear_reset_state, which is called when the PF driver first detects a global reset. The Single Root IOV implementation can just write to VFGEN_RSTAT to ensure it's cleared immediately, without waiting for the actual hardware reset to begin. The Scalable IOV implementation will use this as part of its tracking of the reset status to allow properly reporting the emulated VFGEN_RSTAT to the VF driver. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-02-06ice: convert vf_ops .vsi_rebuild to .create_vsiJacob Keller
The .vsi_rebuild function exists for ice_reset_vf. It is used to release and re-create the VSI during a single-VF reset. This function is only called when we need to re-create the VSI, and not when rebuilding an existing VSI. This makes the single-VF reset process different from the process used to restore functionality after a hardware reset such as the PF reset or EMP reset. When we add support for Scalable IOV VFs, the implementation will be very similar. The primary difference will be in the fact that each VF type uses a different underlying VSI type in hardware. Move the common functionality into a new ice_vf_recreate VSI function. This will allow the two IOV paths to share this functionality. Rework the .vsi_rebuild vf_op into .create_vsi, only performing the task of creating a new VSI. This creates a nice dichotomy between the ice_vf_rebuild_vsi and ice_vf_recreate_vsi, and should make it more clear why the two flows atre distinct. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-02-06ice: introduce ice_vf_init_host_cfg functionJacob Keller
Introduce a new generic helper ice_vf_init_host_cfg which performs common host configuration initialization tasks that will need to be done for both Single Root IOV and the new Scalable IOV implementation. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-02-06ice: add a function to initialize vf entryJacob Keller
Some of the initialization code for Single Root IOV VFs will need to be reused when we introduce Scalable IOV. Pull this code out into a new ice_initialize_vf_entry helper function. Co-developed-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com> Signed-off-by: Harshitha Ramamurthy <harshitha.ramamurthy@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-02-06ice: Pull common tasks into ice_vf_post_vsi_rebuildJacob Keller
The Single Root IOV implementation of .post_vsi_rebuild performs some tasks that will ultimately need to be shared with the Scalable IOV implementation such as rebuilding the host configuration. Refactor by introducing a new wrapper function, ice_vf_post_vsi_rebuild which performs the tasks that will be shared between SR-IOV and Scalable IOV. Move the ice_vf_rebuild_host_cfg and ice_vf_set_initialized calls into this wrapper. Then call the implementation specific post_vsi_rebuild handler afterwards. This ensures that we will properly re-initialize filters and expected settings for both SR-IOV and Scalable IOV. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-02-06ice: move ice_vf_vsi_release into ice_vf_lib.cJacob Keller
The ice_vf_vsi_release function will be used in a future change to refactor the .vsi_rebuild function. Move this over to ice_vf_lib.c so that it can be used there. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-02-06ice: refactor VSI setup to use parameter structureJacob Keller
The ice_vsi_setup function, ice_vsi_alloc, and ice_vsi_cfg functions have grown a large number of parameters. These parameters are used to initialize a new VSI, as well as re-configure an existing VSI Any time we want to add a new parameter to this function chain, even if it will usually be unset, we have to change many call sites due to changing the function signature. A future change is going to refactor ice_vsi_alloc and ice_vsi_cfg to move the VSI configuration and initialization all into ice_vsi_cfg. Before this, refactor the VSI setup flow to use a new ice_vsi_cfg_params structure. This will contain the configuration (mainly pointers) used to initialize a VSI. Pass this from ice_vsi_setup into the related functions such as ice_vsi_alloc, ice_vsi_cfg, and ice_vsi_cfg_def. Introduce a helper, ice_vsi_to_params to convert an existing VSI to the parameters used to initialize it. This will aid in the flows where we rebuild an existing VSI. Since we also pass the ICE_VSI_FLAG_INIT to more functions which do not need (or cannot yet have) the VSI parameters, lets make this clear by renaming the function parameter to vsi_flags and using a u32 instead of a signed integer. The name vsi_flags also makes it clear that we may extend the flags in the future. This change will make it easier to refactor the setup flow in the future, and will reduce the complexity required to add a new parameter for configuration in the future. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2023-02-03ice: split ice_vsi_setup into smaller functionsMichal Swiatkowski
Main goal is to reuse the same functions in VSI config and rebuild paths. To do this split ice_vsi_setup into smaller pieces and reuse it during rebuild. ice_vsi_alloc() should only alloc memory, not set the default values for VSI. Move setting defaults to separate function. This will allow config of already allocated VSI, for example in reload path. The path is mostly moving code around without introducing new functionality. Functions ice_vsi_cfg() and ice_vsi_decfg() were added, but they are using code that already exist. Use flag to pass information about VSI initialization during rebuild instead of using boolean value. Co-developed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-11-14ice: virtchnl rss hena supportMd Fahad Iqbal Polash
Add support for 2 virtchnl msgs: VIRTCHNL_OP_SET_RSS_HENA VIRTCHNL_OP_GET_RSS_HENA_CAPS The first one allows VFs to clear all previously programmed RSS configuration and customize it. The second one returns the RSS HENA bits allowed by the hardware. Introduce ice_err_to_virt_err which converts kernel specific errors to virtchnl errors. Signed-off-by: Md Fahad Iqbal Polash <md.fahad.iqbal.polash@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-09ice: Fix spurious interrupt during removal of trusted VFNorbert Zulinski
Previously, during removal of trusted VF when VF is down there was number of spurious interrupt equal to number of queues on VF. Add check if VF already has inactive queues. If VF is disabled and has inactive rx queues then do not disable rx queues. Add check in ice_vsi_stop_tx_ring if it's VF's vsi and if VF is disabled. Fixes: efe41860008e ("ice: Fix memory corruption in VF driver") Signed-off-by: Norbert Zulinski <norbertx.zulinski@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-17ice: Fix VF not able to send tagged traffic with no VLAN filtersSylwester Dziedziuch
VF was not able to send tagged traffic when it didn't have any VLAN interfaces and VLAN anti-spoofing was enabled. Fix this by allowing VFs with no VLAN filters to send tagged traffic. After VF adds a VLAN interface it will be able to send tagged traffic matching VLAN filters only. Testing hints: 1. Spawn VF 2. Send tagged packet from a VF 3. The packet should be sent out and not dropped 4. Add a VLAN interface on VF 5. Send tagged packet on that VLAN interface 6. Packet should be sent out and not dropped 7. Send tagged packet with id different than VLAN interface 8. Packet should be dropped Fixes: daf4dd16438b ("ice: Refactor spoofcheck configuration functions") Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-08-11ice: Fix call trace with null VSI during VF resetMichal Jaron
During stress test with attaching and detaching VF from KVM and simultaneously changing VFs spoofcheck and trust there was a call trace in ice_reset_vf that VF's VSI is null. [145237.352797] WARNING: CPU: 46 PID: 840629 at drivers/net/ethernet/intel/ice/ice_vf_lib.c:508 ice_reset_vf+0x3d6/0x410 [ice] [145237.352851] Modules linked in: ice(E) vfio_pci vfio_pci_core vfio_virqfd vfio_iommu_type1 vfio iavf dm_mod xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink tun bridge stp llc sunrpc intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm iTCO_wdt iTC O_vendor_support irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl ipmi_si intel_cstate ipmi_devintf joydev intel_uncore m ei_me ipmi_msghandler i2c_i801 pcspkr mei lpc_ich ioatdma i2c_smbus acpi_pad acpi_power_meter ip_tables xfs libcrc32c i2c_algo_bit drm_sh mem_helper drm_kms_helper sd_mod t10_pi crc64_rocksoft syscopyarea crc64 sysfillrect sg sysimgblt fb_sys_fops drm i40e ixgbe ahci libahci libata crc32c_intel mdio dca wmi fuse [last unloaded: ice] [145237.352917] CPU: 46 PID: 840629 Comm: kworker/46:2 Tainted: G S W I E 5.19.0-rc6+ #24 [145237.352921] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015 [145237.352923] Workqueue: ice ice_service_task [ice] [145237.352948] RIP: 0010:ice_reset_vf+0x3d6/0x410 [ice] [145237.352984] Code: 30 ec f3 cc e9 28 fd ff ff 0f b7 4b 50 48 c7 c2 48 19 9c c0 4c 89 ee 48 c7 c7 30 fe 9e c0 e8 d1 21 9d cc 31 c0 e9 a 9 fe ff ff <0f> 0b b8 ea ff ff ff e9 c1 fc ff ff 0f 0b b8 fb ff ff ff e9 91 fe [145237.352987] RSP: 0018:ffffb453e257fdb8 EFLAGS: 00010246 [145237.352990] RAX: ffff8bd0040181c0 RBX: ffff8be68db8f800 RCX: 0000000000000000 [145237.352991] RDX: 000000000000ffff RSI: 0000000000000000 RDI: ffff8be68db8f800 [145237.352993] RBP: ffff8bd0040181c0 R08: 0000000000001000 R09: ffff8bcfd520e000 [145237.352995] R10: 0000000000000000 R11: 00008417b5ab0bc0 R12: 0000000000000005 [145237.352996] R13: ffff8bcee061c0d0 R14: ffff8bd004019640 R15: 0000000000000000 [145237.352998] FS: 0000000000000000(0000) GS:ffff8be5dfb00000(0000) knlGS:0000000000000000 [145237.353000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [145237.353002] CR2: 00007fd81f651d68 CR3: 0000001a0fe10001 CR4: 00000000001726e0 [145237.353003] Call Trace: [145237.353008] <TASK> [145237.353011] ice_process_vflr_event+0x8d/0xb0 [ice] [145237.353049] ice_service_task+0x79f/0xef0 [ice] [145237.353074] process_one_work+0x1c8/0x390 [145237.353081] ? process_one_work+0x390/0x390 [145237.353084] worker_thread+0x30/0x360 [145237.353087] ? process_one_work+0x390/0x390 [145237.353090] kthread+0xe8/0x110 [145237.353094] ? kthread_complete_and_exit+0x20/0x20 [145237.353097] ret_from_fork+0x22/0x30 [145237.353103] </TASK> Remove WARN_ON() from check if VSI is null in ice_reset_vf. Add "VF is already removed\n" in dev_dbg(). This WARN_ON() is unnecessary and causes call trace, despite that call trace, driver still works. There is no need for this warn because this piece of code is responsible for disabling VF's Tx/Rx queues when VF is disabled, but when VF is already removed there is no need to do reset or disable queues. Fixes: efe41860008e ("ice: Fix memory corruption in VF driver") Signed-off-by: Michal Jaron <michalx.jaron@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28ice: Fix promiscuous mode not turning offMichal Wilczynski
When trust is turned off for the VF, the expectation is that promiscuous and allmulticast filters are removed. Currently default VSI filter is not getting cleared in this flow. Example: ip link set enp236s0f0 vf 0 trust on ip link set enp236s0f0v0 promisc on ip link set enp236s0f0 vf 0 trust off /* promiscuous mode is still enabled on VF0 */ Remove switch filters for both cases. This commit fixes above behavior by removing default VSI filters and allmulticast filters when vf-true-promisc-support is OFF. Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-07-28ice: Introduce enabling promiscuous mode on multiple VF'sMichal Wilczynski
In current implementation default VSI switch filter is only able to forward traffic to a single VSI. This limits promiscuous mode with private flag 'vf-true-promisc-support' to a single VF. Enabling it on the second VF won't work. Also allmulticast support doesn't seem to be properly implemented when vf-true-promisc-support is true. Use standard ice_add_rule_internal() function that already implements forwarding to multiple VSI's instead of constructing AQ call manually. Add switch filter for allmulticast mode when vf-true-promisc-support is enabled. The same filter is added regardless of the flag - it doesn't matter for this case. Remove unnecessary fields in switch structure. From now on book keeping will be done by ice_add_rule_internal(). Refactor unnecessarily passed function arguments. To test: 1) Create 2 VM's, and two VF's. Attach VF's to VM's. 2) Enable promiscuous mode on both of them and check if traffic is seen on both of them. Signed-off-by: Michal Wilczynski <michal.wilczynski@intel.com> Tested-by: Marek Szlosek <marek.szlosek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-06-14ice: Fix memory corruption in VF driverPrzemyslaw Patynowski
Disable VF's RX/TX queues, when it's disabled. VF can have queues enabled, when it requests a reset. If PF driver assumes that VF is disabled, while VF still has queues configured, VF may unmap DMA resources. In such scenario device still can map packets to memory, which ends up silently corrupting it. Previously, VF driver could experience memory corruption, which lead to crash: [ 5119.170157] BUG: unable to handle kernel paging request at 00001b9780003237 [ 5119.170166] PGD 0 P4D 0 [ 5119.170173] Oops: 0002 [#1] PREEMPT_RT SMP PTI [ 5119.170181] CPU: 30 PID: 427592 Comm: kworker/u96:2 Kdump: loaded Tainted: G W I --------- - - 4.18.0-372.9.1.rt7.166.el8.x86_64 #1 [ 5119.170189] Hardware name: Dell Inc. PowerEdge R740/014X06, BIOS 2.3.10 08/15/2019 [ 5119.170193] Workqueue: iavf iavf_adminq_task [iavf] [ 5119.170219] RIP: 0010:__page_frag_cache_drain+0x5/0x30 [ 5119.170238] Code: 0f 0f b6 77 51 85 f6 74 07 31 d2 e9 05 df ff ff e9 90 fe ff ff 48 8b 05 49 db 33 01 eb b4 0f 1f 80 00 00 00 00 0f 1f 44 00 00 <f0> 29 77 34 74 01 c3 48 8b 07 f6 c4 80 74 0f 0f b6 77 51 85 f6 74 [ 5119.170244] RSP: 0018:ffffa43b0bdcfd78 EFLAGS: 00010282 [ 5119.170250] RAX: ffffffff896b3e40 RBX: ffff8fb282524000 RCX: 0000000000000002 [ 5119.170254] RDX: 0000000049000000 RSI: 0000000000000000 RDI: 00001b9780003203 [ 5119.170259] RBP: ffff8fb248217b00 R08: 0000000000000022 R09: 0000000000000009 [ 5119.170262] R10: 2b849d6300000000 R11: 0000000000000020 R12: 0000000000000000 [ 5119.170265] R13: 0000000000001000 R14: 0000000000000009 R15: 0000000000000000 [ 5119.170269] FS: 0000000000000000(0000) GS:ffff8fb1201c0000(0000) knlGS:0000000000000000 [ 5119.170274] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 5119.170279] CR2: 00001b9780003237 CR3: 00000008f3e1a003 CR4: 00000000007726e0 [ 5119.170283] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 5119.170286] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 5119.170290] PKRU: 55555554 [ 5119.170292] Call Trace: [ 5119.170298] iavf_clean_rx_ring+0xad/0x110 [iavf] [ 5119.170324] iavf_free_rx_resources+0xe/0x50 [iavf] [ 5119.170342] iavf_free_all_rx_resources.part.51+0x30/0x40 [iavf] [ 5119.170358] iavf_virtchnl_completion+0xd8a/0x15b0 [iavf] [ 5119.170377] ? iavf_clean_arq_element+0x210/0x280 [iavf] [ 5119.170397] iavf_adminq_task+0x126/0x2e0 [iavf] [ 5119.170416] process_one_work+0x18f/0x420 [ 5119.170429] worker_thread+0x30/0x370 [ 5119.170437] ? process_one_work+0x420/0x420 [ 5119.170445] kthread+0x151/0x170 [ 5119.170452] ? set_kthread_struct+0x40/0x40 [ 5119.170460] ret_from_fork+0x35/0x40 [ 5119.170477] Modules linked in: iavf sctp ip6_udp_tunnel udp_tunnel mlx4_en mlx4_core nfp tls vhost_net vhost vhost_iotlb tap tun xt_CHECKSUM ipt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink bridge stp llc rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc intel_rapl_msr iTCO_wdt iTCO_vendor_support dell_smbios wmi_bmof dell_wmi_descriptor dcdbas kvm_intel kvm irqbypass intel_rapl_common isst_if_common skx_edac irdma nfit libnvdimm x86_pkg_temp_thermal i40e intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ib_uverbs rapl ipmi_ssif intel_cstate intel_uncore mei_me pcspkr acpi_ipmi ib_core mei lpc_ich i2c_i801 ipmi_si ipmi_devintf wmi ipmi_msghandler acpi_power_meter xfs libcrc32c sd_mod t10_pi sg mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ice ahci drm libahci crc32c_intel libata tg3 megaraid_sas [ 5119.170613] i2c_algo_bit dm_mirror dm_region_hash dm_log dm_mod fuse [last unloaded: iavf] [ 5119.170627] CR2: 00001b9780003237 Fixes: ec4f5a436bdf ("ice: Check if VF is disabled for Opcode and other operations") Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Co-developed-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-05-05ice: add a function comment for ice_cfg_mac_antispoofJacob Keller
This function definition was missing a comment describing its implementation. Add one. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>