summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel
AgeCommit message (Collapse)Author
2022-02-10ice: Avoid RTNL lock when re-creating auxiliary deviceDave Ertman
If a call to re-create the auxiliary device happens in a context that has already taken the RTNL lock, then the call flow that recreates auxiliary device can hang if there is another attempt to claim the RTNL lock by the auxiliary driver. To avoid this, any call to re-create auxiliary devices that comes from an source that is holding the RTNL lock (e.g. netdev notifier when interface exits a bond) should execute in a separate thread. To accomplish this, add a flag to the PF that will be evaluated in the service task and dealt with there. Fixes: f9f5301e7e2d ("ice: Register auxiliary device to provide RDMA") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Reviewed-by: Jonathan Toppins <jtoppins@redhat.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-10ice: Fix KASAN error in LAG NETDEV_UNREGISTER handlerDave Ertman
Currently, the same handler is called for both a NETDEV_BONDING_INFO LAG unlink notification as for a NETDEV_UNREGISTER call. This is causing a problem though, since the netdev_notifier_info passed has a different structure depending on which event is passed. The problem manifests as a call trace from a BUG: KASAN stack-out-of-bounds error. Fix this by creating a handler specific to NETDEV_UNREGISTER that only is passed valid elements in the netdev_notifier_info struct for the NETDEV_UNREGISTER event. Also included is the removal of an unbalanced dev_put on the peer_netdev and related braces. Fixes: 6a8b357278f5 ("ice: Respond to a NETDEV_UNREGISTER event for LAG") Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Acked-by: Jonathan Toppins <jtoppins@redhat.com> Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-10ice: fix IPIP and SIT TSO offloadJesse Brandeburg
The driver was avoiding offload for IPIP (at least) frames due to parsing the inner header offsets incorrectly when trying to check lengths. This length check works for VXLAN frames but fails on IPIP frames because skb_transport_offset points to the inner header in IPIP frames, which meant the subtraction of transport_header from inner_network_header returns a negative value (-20). With the code before this patch, everything continued to work, but GSO was being used to segment, causing throughputs of 1.5Gb/s per thread. After this patch, throughput is more like 10Gb/s per thread for IPIP traffic. Fixes: e94d44786693 ("ice: Implement filter sync, NDO operations and bump version") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-10ice: fix an error code in ice_cfg_phy_fec()Dan Carpenter
Propagate the error code from ice_get_link_default_override() instead of returning success. Fixes: ea78ce4dab05 ("ice: add link lenient and default override support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-04ixgbevf: Require large buffers for build_skb on 82599VFSamuel Mendoza-Jonas
From 4.17 onwards the ixgbevf driver uses build_skb() to build an skb around new data in the page buffer shared with the ixgbe PF. This uses either a 2K or 3K buffer, and offsets the DMA mapping by NET_SKB_PAD + NET_IP_ALIGN. When using a smaller buffer RXDCTL is set to ensure the PF does not write a full 2K bytes into the buffer, which is actually 2K minus the offset. However on the 82599 virtual function, the RXDCTL mechanism is not available. The driver attempts to work around this by using the SET_LPE mailbox method to lower the maximm frame size, but the ixgbe PF driver ignores this in order to keep the PF and all VFs in sync[0]. This means the PF will write up to the full 2K set in SRRCTL, causing it to write NET_SKB_PAD + NET_IP_ALIGN bytes past the end of the buffer. With 4K pages split into two buffers, this means it either writes NET_SKB_PAD + NET_IP_ALIGN bytes past the first buffer (and into the second), or NET_SKB_PAD + NET_IP_ALIGN bytes past the end of the DMA mapping. Avoid this by only enabling build_skb when using "large" buffers (3K). These are placed in each half of an order-1 page, preventing the PF from writing past the end of the mapping. [0]: Technically it only ever raises the max frame size, see ixgbe_set_vf_lpe() in ixgbe_sriov.c Fixes: f15c5ba5b6cd ("ixgbevf: add support for using order 1 pages to receive large frames") Signed-off-by: Samuel Mendoza-Jonas <samjonas@amazon.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-01Merge branch '1GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2022-02-01 This series contains updates to e1000e driver only. Sasha removes CSME handshake with TGL platform as this is not supported and is causing hardware unit hangs to be reported. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: e1000e: Handshake with CSME starts from ADL platforms e1000e: Separate ADP board type from TGP ==================== Link: https://lore.kernel.org/r/20220201173754.580305-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-01e1000e: Handshake with CSME starts from ADL platformsSasha Neftin
Handshake with CSME/AMT on none provisioned platforms during S0ix flow is not supported on TGL platform and can cause to HW unit hang. Update the handshake with CSME flow to start from the ADL platform. Fixes: 3e55d231716e ("e1000e: Add handshake with the CSME to support S0ix") Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-02-01e1000e: Separate ADP board type from TGPSasha Neftin
We have the same LAN controller on different PCH's. Separate ADP board type from a TGP which will allow for specific fixes to be applied for ADP platforms. Suggested-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Suggested-by: Dima Ruinskiy <dima.ruinskiy@intel.com> Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31i40e: Fix reset path while removing the driverKaren Sornek
Fix the crash in kernel while dereferencing the NULL pointer, when the driver is unloaded and simultaneously the VSI rings are being stopped. The hardware requires 50msec in order to finish RX queues disable. For this purpose the driver spins in mdelay function for the operation to be completed. For example changing number of queues which requires reset would fail in the following call stack: 1) i40e_prep_for_reset 2) i40e_pf_quiesce_all_vsi 3) i40e_quiesce_vsi 4) i40e_vsi_close 5) i40e_down 6) i40e_vsi_stop_rings 7) i40e_vsi_control_rx -> disable requires the delay of 50msecs 8) continue back in i40e_down function where i40e_clean_tx_ring(vsi->tx_rings[i]) is going to crash When the driver was spinning vsi_release called i40e_vsi_free_arrays where the vsi->tx_rings resources were freed and the pointer was set to NULL. Fixes: 5b6d4a7f20b0 ("i40e: Fix crash during removing i40e driver") Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Karen Sornek <karen.sornek@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-31i40e: Fix reset bw limit when DCB enabled with 1 TCJedrzej Jagielski
There was an AQ error I40E_AQ_RC_EINVAL when trying to reset bw limit as part of bw allocation setup. This was caused by trying to reset bw limit with DCB enabled. Bw limit should not be reset when DCB is enabled. The code was relying on the pf->flags to check if DCB is enabled but if only 1 TC is available this flag will not be set even though DCB is enabled. Add a check for number of TC and if it is 1 don't try to reset bw limit even if pf->flags shows DCB as disabled. Fixes: fa38e30ac73f ("i40e: Fix for Tx timeouts when interface is brought up if DCB is enabled") Suggested-by: Alexander Lobakin <alexandr.lobakin@intel.com> # Flatten the condition Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Reviewed-by: Alexander Lobakin <alexandr.lobakin@intel.com> Tested-by: Imam Hassan Reza Biswas <imam.hassan.reza.biswas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-20i40e: fix unsigned stat widthsJoe Damato
Change i40e_update_vsi_stats and struct i40e_vsi to use u64 fields to match the width of the stats counters in struct i40e_rx_queue_stats. Update debugfs code to use the correct format specifier for u64. Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Joe Damato <jdamato@fastly.com> Reported-by: kernel test robot <lkp@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-20i40e: Fix for failed to init adminq while VF resetKaren Sornek
Fix for failed to init adminq: -53 while VF is resetting via MAC address changing procedure. Added sync module to avoid reading deadbeef value in reinit adminq during software reset. Without this patch it is possible to trigger VF reset procedure during reinit adminq. This resulted in an incorrect reading of value from the AQP registers and generated the -53 error. Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface") Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Signed-off-by: Karen Sornek <karen.sornek@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-20i40e: Fix queues reservation for XDPSylwester Dziedziuch
When XDP was configured on a system with large number of CPUs and X722 NIC there was a call trace with NULL pointer dereference. i40e 0000:87:00.0: failed to get tracking for 256 queues for VSI 0 err -12 i40e 0000:87:00.0: setup of MAIN VSI failed BUG: kernel NULL pointer dereference, address: 0000000000000000 RIP: 0010:i40e_xdp+0xea/0x1b0 [i40e] Call Trace: ? i40e_reconfig_rss_queues+0x130/0x130 [i40e] dev_xdp_install+0x61/0xe0 dev_xdp_attach+0x18a/0x4c0 dev_change_xdp_fd+0x1e6/0x220 do_setlink+0x616/0x1030 ? ahci_port_stop+0x80/0x80 ? ata_qc_issue+0x107/0x1e0 ? lock_timer_base+0x61/0x80 ? __mod_timer+0x202/0x380 rtnl_setlink+0xe5/0x170 ? bpf_lsm_binder_transaction+0x10/0x10 ? security_capable+0x36/0x50 rtnetlink_rcv_msg+0x121/0x350 ? rtnl_calcit.isra.0+0x100/0x100 netlink_rcv_skb+0x50/0xf0 netlink_unicast+0x1d3/0x2a0 netlink_sendmsg+0x22a/0x440 sock_sendmsg+0x5e/0x60 __sys_sendto+0xf0/0x160 ? __sys_getsockname+0x7e/0xc0 ? _copy_from_user+0x3c/0x80 ? __sys_setsockopt+0xc8/0x1a0 __x64_sys_sendto+0x20/0x30 do_syscall_64+0x33/0x40 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f83fa7a39e0 This was caused by PF queue pile fragmentation due to flow director VSI queue being placed right after main VSI. Because of this main VSI was not able to resize its queue allocation for XDP resulting in no queues allocated for main VSI when XDP was turned on. Fix this by always allocating last queue in PF queue pile for a flow director VSI. Fixes: 41c445ff0f48 ("i40e: main driver core") Fixes: 74608d17fe29 ("i40e: add support for XDP_TX action") Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-20i40e: Fix issue when maximum queues is exceededJedrzej Jagielski
Before this patch VF interface vanished when maximum queue number was exceeded. Driver tried to add next queues even if there was not enough space. PF sent incorrect number of queues to the VF when there were not enough of them. Add an additional condition introduced to check available space in 'qp_pile' before proceeding. This condition makes it impossible to add queues if they number is greater than the number resulting from available space. Also add the search for free space in PF queue pair piles. Without this patch VF interfaces are not seen when available space for queues has been exceeded and following logs appears permanently in dmesg: "Unable to get VF config (-32)". "VF 62 failed opcode 3, retval: -5" "Unable to get VF config due to PF error condition, not retrying" Fixes: 7daa6bf3294e ("i40e: driver core headers") Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Jaroslaw Gawin <jaroslawx.gawin@intel.com> Signed-off-by: Slawomir Laba <slawomirx.laba@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-20i40e: Increase delay to 1 s after global EMP resetJedrzej Jagielski
Recently simplified i40e_rebuild causes that FW sometimes is not ready after NVM update, the ping does not return. Increase the delay in case of EMP reset. Old delay of 300 ms was introduced for specific cards for 710 series. Now it works for all the cards and delay was increased. Fixes: 1fa51a650e1d ("i40e: Add delay after EMP reset for firmware to recover") Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-13Merge tag 'irq-core-2022-01-13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "Updates for the interrupt subsystem: Core: - Provide a new interface for affinity hints to provide a separation between hint and actual affinity change which has become a hidden property of the current interface - Fix up the in tree usage of the affinity hint interfaces Drivers: - No new irqchip drivers! - Fix GICv3 redistributor table reservation with RT across kexec - Fix GICv4.1 redistributor view of the VPE table across kexec - Add support for extra interrupts on spear-shirq - Make obtaining some interrupts optional for the Renesas drivers - Various cleanups and bug fixes" * tag 'irq-core-2022-01-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (25 commits) irqchip/renesas-intc-irqpin: Use platform_get_irq_optional() to get the interrupt irqchip/renesas-irqc: Use platform_get_irq_optional() to get the interrupt irqchip/gic-v4: Disable redistributors' view of the VPE table at boot time irqchip/ingenic-tcu: Use correctly sized arguments for bit field irqchip/gic-v2m: Add const to of_device_id irqchip/imx-gpcv2: Mark imx_gpcv2_instance with __ro_after_init irqchip/spear-shirq: Add support for IRQ 0..6 irqchip/gic-v3-its: Limit memreserve cpuhp state lifetime irqchip/gic-v3-its: Postpone LPI pending table freeing and memreserve irqchip/gic-v3-its: Give the percpu rdist struct its own flags field net/mlx4: Use irq_update_affinity_hint() net/mlx5: Use irq_set_affinity_and_hint() hinic: Use irq_set_affinity_and_hint() scsi: lpfc: Use irq_set_affinity() mailbox: Use irq_update_affinity_hint() ixgbe: Use irq_update_affinity_hint() be2net: Use irq_update_affinity_hint() enic: Use irq_update_affinity_hint() RDMA/irdma: Use irq_update_affinity_hint() scsi: mpt3sas: Use irq_set_affinity_and_hint() ...
2022-01-08Merge branch 'linus' into irq/core, to fix conflictIngo Molnar
Conflicts: drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2022-01-07iavf: remove an unneeded variableJason Wang
The variable `ret_code' used for returning is never changed in function `iavf_shutdown_adminq'. So that it can be removed and just return its initial value 0 at the end of `iavf_shutdown_adminq' function. Signed-off-by: Jason Wang <wangborong@cdjrlc.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-07i40e: remove variables set but not usedYang Li
The code that uses variables pe_cntx_size and pe_filt_size has been removed, so they should be removed as well. Eliminate the following clang warnings: drivers/net/ethernet/intel/i40e/i40e_common.c:4139:20: warning: variable 'pe_filt_size' set but not used. drivers/net/ethernet/intel/i40e/i40e_common.c:4139:6: warning: variable 'pe_cntx_size' set but not used. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-07i40e: Remove non-inclusive languageMateusz Palczewski
Remove non-inclusive language from the driver. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-07i40e: Update FW API versionMateusz Palczewski
Update FW API versions to the newest supported NVM images. Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-07i40e: Minimize amount of busy-waiting during AQ sendJedrzej Jagielski
The i40e_asq_send_command will now use a non blocking usleep_range if possible (non-atomic context), instead of busy-waiting udelay. The usleep_range function uses hrtimers to provide better performance and removes the negative impact of busy-waiting in time-critical environments. 1. Rename i40e_asq_send_command to i40e_asq_send_command_atomic and add 5th parameter to inform if called from an atomic context. Call inside usleep_range (if non-atomic) or udelay (if atomic). 2. Change i40e_asq_send_command to invoke i40e_asq_send_command_atomic(..., false). 3. Change two functions: - i40e_aq_set_vsi_uc_promisc_on_vlan - i40e_aq_set_vsi_mc_promisc_on_vlan to explicitly use i40e_asq_send_command_atomic(..., true) instead of i40e_asq_send_command, as they use spinlocks and do some work in an atomic context. All other calls to i40e_asq_send_command remain unchanged. Signed-off-by: Dawid Lukwinski <dawid.lukwinski@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-07i40e: Add ensurance of MacVlan resources for every trusted VFKaren Sornek
Trusted VF can use up every resource available, leaving nothing to other trusted VFs. Introduce define, which calculates MacVlan resources available based on maximum available MacVlan resources, bare minimum for each VF and number of currently allocated VFs. Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Signed-off-by: Karen Sornek <karen.sornek@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-06ice: Use bitmap_free() to free bitmapChristophe JAILLET
kfree() and bitmap_free() are the same. But using the latter is more consistent when freeing memory allocated with bitmap_zalloc(). Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-06ice: Optimize a few bitmap operationsChristophe JAILLET
When a bitmap is local to a function, it is safe to use the non-atomic __[set|clear]_bit(). No concurrent accesses can occur. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-06ice: Slightly simply ice_find_free_recp_res_idxChristophe JAILLET
The 'possible_idx' bitmap is set just after it is zeroed, so we can save the first step. The 'free_idx' bitmap is used only at the end of the function as the result of a bitmap xor operation. So there is no need to explicitly zero it before. So, slightly simply the code and remove 2 useless 'bitmap_zero()' call Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-06ice: improve switchdev's slow-pathWojciech Drewek
In current switchdev implementation, every VF PR is assigned to individual ring on switchdev ctrl VSI. For slow-path traffic, there is a mapping VF->ring done in software based on src_vsi value (by calling ice_eswitch_get_target_netdev function). With this change, HW solution is introduced which is more efficient. For each VF, src MAC (VF's MAC) filter will be created, which forwards packets to the corresponding switchdev ctrl VSI queue based on src MAC address. This filter has to be removed and then replayed in case of resetting one VF. Keep information about this rule in repr->mac_rule, thanks to that we know which rule has to be removed and replayed for a given VF. In case of CORE/GLOBAL all rules are removed automatically. We have to take care of readding them. This is done by ice_replay_vsi_adv_rule. When driver leaves switchdev mode, remove all advanced rules from switchdev ctrl VSI. This is done by ice_rem_adv_rule_for_vsi. Flag repr->rule_added is needed because in some cases reset might be triggered before VF sends request to add MAC. Co-developed-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-06ice: replay advanced rules after resetVictor Raj
ice_replay_vsi_adv_rule will replay advanced rules for a given VSI. Exit this function when list of rules for given recipe is empty. Do not add rule when given vsi_handle does not match vsi_handle from the rule info. Use ICE_MAX_NUM_RECIPES instead of ICE_SW_LKUP_LAST in order to find advanced rules as well. Signed-off-by: Victor Raj <victor.raj@intel.com> Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-04iavf: Fix limit of total number of queues to active queues of VFKaren Sornek
In the absence of this validation, if the user requests to configure queues more than the enabled queues, it results in sending the requested number of queues to the kernel stack (due to the asynchronous nature of VF response), in which case the stack might pick a queue to transmit that is not enabled and result in Tx hang. Fix this bug by limiting the total number of queues allocated for VF to active queues of VF. Fixes: d5b33d024496 ("i40evf: add ndo_setup_tc callback to i40evf") Signed-off-by: Ashwin Vijayavel <ashwin.vijayavel@intel.com> Signed-off-by: Karen Sornek <karen.sornek@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-04i40e: Fix incorrect netdev's real number of RX/TX queuesJedrzej Jagielski
There was a wrong queues representation in sysfs during driver's reinitialization in case of online cpus number is less than combined queues. It was caused by stopped NetworkManager, which is responsible for calling vsi_open function during driver's initialization. In specific situation (ex. 12 cpus online) there were 16 queues in /sys/class/net/<iface>/queues. In case of modifying queues with value higher, than number of online cpus, then it caused write errors and other errors. Add updating of sysfs's queues representation during driver initialization. Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Lukasz Cieplicki <lukaszx.cieplicki@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-04i40e: Fix for displaying message regarding NVM versionMateusz Palczewski
When loading the i40e driver, it prints a message like: 'The driver for the device detected a newer version of the NVM image v1.x than expected v1.y. Please install the most recent version of the network driver.' This is misleading as the driver is working as expected. Fix that by removing the second part of message and changing it from dev_info to dev_dbg. Fixes: 4fb29bddb57f ("i40e: The driver now prints the API version in error message") Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-04i40e: fix use-after-free in i40e_sync_filters_subtask()Di Zhu
Using ifconfig command to delete the ipv6 address will cause the i40e network card driver to delete its internal mac_filter and i40e_service_task kernel thread will concurrently access the mac_filter. These two processes are not protected by lock so causing the following use-after-free problems. print_address_description+0x70/0x360 ? vprintk_func+0x5e/0xf0 kasan_report+0x1b2/0x330 i40e_sync_vsi_filters+0x4f0/0x1850 [i40e] i40e_sync_filters_subtask+0xe3/0x130 [i40e] i40e_service_task+0x195/0x24c0 [i40e] process_one_work+0x3f5/0x7d0 worker_thread+0x61/0x6c0 ? process_one_work+0x7d0/0x7d0 kthread+0x1c3/0x1f0 ? kthread_park+0xc0/0xc0 ret_from_fork+0x35/0x40 Allocated by task 2279810: kasan_kmalloc+0xa0/0xd0 kmem_cache_alloc_trace+0xf3/0x1e0 i40e_add_filter+0x127/0x2b0 [i40e] i40e_add_mac_filter+0x156/0x190 [i40e] i40e_addr_sync+0x2d/0x40 [i40e] __hw_addr_sync_dev+0x154/0x210 i40e_set_rx_mode+0x6d/0xf0 [i40e] __dev_set_rx_mode+0xfb/0x1f0 __dev_mc_add+0x6c/0x90 igmp6_group_added+0x214/0x230 __ipv6_dev_mc_inc+0x338/0x4f0 addrconf_join_solict.part.7+0xa2/0xd0 addrconf_dad_work+0x500/0x980 process_one_work+0x3f5/0x7d0 worker_thread+0x61/0x6c0 kthread+0x1c3/0x1f0 ret_from_fork+0x35/0x40 Freed by task 2547073: __kasan_slab_free+0x130/0x180 kfree+0x90/0x1b0 __i40e_del_filter+0xa3/0xf0 [i40e] i40e_del_mac_filter+0xf3/0x130 [i40e] i40e_addr_unsync+0x85/0xa0 [i40e] __hw_addr_sync_dev+0x9d/0x210 i40e_set_rx_mode+0x6d/0xf0 [i40e] __dev_set_rx_mode+0xfb/0x1f0 __dev_mc_del+0x69/0x80 igmp6_group_dropped+0x279/0x510 __ipv6_dev_mc_dec+0x174/0x220 addrconf_leave_solict.part.8+0xa2/0xd0 __ipv6_ifa_notify+0x4cd/0x570 ipv6_ifa_notify+0x58/0x80 ipv6_del_addr+0x259/0x4a0 inet6_addr_del+0x188/0x260 addrconf_del_ifaddr+0xcc/0x130 inet6_ioctl+0x152/0x190 sock_do_ioctl+0xd8/0x2b0 sock_ioctl+0x2e5/0x4c0 do_vfs_ioctl+0x14e/0xa80 ksys_ioctl+0x7c/0xa0 __x64_sys_ioctl+0x42/0x50 do_syscall_64+0x98/0x2c0 entry_SYSCALL_64_after_hwframe+0x65/0xca Fixes: 41c445ff0f48 ("i40e: main driver core") Signed-off-by: Di Zhu <zhudi2@huawei.com> Signed-off-by: Rui Zhang <zhangrui182@huawei.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-04i40e: Fix to not show opcode msg on unsuccessful VF MAC changeMateusz Palczewski
Hide i40e opcode information sent during response to VF in case when untrusted VF tried to change MAC on the VF interface. This is implemented by adding an additional parameter 'hide' to the response sent to VF function that hides the display of error information, but forwards the error code to VF. Previously it was not possible to send response with some error code to VF without displaying opcode information. Fixes: 5c3c48ac6bf5 ("i40e: implement virtual device interface") Signed-off-by: Grzegorz Szczurek <grzegorzx.szczurek@intel.com> Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com> Reviewed-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2022-01-04net: fixup build after bpf header changesJakub Kicinski
Recent bpf-next merge brought in header changes which uncovered includes missing in net-next which were not present in bpf-next. Build problems happen only on less-popular arches like hppa, sparc, alpha etc. I could repro the build problem with ice but not the mlx5 problem Abdul was reporting. mlx5 does look like it should include filter.h, anyway. Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com> Fixes: e63a02348958 ("Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next") Link: https://lore.kernel.org/all/7c03768d-d948-c935-a7ab-b1f963ac7eed@linux.vnet.ibm.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-31Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller
Alexei Starovoitov says: ==================== pull-request: bpf-next 2021-12-30 The following pull-request contains BPF updates for your *net-next* tree. We've added 72 non-merge commits during the last 20 day(s) which contain a total of 223 files changed, 3510 insertions(+), 1591 deletions(-). The main changes are: 1) Automatic setrlimit in libbpf when bpf is memcg's in the kernel, from Andrii. 2) Beautify and de-verbose verifier logs, from Christy. 3) Composable verifier types, from Hao. 4) bpf_strncmp helper, from Hou. 5) bpf.h header dependency cleanup, from Jakub. 6) get_func_[arg|ret|arg_cnt] helpers, from Jiri. 7) Sleepable local storage, from KP. 8) Extend kfunc with PTR_TO_CTX, PTR_TO_MEM argument support, from Kumar. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-30Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c commit 077cdda764c7 ("net/mlx5e: TC, Fix memory leak with rules with internal port") commit 31108d142f36 ("net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'") commit 4390c6edc0fb ("net/mlx5: Fix some error handling paths in 'mlx5e_tc_add_fdb_flow()'") https://lore.kernel.org/all/20211229065352.30178-1-saeed@kernel.org/ net/smc/smc_wr.c commit 49dc9013e34b ("net/smc: Use the bitmap API when applicable") commit 349d43127dac ("net/smc: fix kernel panic caused by race of smc_sock") bitmap_zero()/memset() is removed by the fix Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-30ice: Add flow director support for channel modeKiran Patil
Add support to enable flow-director filter when multiple TCs are configured. Flow director filter can be configured using ethtool (--config-ntuple option). When multiple TCs are configured, each TC is mapped to an unique HW VSI. So VSI corresponding to queue used in filter is identified and flow director context is updated with correct VSI while configuring ntuple filter in HW. Signed-off-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-29igb: support EXTTS on 82580/i354/i350Ruud Bos
Support for the PTP pin function on 82580/i354/i350 based adapters. Because the time registers of these adapters do not have the nice split in second rollovers as the i210 has, the implementation is slightly more complex compared to the i210 implementation. Signed-off-by: Ruud Bos <kernel.hbk@gmail.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-29igb: support PEROUT on 82580/i354/i350Ruud Bos
Support for the PEROUT PTP pin function on 82580/i354/i350 based adapters. Because the time registers of these adapters do not have the nice split in second rollovers as the i210 has, the implementation is slightly more complex compared to the i210 implementation. Signed-off-by: Ruud Bos <kernel.hbk@gmail.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-29igb: move PEROUT and EXTTS isr logic to separate functionsRuud Bos
Remove code duplication in the tsync interrupt handler function by moving this logic to separate functions. This keeps the interrupt handler readable and allows the new functions to be extended for adapter types other than i210. Signed-off-by: Ruud Bos <kernel.hbk@gmail.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-29igb: move SDP config initialization to separate functionRuud Bos
Allow reuse of SDP config struct initialization by moving it to a separate function. Signed-off-by: Ruud Bos <kernel.hbk@gmail.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-29net: Don't include filter.h from net/sock.hJakub Kicinski
sock.h is pretty heavily used (5k objects rebuilt on x86 after it's touched). We can drop the include of filter.h from it and add a forward declaration of struct sk_filter instead. This decreases the number of rebuilt objects when bpf.h is touched from ~5k to ~1k. There's a lot of missing includes this was masking. Primarily in networking tho, this time. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://lore.kernel.org/bpf/20211229004913.513372-1-kuba@kernel.org
2021-12-28igc: Fix TX timestamp support for non-MSI-X platformsJames McLaughlin
Time synchronization was not properly enabled on non-MSI-X platforms. Fixes: 2c344ae24501 ("igc: Add support for TX timestamping") Signed-off-by: James McLaughlin <james.mclaughlin@qsc.com> Reviewed-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-28igc: Do not enable crosstimestamping for i225-V modelsVinicius Costa Gomes
It was reported that when PCIe PTM is enabled, some lockups could be observed with some integrated i225-V models. While the issue is investigated, we can disable crosstimestamp for those models and see no loss of functionality, because those models don't have any support for time synchronization. Fixes: a90ec8483732 ("igc: Add support for PTP getcrosststamp()") Link: https://lore.kernel.org/all/924175a188159f4e03bd69908a91e606b574139b.camel@gmx.de/ Reported-by: Stefan Dietrich <roots@gmx.de> Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-28ixgbevf: switch to napi_build_skb()Alexander Lobakin
napi_build_skb() reuses per-cpu NAPI skbuff_head cache in order to save some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx. ixgbevf driver runs Tx completion polling cycle right before the Rx one and uses napi_consume_skb() to feed the cache with skbuff_heads of completed entries, so it's never empty and always warm at that moment. Switch to the napi_build_skb() to relax mm pressure on heavy Rx. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-28ixgbe: switch to napi_build_skb()Alexander Lobakin
napi_build_skb() reuses per-cpu NAPI skbuff_head cache in order to save some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx. ixgbe driver runs Tx completion polling cycle right before the Rx one and uses napi_consume_skb() to feed the cache with skbuff_heads of completed entries, so it's never empty and always warm at that moment. Switch to the napi_build_skb() to relax mm pressure on heavy Rx. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-28igc: switch to napi_build_skb()Alexander Lobakin
napi_build_skb() reuses per-cpu NAPI skbuff_head cache in order to save some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx. igc driver runs Tx completion polling cycle right before the Rx one and uses napi_consume_skb() to feed the cache with skbuff_heads of completed entries, so it's never empty and always warm at that moment. Switch to the napi_build_skb() to relax mm pressure on heavy Rx. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Nechama Kraus <nechamax.kraus@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-28igb: switch to napi_build_skb()Alexander Lobakin
napi_build_skb() reuses per-cpu NAPI skbuff_head cache in order to save some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx. igb driver runs Tx completion polling cycle right before the Rx one and uses napi_consume_skb() to feed the cache with skbuff_heads of completed entries, so it's never empty and always warm at that moment. Switch to the napi_build_skb() to relax mm pressure on heavy Rx. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-28ice: switch to napi_build_skb()Alexander Lobakin
napi_build_skb() reuses per-cpu NAPI skbuff_head cache in order to save some cycles on freeing/allocating skbuff_heads on every new Rx or completed Tx. ice driver runs Tx completion polling cycle right before the Rx one and uses napi_consume_skb() to feed the cache with skbuff_heads of completed entries, so it's never empty and always warm at that moment. Switch to the napi_build_skb() to relax mm pressure on heavy Rx. Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>