summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/intel
AgeCommit message (Collapse)Author
2021-02-01i40e: Revert "i40e: don't report link up for a VF who hasn't enabled queues"Aleksandr Loktionov
This reverts commit 2ad1274fa35ace5c6360762ba48d33b63da2396c VF queues were not brought up when PF was brought up after being downed if the VF driver disabled VFs queues during PF down. This could happen in some older or external VF driver implementations. The problem was that PF driver used vf->queues_enabled as a condition to decide what link-state it would send out which caused the issue. Remove the check for vf->queues_enabled in the VF link notify. Now VF will always be notified of the current link status. Also remove the queues_enabled member from i40e_vf structure as it is not used anymore. Otherwise VNF implementation was broken and caused a link flap. The original commit was a workaround to avoid breaking existing VFs though it's really a fault of the VF code not the PF. The commit should be safe to revert as all of the VFs we know of have been fixed. Also, since we now know there is a related bug in the workaround, removing it is preferred. Fixes: 2ad1274fa35a ("i40e: don't report link up for a VF who hasn't enabled") Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-01igc: check return value of ret_val in igc_config_fc_after_link_upKevin Lo
Check return value from ret_val to make error check actually work. Fixes: 4eb8080143a9 ("igc: Add setup link functionality") Signed-off-by: Kevin Lo <kevlo@kevlo.org> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-01igc: set the default return value to -IGC_ERR_NVM in igc_write_nvm_srwrKevin Lo
This patch sets the default return value to -IGC_ERR_NVM in igc_write_nvm_srwr. Without this change it wouldn't lead to a shadow RAM write EEWR timeout. Fixes: ab4056126813 ("igc: Add NVM support") Signed-off-by: Kevin Lo <kevlo@kevlo.org> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-02-01igc: Report speed and duplex as unknown when device is runtime suspendedKai-Heng Feng
Similar to commit 165ae7a8feb5 ("igb: Report speed and duplex as unknown when device is runtime suspended"), if we try to read speed and duplex sysfs while the device is runtime suspended, igc will complain and stops working: [ 123.449883] igc 0000:03:00.0 enp3s0: PCIe link lost, device now detached [ 123.450052] BUG: kernel NULL pointer dereference, address: 0000000000000008 [ 123.450056] #PF: supervisor read access in kernel mode [ 123.450058] #PF: error_code(0x0000) - not-present page [ 123.450059] PGD 0 P4D 0 [ 123.450064] Oops: 0000 [#1] SMP NOPTI [ 123.450068] CPU: 0 PID: 2525 Comm: udevadm Tainted: G U W OE 5.10.0-1002-oem #2+rkl2-Ubuntu [ 123.450078] RIP: 0010:igc_rd32+0x1c/0x90 [igc] [ 123.450080] Code: c0 5d c3 b8 fd ff ff ff c3 0f 1f 44 00 00 0f 1f 44 00 00 55 89 f0 48 89 e5 41 56 41 55 41 54 49 89 c4 53 48 8b 57 08 48 01 d0 <44> 8b 28 41 83 fd ff 74 0c 5b 44 89 e8 41 5c 41 5d 4 [ 123.450083] RSP: 0018:ffffb0d100d6fcc0 EFLAGS: 00010202 [ 123.450085] RAX: 0000000000000008 RBX: ffffb0d100d6fd30 RCX: 0000000000000000 [ 123.450087] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff945a12716c10 [ 123.450089] RBP: ffffb0d100d6fce0 R08: ffff945a12716550 R09: ffff945a09874000 [ 123.450090] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000008 [ 123.450092] R13: ffff945a12716000 R14: ffff945a037da280 R15: ffff945a037da290 [ 123.450094] FS: 00007f3b34c868c0(0000) GS:ffff945b89200000(0000) knlGS:0000000000000000 [ 123.450096] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 123.450098] CR2: 0000000000000008 CR3: 00000001144de006 CR4: 0000000000770ef0 [ 123.450100] PKRU: 55555554 [ 123.450101] Call Trace: [ 123.450111] igc_ethtool_get_link_ksettings+0xd6/0x1b0 [igc] [ 123.450118] __ethtool_get_link_ksettings+0x71/0xb0 [ 123.450123] duplex_show+0x74/0xc0 [ 123.450129] dev_attr_show+0x1d/0x40 [ 123.450134] sysfs_kf_seq_show+0xa1/0x100 [ 123.450137] kernfs_seq_show+0x27/0x30 [ 123.450142] seq_read+0xb7/0x400 [ 123.450148] ? common_file_perm+0x72/0x170 [ 123.450151] kernfs_fop_read+0x35/0x1b0 [ 123.450155] vfs_read+0xb5/0x1b0 [ 123.450157] ksys_read+0x67/0xe0 [ 123.450160] __x64_sys_read+0x1a/0x20 [ 123.450164] do_syscall_64+0x38/0x90 [ 123.450168] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 123.450170] RIP: 0033:0x7f3b351fe142 [ 123.450173] Code: c0 e9 c2 fe ff ff 50 48 8d 3d 3a ca 0a 00 e8 f5 19 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24 [ 123.450174] RSP: 002b:00007fffef2ec138 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 123.450177] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3b351fe142 [ 123.450179] RDX: 0000000000001001 RSI: 00005644c047f070 RDI: 0000000000000003 [ 123.450180] RBP: 00007fffef2ec340 R08: 00005644c047f070 R09: 00007f3b352d9320 [ 123.450182] R10: 00005644c047c010 R11: 0000000000000246 R12: 00005644c047cbf0 [ 123.450184] R13: 00005644c047e6d0 R14: 0000000000000003 R15: 00007fffef2ec140 [ 123.450189] Modules linked in: rfcomm ccm cmac algif_hash algif_skcipher af_alg bnep toshiba_acpi industrialio toshiba_haps hp_accel lis3lv02d btusb btrtl btbcm btintel bluetooth ecdh_generic ecc joydev input_leds nls_iso8859_1 snd_sof_pci snd_sof_intel_byt snd_sof_intel_ipc snd_sof_intel_hda_common snd_soc_hdac_hda snd_hda_codec_hdmi snd_sof_xtensa_dsp snd_sof_intel_hda snd_sof snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_intel snd_intel_dspcfg soundwire_intel soundwire_generic_allocation soundwire_cadence snd_hda_codec snd_hda_core ath10k_pci snd_hwdep intel_rapl_msr intel_rapl_common ath10k_core soundwire_bus snd_soc_core x86_pkg_temp_thermal ath intel_powerclamp snd_compress ac97_bus snd_pcm_dmaengine mac80211 snd_pcm coretemp snd_seq_midi snd_seq_midi_event snd_rawmidi kvm_intel cfg80211 snd_seq snd_seq_device snd_timer mei_hdcp kvm libarc4 snd crct10dif_pclmul ghash_clmulni_intel aesni_intel mei_me dell_wmi [ 123.450266] dell_smbios soundcore sparse_keymap dcdbas crypto_simd cryptd mei dell_uart_backlight glue_helper ee1004 wmi_bmof intel_wmi_thunderbolt dell_wmi_descriptor mac_hid efi_pstore acpi_pad acpi_tad intel_cstate sch_fq_codel parport_pc ppdev lp parport ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear dm_mirror dm_region_hash dm_log hid_generic usbhid hid i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec crc32_pclmul rc_core drm intel_lpss_pci i2c_i801 ahci igc intel_lpss i2c_smbus idma64 xhci_pci libahci virt_dma xhci_pci_renesas wmi video pinctrl_tigerlake [ 123.450335] CR2: 0000000000000008 [ 123.450338] ---[ end trace 9f731e38b53c35cc ]--- The more generic approach will be wrap get_link_ksettings() with begin() and complete() callbacks, and calls runtime resume and runtime suspend routine respectively. However, igc is like igb, runtime resume routine uses rtnl_lock() which upper ethtool layer also uses. So to prevent a deadlock on rtnl, take a different approach, use pm_runtime_suspended() to avoid reading register while device is runtime suspended. Fixes: 8c5ad0dae93c ("igc: Add ethtool support") Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Acked-by: Sasha Neftin <sasha.neftin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-01-26igc: fix link speed advertisingCorinna Vinschen
Link speed advertising in igc has two problems: - When setting the advertisement via ethtool, the link speed is converted to the legacy 32 bit representation for the intel PHY code. This inadvertently drops ETHTOOL_LINK_MODE_2500baseT_Full_BIT (being beyond bit 31). As a result, any call to `ethtool -s ...' drops the 2500Mbit/s link speed from the PHY settings. Only reloading the driver alleviates that problem. Fix this by converting the ETHTOOL_LINK_MODE_2500baseT_Full_BIT to the Intel PHY ADVERTISE_2500_FULL bit explicitly. - Rather than checking the actual PHY setting, the .get_link_ksettings function always fills link_modes.advertising with all link speeds the device is capable of. Fix this by checking the PHY autoneg_advertised settings and report only the actually advertised speeds up to ethtool. Fixes: 8c5ad0dae93c ("igc: Add ethtool support") Signed-off-by: Corinna Vinschen <vinschen@redhat.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-01-26i40e: acquire VSI pointer only after VF is initializedStefan Assmann
This change simplifies the VF initialization check and also minimizes the delay between acquiring the VSI pointer and using it. As known by the commit being fixed, there is a risk of the VSI pointer getting changed. Therefore minimize the delay between getting and using the pointer. Fixes: 9889707b06ac ("i40e: Fix crash caused by stress setting of VF MAC addresses") Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-01-26ice: Fix MSI-X vector fallback logicBrett Creeley
The current MSI-X enablement logic tries to enable best-case MSI-X vectors and if that fails we only support a bare-minimum set. This includes a single MSI-X for 1 Tx and 1 Rx queue and a single MSI-X for the OICR interrupt. Unfortunately, the driver fails to load when we don't get as many MSI-X as requested for a couple reasons. First, the code to allocate MSI-X in the driver tries to allocate num_online_cpus() MSI-X for LAN traffic without caring about the number of MSI-X actually enabled/requested from the kernel for LAN traffic. So, when calling ice_get_res() for the PF VSI, it returns failure because the number of available vectors is less than requested. Fix this by not allowing the PF VSI to allocation more than pf->num_lan_msix MSI-X vectors and pf->num_lan_msix Rx/Tx queues. Limiting the number of queues is done because we don't want more than 1 Tx/Rx queue per interrupt due to performance conerns. Second, the driver assigns pf->num_lan_msix = 2, to account for LAN traffic and the OICR. However, pf->num_lan_msix is only meant for LAN MSI-X. This is causing a failure when the PF VSI tries to allocate/reserve the minimum pf->num_lan_msix because the OICR MSI-X has already been reserved, so there may not be enough MSI-X vectors left. Fix this by setting pf->num_lan_msix = 1 for the failure case. Then the ICE_MIN_MSIX accounts for the LAN MSI-X and the OICR MSI-X needed for the failure case. Update the related defines used in ice_ena_msix_range() to align with the above behavior and remove the unused RDMA defines because RDMA is currently not supported. Also, remove the now incorrect comment. Fixes: 152b978a1f90 ("ice: Rework ice_ena_msix_range") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-01-26ice: Don't allow more channels than LAN MSI-X availableBrett Creeley
Currently users could create more channels than LAN MSI-X available. This is happening because there is no check against pf->num_lan_msix when checking the max allowed channels and will cause performance issues if multiple Tx and Rx queues are tied to a single MSI-X. Fix this by not allowing more channels than LAN MSI-X available in pf->num_lan_msix. Fixes: 87324e747fde ("ice: Implement ethtool ops for channels") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-01-26ice: update dev_addr in ice_set_mac_address even if HW filter existsNick Nunley
Fix the driver to copy the MAC address configured in ndo_set_mac_address into dev_addr, even if the MAC filter already exists in HW. In some situations (e.g. bonding) the netdev's dev_addr could have been modified outside of the driver, with no change to the HW filter, so the driver cannot assume that they match. Fixes: 757976ab16be ("ice: Fix check for removing/adding mac filters") Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-01-26ice: Implement flow for IPv6 next header (extension header)Nick Nunley
This patch is based on a similar change to i40e by Slawomir Laba: "i40e: Implement flow for IPv6 next header (extension header)". When a packet contains an IPv6 header with next header which is an extension header and not a protocol one, the kernel function skb_transport_header called with such sk_buff will return a pointer to the extension header and not to the TCP one. The above explained call caused a problem with packet processing for skb with encapsulation for tunnel with ICE_TX_CTX_EIPT_IPV6. The extension header was not skipped at all. The ipv6_skip_exthdr function does check if next header of the IPV6 header is an extension header and doesn't modify the l4_proto pointer if it points to a protocol header value so its safe to omit the comparison of exthdr and l4.hdr pointers. The ipv6_skip_exthdr can return value -1. This means that the skipping process failed and there is something wrong with the packet so it will be dropped. Fixes: a4e82a81f573 ("ice: Add support for tunnel offloads") Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-01-26ice: fix FDir IPv6 flexbyteHenry Tieman
The packet classifier would occasionally misrecognize an IPv6 training packet when the next protocol field was 0. The correct value for unspecified protocol is IPPROTO_NONE. Fixes: 165d80d6adab ("ice: Support IPv6 Flow Director filters") Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-01-13i40e: fix potential NULL pointer dereferencingCristian Dumitrescu
Currently, the function i40e_construct_skb_zc only frees the input xdp buffer when the output skb is successfully built. On error, the function i40e_clean_rx_irq_zc does not commit anything for the current packet descriptor and simply exits the packet descriptor processing loop, with the plan to restart the processing of this descriptor on the next invocation. Therefore, on error the ring next-to-clean pointer should not advance, the xdp i.e. *bi buffer should not be freed and the current buffer info should not be invalidated by setting *bi to NULL. Therefore, the *bi should only be set to NULL when the function i40e_construct_skb_zc is successful, otherwise a NULL *bi will be dereferenced when the work for the current descriptor is eventually restarted. Fixes: 3b4f0b66c2b3 ("i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL") Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/r/20210111181138.49757-1-cristian.dumitrescu@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-23e1000e: Export S0ix flags to ethtoolMario Limonciello
This flag can be used by an end user to disable S0ix flows on a buggy system or by an OEM for development purposes. If you need this flag to be persisted across reboots, it's suggested to use a udev rule to call adjust it until the kernel could have your configuration in a disallow list. Signed-off-by: Mario Limonciello <mario.limonciello@dell.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Tested-by: Yijun Shen <Yijun.shen@dell.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-23Revert "e1000e: disable s0ix entry and exit flows for ME systems"Mario Limonciello
commit e086ba2fccda ("e1000e: disable s0ix entry and exit flows for ME systems") disabled s0ix flows for systems that have various incarnations of the i219-LM ethernet controller. This changed caused power consumption regressions on the following shipping Dell Comet Lake based laptops: * Latitude 5310 * Latitude 5410 * Latitude 5410 * Latitude 5510 * Precision 3550 * Latitude 5411 * Latitude 5511 * Precision 3551 * Precision 7550 * Precision 7750 This commit was introduced because of some regressions on certain Thinkpad laptops. This comment was potentially caused by an earlier commit 632fbd5eb5b0e ("e1000e: fix S0ix flows for cable connected case"). or it was possibly caused by a system not meeting platform architectural requirements for low power consumption. Other changes made in the driver with extended timeouts are expected to make the driver more impervious to platform firmware behavior. Fixes: e086ba2fccda ("e1000e: disable s0ix entry and exit flows for ME systems") Reviewed-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Mario Limonciello <mario.limonciello@dell.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Tested-by: Yijun Shen <Yijun.shen@dell.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-23e1000e: bump up timeout to wait when ME un-configures ULP modeMario Limonciello
Per guidance from Intel ethernet architecture team, it may take up to 1 second for unconfiguring ULP mode. However in practice this seems to be taking up to 2 seconds on some Lenovo machines. Detect scenarios that take more than 1 second but less than 2.5 seconds and emit a warning on resume for those scenarios. Suggested-by: Aaron Ma <aaron.ma@canonical.com> Suggested-by: Sasha Netfin <sasha.neftin@intel.com> Suggested-by: Hans de Goede <hdegoede@redhat.com> CC: Mark Pearson <markpearson@lenovo.com> Fixes: f15bb6dde738cc8fa0 ("e1000e: Add support for S0ix") BugLink: https://bugs.launchpad.net/bugs/1865570 Link: https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20200323191639.48826-1-aaron.ma@canonical.com/ Link: https://lkml.org/lkml/2020/12/13/15 Link: https://lkml.org/lkml/2020/12/14/708 Signed-off-by: Mario Limonciello <mario.limonciello@dell.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Tested-by: Yijun Shen <Yijun.shen@dell.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-23e1000e: Only run S0ix flows if shutdown succeededMario Limonciello
If the shutdown failed, the part will be thawed and running S0ix flows will put it into an undefined state. Reported-by: Alexander Duyck <alexander.duyck@gmail.com> Reviewed-by: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: Mario Limonciello <mario.limonciello@dell.com> Tested-by: Yijun Shen <Yijun.shen@dell.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-17iavf: fix double-release of rtnl_lockJakub Kicinski
This code does not jump to exit on an error in iavf_lan_add_device(), so the rtnl_unlock() from the normal path will follow. Fixes: b66c7bc1cd4d ("iavf: Refactor init state machine") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-17i40e: Fix Error I40E_AQ_RC_EINVAL when removing VFsSylwester Dziedziuch
When removing VFs for PF added to bridge there was an error I40E_AQ_RC_EINVAL. It was caused by not properly resetting and reinitializing PF when adding/removing VFs. Changed how reset is performed when adding/removing VFs to properly reinitialize PFs VSI. Fixes: fc60861e9b00 ("i40e: start up in VEPA mode by default") Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-16i40e, xsk: clear the status bits for the next_to_use descriptorBjörn Töpel
On the Rx side, the next_to_use index points to the next item in the HW ring to be refilled/allocated, and next_to_clean points to the next item to potentially be processed. When the HW Rx ring is fully refilled, i.e. no packets has been processed, the next_to_use will be next_to_clean - 1. When the ring is fully processed next_to_clean will be equal to next_to_use. The latter case is where a bug is triggered. If the next_to_use bits are not cleared, and the "fully processed" state is entered, a stale descriptor can be processed. The skb-path correctly clear the status bit for the next_to_use descriptor, but the AF_XDP zero-copy path did not do that. This change adds the status bits clearing of the next_to_use descriptor. Fixes: 3b4f0b66c2b3 ("i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-16ice, xsk: clear the status bits for the next_to_use descriptorBjörn Töpel
On the Rx side, the next_to_use index points to the next item in the HW ring to be refilled/allocated, and next_to_clean points to the next item to potentially be processed. When the HW Rx ring is fully refilled, i.e. no packets has been processed, the next_to_use will be next_to_clean - 1. When the ring is fully processed next_to_clean will be equal to next_to_use. The latter case is where a bug is triggered. If the next_to_use bits are not cleared, and the "fully processed" state is entered, a stale descriptor can be processed. The skb-path correctly clear the status bit for the next_to_use descriptor, but the AF_XDP zero-copy path did not do that. This change adds the status bits clearing of the next_to_use descriptor. Fixes: 2d4238f55697 ("ice: Add support for AF_XDP") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-14ice, xsk: Move Rx allocation out of while-loopBjörn Töpel
Instead doing the check for allocation in each loop, move it outside the while loop and do it every NAPI loop. This change boosts the xdpsock rxdrop scenario with 15% more packets-per-second. Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/r/20201211085410.59350-1-bjorn.topel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
xdp_return_frame_bulk() needs to pass a xdp_buff to __xdp_return(). strlcpy got converted to strscpy but here it makes no functional difference, so just keep the right code. Conflicts: net/netfilter/nf_tables_api.c Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-10igc: Add new device IDSasha Neftin
Add new device ID for the next step of the silicon and reflect the I226_K part. Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09Merge branch '1GbE' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2020-12-09 This series contains updates to igb, ixgbe, i40e, and ice drivers. Sven Auhagen fixes issues with igb XDP: return correct error value in XDP xmit back, increase header padding to include space for double VLAN, add an extack error when Rx buffer is too small for frame size, set metasize if it is set in xdp, change xdp_do_flush_map to xdp_do_flush, and update trans_start to avoid possible Tx timeout. Björn fixes an issue where an Rx buffer can be reused prematurely with XDP redirect for ixgbe, i40e, and ice drivers. The following are changes since commit 323a391a220c4a234cb1e678689d7f4c3b73f863: can: isotp: isotp_setsockopt(): block setsockopt on bound sockets and are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue 1GbE ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-12-09e1000e: fix S0ix flow to allow S0i3.2 subset entryVitaly Lifshits
Changed a configuration in the flows to align with architecture requirements to achieve S0i3.2 substate. This helps both i219V and i219LM configurations. Also fixed a typo in the previous commit 632fbd5eb5b0 ("e1000e: fix S0ix flows for cable connected case"). Fixes: 632fbd5eb5b0 ("e1000e: fix S0ix flows for cable connected case"). Signed-off-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: Mario Limonciello <mario.limonciello@dell.com> Link: https://lore.kernel.org/r/20201208185632.151052-1-mario.limonciello@dell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-09ice: avoid premature Rx buffer reuseBjörn Töpel
The page recycle code, incorrectly, relied on that a page fragment could not be freed inside xdp_do_redirect(). This assumption leads to that page fragments that are used by the stack/XDP redirect can be reused and overwritten. To avoid this, store the page count prior invoking xdp_do_redirect(). Fixes: efc2214b6047 ("ice: Add support for XDP") Reported-and-analyzed-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ixgbe: avoid premature Rx buffer reuseBjörn Töpel
The page recycle code, incorrectly, relied on that a page fragment could not be freed inside xdp_do_redirect(). This assumption leads to that page fragments that are used by the stack/XDP redirect can be reused and overwritten. To avoid this, store the page count prior invoking xdp_do_redirect(). Fixes: 6453073987ba ("ixgbe: add initial support for xdp redirect") Reported-and-analyzed-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09i40e: avoid premature Rx buffer reuseBjörn Töpel
The page recycle code, incorrectly, relied on that a page fragment could not be freed inside xdp_do_redirect(). This assumption leads to that page fragments that are used by the stack/XDP redirect can be reused and overwritten. To avoid this, store the page count prior invoking xdp_do_redirect(). Longer explanation: Intel NICs have a recycle mechanism. The main idea is that a page is split into two parts. One part is owned by the driver, one part might be owned by someone else, such as the stack. t0: Page is allocated, and put on the Rx ring +--------------- used by NIC ->| upper buffer (rx_buffer) +--------------- | lower buffer +--------------- page count == USHRT_MAX rx_buffer->pagecnt_bias == USHRT_MAX t1: Buffer is received, and passed to the stack (e.g.) +--------------- | upper buff (skb) +--------------- used by NIC ->| lower buffer (rx_buffer) +--------------- page count == USHRT_MAX rx_buffer->pagecnt_bias == USHRT_MAX - 1 t2: Buffer is received, and redirected +--------------- | upper buff (skb) +--------------- used by NIC ->| lower buffer (rx_buffer) +--------------- Now, prior calling xdp_do_redirect(): page count == USHRT_MAX rx_buffer->pagecnt_bias == USHRT_MAX - 2 This means that buffer *cannot* be flipped/reused, because the skb is still using it. The problem arises when xdp_do_redirect() actually frees the segment. Then we get: page count == USHRT_MAX - 1 rx_buffer->pagecnt_bias == USHRT_MAX - 2 From a recycle perspective, the buffer can be flipped and reused, which means that the skb data area is passed to the Rx HW ring! To work around this, the page count is stored prior calling xdp_do_redirect(). Note that this is not optimal, since the NIC could actually reuse the "lower buffer" again. However, then we need to track whether XDP_REDIRECT consumed the buffer or not. Fixes: d9314c474d4f ("i40e: add support for XDP_REDIRECT") Reported-and-analyzed-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09igb: avoid transmit queue timeout in xdp pathSven Auhagen
Since we share the transmit queue with the network stack, it is possible that we run into a transmit queue timeout. This will reset the queue. This happens under high load when XDP is using the transmit queue pretty much exclusively. netdev_start_xmit() sets the trans_start variable of the transmit queue to jiffies which is later utilized by dev_watchdog(), so to avoid timeout, let stack know that XDP xmit happened by bumping the trans_start within XDP Tx routines to jiffies. Fixes: 9cbc948b5a20 ("igb: add XDP support") Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09igb: use xdp_do_flushSven Auhagen
Since it is a new XDP implementation change xdp_do_flush_map to xdp_do_flush. Fixes: 9cbc948b5a20 ("igb: add XDP support") Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09igb: skb add metasize for xdpSven Auhagen
add metasize if it is set in xdp Fixes: 9cbc948b5a20 ("igb: add XDP support") Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09igb: XDP extack message on errorSven Auhagen
Add an extack error message when the RX buffer size is too small for the frame size. Fixes: 9cbc948b5a20 ("igb: add XDP support") Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09igb: take VLAN double header into accountSven Auhagen
Increase the packet header padding to include double VLAN tagging. This patch uses a macro for this. Fixes: 9cbc948b5a20 ("igb: add XDP support") Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09igb: XDP xmit back fix error codeSven Auhagen
The igb XDP xmit back function should only return defined error codes. Fixes: 9cbc948b5a20 ("igb: add XDP support") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ice: Add space to unknown speedSimon Perron Caissy
Add space to the end of 'Unknown' string in order to avoid concatenation with 'bps' string when formatting netdev log message. Signed-off-by: Simon Perron Caissy <simon.perron.caissy@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ice: join format strings to same line as ice_debugJacob Keller
When printing messages with ice_debug, align the printed string to the origin line of the message in order to ease debugging and tracking messages back to their source. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ice: silence static analysis warningBruce Allan
sparse warns about cast to/from restricted types which is not an actual problem; silence the warning. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ice: cleanup misleading commentBruce Allan
The maximum Admin Queue buffer size and NVM shadow RAM sector size are both 4 Kilobytes. Some comments refer to those as 4Kb which can be confused with 4 Kilobits. Update the comments to use the commonly used KB symbol instead. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ice: Remove vlan_ena from vsi structureNick Nunley
vlan_ena was introduced to track whether VLAN filters are enabled on the device, but 1) checking for num_vlan > 1 already gives us this information, and is currently used in this way throughout the code 2) the logic for vlan_ena is broken when multiple VLANs are active Just remove vlan_ena and use num_vlan instead. Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ice: Remove gate to OROM initJeb Cramer
Remove the gate that prevents the OROM and netlist info from being populated. The NVM now has the appropriate section for software to reference the versioning info. Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ice: Enable Support for FW Override (E82X)Jeb Cramer
The driver is able to override the firmware when it comes to supporting a more lenient link mode. This feature was limited to E810 devices. It is now extended to E82X devices. Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ice: don't always return an error for Get PHY Abilities AQ commandPaul M Stillwell Jr
There are times when the driver shouldn't return an error when the Get PHY abilities AQ command (0x0600) returns an error. Instead the driver should log that the error occurred and continue on. This allows the driver to load even though the AQ command failed. The user can then later determine the reason for the failure and correct it. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-09ice: cleanup stack hogBruce Allan
In ice_flow_add_prof_sync(), struct ice_flow_prof_params has recently grown in size hogging stack space when allocated there. Hogging stack space should be avoided. Change allocation to be on the heap when needed. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Harikumar Bokkena <harikumarx.bokkena@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-12-04Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski
Alexei Starovoitov says: ==================== pull-request: bpf-next 2020-12-03 The main changes are: 1) Support BTF in kernel modules, from Andrii. 2) Introduce preferred busy-polling, from Björn. 3) bpf_ima_inode_hash() and bpf_bprm_opts_set() helpers, from KP Singh. 4) Memcg-based memory accounting for bpf objects, from Roman. 5) Allow bpf_{s,g}etsockopt from cgroup bind{4,6} hooks, from Stanislav. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (118 commits) selftests/bpf: Fix invalid use of strncat in test_sockmap libbpf: Use memcpy instead of strncpy to please GCC selftests/bpf: Add fentry/fexit/fmod_ret selftest for kernel module selftests/bpf: Add tp_btf CO-RE reloc test for modules libbpf: Support attachment of BPF tracing programs to kernel modules libbpf: Factor out low-level BPF program loading helper bpf: Allow to specify kernel module BTFs when attaching BPF programs bpf: Remove hard-coded btf_vmlinux assumption from BPF verifier selftests/bpf: Add CO-RE relocs selftest relying on kernel module BTF selftests/bpf: Add support for marking sub-tests as skipped selftests/bpf: Add bpf_testmod kernel module for testing libbpf: Add kernel module BTF support for CO-RE relocations libbpf: Refactor CO-RE relocs to not assume a single BTF object libbpf: Add internal helper to load BTF data by FD bpf: Keep module's btf_data_size intact after load bpf: Fix bpf_put_raw_tracepoint()'s use of __module_address() selftests/bpf: Add Userspace tests for TCP_WINDOW_CLAMP bpf: Adds support for setting window clamp samples/bpf: Fix spelling mistake "recieving" -> "receiving" bpf: Fix cold build of test_progs-no_alu32 ... ==================== Link: https://lore.kernel.org/r/20201204021936.85653-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-12-01xsk: Propagate napi_id to XDP socket Rx pathBjörn Töpel
Add napi_id to the xdp_rxq_info structure, and make sure the XDP socket pick up the napi_id in the Rx path. The napi_id is used to find the corresponding NAPI structure for socket busy polling. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/bpf/20201130185205.196029-7-bjorn.topel@gmail.com
2020-11-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Trivial conflict in CAN, keep the net-next + the byteswap wrapper. Conflicts: drivers/net/can/usb/gs_usb.c Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-11-24igbvf: Refactor tracesKaren Sornek
Refactoring "PF still resetting" and changing "Failed to add vlan id" to "Vlan id is not added" messages because previous version looked like a bug - it informed about changes that worked as designed but might confuse users Signed-off-by: Karen Sornek <karen.sornek@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-11-24i40e: report correct VF link speed when link state is set to enableStefan Assmann
When the virtual link state was set to "enable" ethtool would report link speed as 40000Mb/s regardless of the underlying device. Report the correct link speed. Example from a XXV710 NIC. Before: $ ip link set ens3f0 vf 0 state auto $ ethtool enp8s2 | grep Speed Speed: 25000Mb/s $ ip link set ens3f0 vf 0 state enable $ ethtool enp8s2 | grep Speed Speed: 40000Mb/s After: $ ip link set ens3f0 vf 0 state auto $ ethtool enp8s2 | grep Speed Speed: 25000Mb/s $ ip link set ens3f0 vf 0 state enable $ ethtool enp8s2 | grep Speed Speed: 25000Mb/s Signed-off-by: Stefan Assmann <sassmann@kpanic.de> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-11-24i40e: remove redundant assignmentMarek Majtyka
Remove a redundant assignment of the software ring pointer in the i40e driver. The variable is assigned twice with no use in between, so just get rid of the first occurrence. Fixes: 3b4f0b66c2b3 ("i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL") Signed-off-by: Marek Majtyka <marekx.majtyka@intel.com> Acked-by: Björn Töpel <bjorn.topel@intel.com> Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-11-23net: don't include ethtool.h from netdevice.hJakub Kicinski
linux/netdevice.h is included in very many places, touching any of its dependecies causes large incremental builds. Drop the linux/ethtool.h include, linux/netdevice.h just needs a forward declaration of struct ethtool_ops. Fix all the places which made use of this implicit include. Acked-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Shannon Nelson <snelson@pensando.io> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Link: https://lore.kernel.org/r/20201120225052.1427503-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>