path: root/drivers/net/ethernet/intel/ice/ice_base.c
Age  Commit message  Author
2021-12-17  ice: xsk: allocate separate memory for XDP SW ring  Maciej Fijalkowski
Currently, the zero-copy data path is reusing the memory region that was initially allocated for an array of struct ice_rx_buf for its own purposes. This is error prone as it is based on the ice_rx_buf struct always being the same size or bigger than what the zero-copy path needs. There can also be old values present in that array giving rise to errors when the zero-copy path uses it. Fix this by freeing the ice_rx_buf region and allocating a new array for the zero-copy path that has the right length and is initialized to zero. Fixes: 57f7f8b6bc0b ("ice: Use xdp_buf instead of rx_buf for xsk zero-copy") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
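The allocation change described above can be sketched in user-space C. This is only an illustration under stated assumptions: `struct rx_buf`, `struct xdp_slot`, `struct ring` and `ring_switch_to_zc()` are hypothetical names, not the driver's types or functions, and `calloc()` stands in for the kernel allocator.

```c
#include <stdlib.h>

/* Hypothetical element types; the real driver structs differ. */
struct rx_buf { void *page; unsigned int offset; };
struct xdp_slot { void *xdp; };

struct ring {
	unsigned int count;       /* number of descriptors */
	struct rx_buf *rx_buf;    /* array used by the regular data path */
	struct xdp_slot *xdp_buf; /* array used by the zero-copy data path */
};

/* Give the zero-copy path its own zero-initialized array, sized for its
 * own element type, and release the regular array instead of reusing it.
 * The new array is allocated first so a failure leaves the ring untouched.
 * Returns 0 on success, -1 on allocation failure.
 */
static int ring_switch_to_zc(struct ring *r)
{
	struct xdp_slot *slots = calloc(r->count, sizeof(*slots));

	if (!slots)
		return -1;

	free(r->rx_buf);
	r->rx_buf = NULL;
	r->xdp_buf = slots;
	return 0;
}
```

Because the new array is sized from its own element type and zeroed at allocation, neither a size mismatch between the two structs nor stale contents can leak into the zero-copy path.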
2021-11-03  ice: Fix not stopping Tx queues for VFs  Brett Creeley
When a VF is removed and/or reset its Tx queues need to be stopped from the PF. This is done by calling the ice_dis_vf_qs() function, which calls ice_vsi_stop_lan_tx_rings(). Currently ice_dis_vf_qs() is protected by the VF state bit ICE_VF_STATE_QS_ENA. Unfortunately, this is causing the Tx queues to not be disabled in some cases and when the VF tries to re-enable/reconfigure its Tx queues over virtchnl the op is failing. This is because a VF can be reset and/or removed before the ICE_VF_STATE_QS_ENA bit is set, but the Tx queues were already configured via ice_vsi_cfg_single_txq() in the VIRTCHNL_OP_CONFIG_VSI_QUEUES op. However, the ICE_VF_STATE_QS_ENA bit is set on a successful VIRTCHNL_OP_ENABLE_QUEUES, which will always happen after the VIRTCHNL_OP_CONFIG_VSI_QUEUES op.

This was causing the following error message when loading the ice driver, creating VFs, and modifying VF trust in an endless loop:

[35274.192484] ice 0000:88:00.0: Failed to set LAN Tx queue context, error: ICE_ERR_PARAM
[35274.193074] ice 0000:88:00.0: VF 0 failed opcode 6, retval: -5
[35274.193640] iavf 0000:88:01.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6

Fix this by always calling ice_dis_vf_qs() and silencing the error message in ice_vsi_stop_tx_ring() since the calling code ignores the return anyway. Also, all other places that call ice_vsi_stop_tx_ring() catch the error, so this doesn't affect those flows since there was no change to the values the function returns.

Other solutions were considered (i.e. tracking which VF queues had been "started/configured" in VIRTCHNL_OP_CONFIG_VSI_QUEUES), but it seemed more complicated than it was worth. This solution also brings in the chance for other unexpected conditions due to invalid state bit checks. So, the proposed solution seemed like the best option since there is no harm in failing to stop Tx queues that were never started.
This issue can be seen using the following commands:

for i in {0..50}; do
        rmmod ice
        modprobe ice
        sleep 1
        echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
        echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
        ip link set ens785f1 vf 0 trust on
        ip link set ens785f0 vf 0 trust on
        sleep 2
        echo 0 > /sys/class/net/ens785f0/device/sriov_numvfs
        echo 0 > /sys/class/net/ens785f1/device/sriov_numvfs
        sleep 1
        echo 1 > /sys/class/net/ens785f0/device/sriov_numvfs
        echo 1 > /sys/class/net/ens785f1/device/sriov_numvfs
        ip link set ens785f1 vf 0 trust on
        ip link set ens785f0 vf 0 trust on
done

Fixes: 77ca27c41705 ("ice: add support for virtchnl_queue_select.[tx|rx]_queues bitmap") Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-20  ice: Add infrastructure for mqprio support via ndo_setup_tc  Kiran Patil
Add infrastructure required for "ndo_setup_tc:qdisc_mqprio". ice_vsi_setup is modified to configure traffic classes based on mqprio data received from the stack. This includes low-level functions to configure min, max rate-limit parameters in hardware for traffic classes. Each traffic class gets mapped to a hardware channel (VSI) which can be individually configured with different bandwidth parameters. Co-developed-by: Tarun Singh <tarun.k.singh@intel.com> Signed-off-by: Tarun Singh <tarun.k.singh@intel.com> Signed-off-by: Kiran Patil <kiran.patil@intel.com> Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-15  ice: split ice_ring onto Tx/Rx separate structs  Maciej Fijalkowski
While it was convenient to have a generic ring structure that served both Tx and Rx sides, next commits are going to introduce several Tx-specific fields, so in order to avoid hurting the Rx side, let's pull out the Tx ring onto new ice_tx_ring and ice_rx_ring structs. Rx ring could be handled by the old ice_ring which would reduce the code churn within this patch, but this would make things asymmetric. Make the union out of the ring container within ice_q_vector so that it is possible to iterate over newly introduced ice_tx_ring. Remove the @size as it's only accessed from control path and it can be calculated pretty easily. Change definitions of ice_update_ring_stats and ice_fetch_u64_stats_per_ring so that they are ring agnostic and can be used for both Rx and Tx rings. Sizes of Rx and Tx ring structs are 256 and 192 bytes, respectively. In Rx ring xdp_rxq_info occupies its own cacheline, so it's the major difference now. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-15  ice: move ice_container_type onto ice_ring_container  Maciej Fijalkowski
Currently ice_container_type is scoped only for ice_ethtool.c. Next commit that will split the ice_ring struct onto Rx/Tx specific ring structs is going to also modify the type of linked list of rings that is within ice_ring_container. Therefore, the functions that are taking the ice_ring_container as an input argument will need to be aware of a ring type that will be looked up. Embed ice_container_type within ice_ring_container and initialize it properly when allocating the q_vectors. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-10-07  ice: introduce new type of VSI for switchdev  Grzegorz Nitka
A new type of VSI has to be defined for the switchdev control plane VSI. The number of allocated Tx and Rx queues has to be equal to the number of VFs, because each port representor should have one Tx and one Rx queue. Also, to avoid increasing the number of used irqs too much, the control plane VSI uses only one q_vector and handles all queues in one irq. To allow handling all queues in one irq, a new function to clean msix for eswitch was introduced. This function schedules napi for each representor instead of scheduling it for only one, as the normal clean irq function does. Only one additional msix has to be requested. Always try to request it in the ice_ena_msix_range function. Signed-off-by: Grzegorz Nitka <grzegorz.nitka@intel.com> Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-06-11  ice: enable transmit timestamps for E810 devices  Jacob Keller
Add support for enabling Tx timestamp requests for outgoing packets on E810 devices. The ice hardware can support multiple outstanding Tx timestamp requests. When sending a descriptor to hardware, a Tx timestamp request is made by setting a request bit, and assigning an index that represents which Tx timestamp index to store the timestamp in. Hardware makes no effort to synchronize the index use, so it is up to software to ensure that Tx timestamp indexes are not re-used before the timestamp is reported back. To do this, introduce a Tx timestamp tracker which will keep track of currently in-use indexes. In the hot path, if a packet has a timestamp request, an index will be requested from the tracker. Unfortunately, this does require a lock as the indexes are shared across all queues on a PHY. There are not enough indexes to reliably assign only 1 to each queue. For the E810 devices, the timestamp indexes are not shared across PHYs, so each port can have its own tracking. Once hardware captures a timestamp, an interrupt is fired. In this interrupt, trigger a new work item that will figure out which timestamp was completed, and report the timestamp back to the stack. This function loops through the Tx timestamp indexes and checks whether there is now a valid timestamp. If so, it clears the PHY timestamp indication in the PHY memory, locks and removes the SKB and bit in the tracker, then reports the timestamp to the stack. It is possible in some cases that a timestamp request will be initiated but never completed. This might occur if the packet is dropped by software or hardware before it reaches the PHY. Add a task to the periodic work function that will check whether a timestamp request is more than a few seconds old. If so, the timestamp index is cleared in the PHY, and the SKB is released. Just as with Rx timestamps, the Tx timestamps are only 40 bits wide, and use the same overall logic for extending to 64 bits of nanoseconds. 
With this change, E810 devices should be able to perform basic PTP functionality. Future changes will extend the support to cover the E822-based devices. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
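The index bookkeeping that the Tx timestamp tracker performs can be sketched as a small bitmap. This is a simplified illustration, not the driver's implementation: `struct ts_tracker`, `ts_tracker_get()`, `ts_tracker_put()` and the count of 64 indexes are all assumptions, and the lock that the commit message says is required (the indexes are shared across all queues on a PHY) is deliberately left out.

```c
#include <stdint.h>

#define TS_IDX_COUNT 64 /* assumed number of indexes, for illustration only */

struct ts_tracker {
	uint64_t in_use; /* bit i set => index i has an outstanding request */
};

/* Claim the lowest free index, or return -1 if every index is in use.
 * Real code would hold a lock around this, as the message explains.
 */
static int ts_tracker_get(struct ts_tracker *t)
{
	for (int i = 0; i < TS_IDX_COUNT; i++) {
		uint64_t bit = UINT64_C(1) << i;

		if (!(t->in_use & bit)) {
			t->in_use |= bit;
			return i;
		}
	}
	return -1;
}

/* Release an index once its timestamp was reported or the request timed out. */
static void ts_tracker_put(struct ts_tracker *t, int idx)
{
	t->in_use &= ~(UINT64_C(1) << idx);
}
```

An index is only handed out again after it has been released, which is the property the hardware does not enforce on its own.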
2021-06-11  ice: enable receive hardware timestamping  Jacob Keller
Add SIOCGHWTSTAMP and SIOCSHWTSTAMP ioctl handlers to respond to requests to enable timestamping support. If the request is for enabling Rx timestamps, set a bit in the Rx descriptors to indicate that receive timestamps should be reported. Hardware captures receive timestamps in the PHY which only captures part of the timer, and reports only 40 bits into the Rx descriptor. The upper 32 bits represent the contents of GLTSYN_TIME_L at the point of packet reception, while the lower 8 bits represent the upper 8 bits of GLTSYN_TIME_0. The networking and PTP stack expect 64 bit timestamps in nanoseconds. To support this, implement some logic to extend the timestamps by using the full PHC time.

If the Rx timestamp was captured prior to the PHC time, then the real timestamp is

  PHC - (lower_32_bits(PHC) - timestamp)

If the Rx timestamp was captured after the PHC time, then the real timestamp is

  PHC + (timestamp - lower_32_bits(PHC))

These calculations are correct as long as neither the PHC timestamp nor the Rx timestamps are more than 2^32-1 nanoseconds old. Further, we can detect when the Rx timestamp is before or after the PHC as long as the PHC timestamp is no more than 2^31-1 nanoseconds old. In that case, we calculate the delta between the lower 32 bits of the PHC and the Rx timestamp. If it's larger than 2^31-1 then the Rx timestamp must have been captured in the past. If it's smaller, then the Rx timestamp must have been captured after PHC time. Add an ice_ptp_extend_32b_ts function that relies on a cached copy of the PHC time and implements this algorithm to calculate the proper upper 32 bits of the Rx timestamps. Cache the PHC time periodically in all of the Rx rings. This enables each Rx ring to simply call the extension function with a recent copy of the PHC time. By ensuring that the PHC time is kept up to date periodically, we ensure this algorithm doesn't use stale data and produce incorrect results. 
To cache the time, introduce a kworker and a kwork item to periodically store the Rx time. It might seem like we should use the .do_aux_work interface of the PTP clock. This doesn't work because all PFs must cache this time, but only one PF owns the PTP clock device. Thus, the ice driver will manage its own kthread instead of relying on the PTP do_aux_work handler. With this change, the driver can now report Rx timestamps on all incoming packets. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
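The 32-bit to 64-bit extension described in this commit message can be written as a short user-space function. This is a sketch of the algorithm as the message states it, assuming a hypothetical name `extend_32b_ts()`; it is not a copy of the driver's ice_ptp_extend_32b_ts().

```c
#include <stdint.h>

/* Extend a 32-bit nanosecond timestamp to 64 bits using a recently cached
 * 64-bit PHC time. Valid while the cached PHC time is less than about 2^31
 * nanoseconds away from the timestamp.
 */
static uint64_t extend_32b_ts(uint64_t cached_phc, uint32_t ts)
{
	uint32_t phc_lo = (uint32_t)cached_phc; /* lower_32_bits(PHC) */
	uint32_t delta = ts - phc_lo;           /* wraps modulo 2^32 */

	if (delta > UINT32_MAX / 2) {
		/* Delta larger than 2^31-1: captured before the PHC time,
		 * so the result is PHC - (lower_32_bits(PHC) - timestamp).
		 */
		delta = phc_lo - ts;
		return cached_phc - delta;
	}

	/* Captured at or after the PHC time:
	 * PHC + (timestamp - lower_32_bits(PHC)).
	 */
	return cached_phc + delta;
}
```

For example, with a cached PHC of 0x1FFFFFFF0 and a 32-bit timestamp of 0x10, the delta is 0x20 and the extended value is 0x200000010, so a rollover of the low 32 bits between caching and capture is handled by the wrapping subtraction.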
2021-06-07  ice: Refactor ice_setup_rx_ctx  Krzysztof Kazimierczak
Move AF_XDP logic and buffer allocation out of ice_setup_rx_ctx() to a new function ice_vsi_cfg_rxq(), so the function actually sets up the Rx context. Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Co-developed-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com>
2021-04-14  ice: refactor ITR data structures  Jesse Brandeburg
Use a dedicated bitfield in order to both increase the amount of checking around the length of ITR writes as well as simplify the checks of dynamic mode. Basically unpack the "high bit means dynamic" logic into bitfields. Also, remove some unused ITR defines. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
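Unpacking a "high bit means dynamic" encoding into named bitfields might look like the sketch below. The bit positions, the masks and the names `struct itr_cfg` and `itr_unpack()` are assumptions made for illustration; the message does not give the driver's actual layout.

```c
#include <stdint.h>

/* Assumed packed layout: bit 15 flags dynamic mode, low 13 bits hold the
 * interval setting. These values are illustrative only.
 */
#define ITR_DYNAMIC_FLAG 0x8000u
#define ITR_VALUE_MASK   0x1FFFu

/* Unpacked form: each property has its own named, width-limited field. */
struct itr_cfg {
	unsigned int setting : 13; /* interval value */
	unsigned int mode : 1;     /* 1 = dynamic, 0 = static */
};

/* Convert the packed encoding into the bitfield form. */
static struct itr_cfg itr_unpack(uint16_t packed)
{
	struct itr_cfg cfg = {
		.setting = packed & ITR_VALUE_MASK,
		.mode = (packed & ITR_DYNAMIC_FLAG) ? 1 : 0,
	};

	return cfg;
}
```

With separate fields, a check for dynamic mode reads `cfg.mode` directly instead of masking the top bit, and the field width bounds what can be stored in `setting`.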
2021-04-14  ice: refactor interrupt moderation writes  Jesse Brandeburg
Introduce several new helpers for writing ITR and GLINT_RATE registers, and refactor the code calling them. This resulted in removal of several duplicate functions and rolled a bunch of simple code back into the calling routines. In particular this removes some code that was doing both a store and a set in a helper function, which seems better done as separate tasks in the caller (and generally takes less lines of code even with a tiny bit of repetition). Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-31  ice: Add Support for XPS  Benita Bose
Enable and configure XPS. The driver code implemented sets up the Transmit Packet Steering Map, which in turn will be used by the kernel in queue selection during Tx. Signed-off-by: Benita Bose <benita.bose@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-12  ice: move headroom initialization to ice_setup_rx_ctx  Maciej Fijalkowski
ice_rx_offset(), that is supposed to initialize the Rx buffer headroom, relies on ICE_RX_FLAGS_RING_BUILD_SKB flag as well as XDP prog presence. Currently, the callsite of mentioned function is placed incorrectly within ice_setup_rx_ring() where Rx ring's build skb flag is not set yet. This causes the XDP_REDIRECT to be partially broken due to inability to create xdp_frame in the headroom space, as the headroom is 0. Fix this by moving ice_rx_offset() to ice_setup_rx_ctx() after the flag setting. Fixes: f1b1f409bf79 ("ice: store the result of ice_rx_offset() onto ice_ring") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-12  ice: fix napi work done reporting in xsk path  Magnus Karlsson
Fix the wrong napi work done reporting in the xsk path of the ice driver. The code in the main Rx processing loop was written to assume that the buffer allocation code returns true if all allocations were successful and false if not. In contrast with all other Intel NIC xsk drivers, the ice_alloc_rx_bufs_zc() has the inverted logic messing up the work done reporting in the napi loop. This can be fixed either by inverting the return value from ice_alloc_rx_bufs_zc() in the function that uses this in an incorrect way, or by changing the return value of ice_alloc_rx_bufs_zc(). We chose the latter as it makes all the xsk allocation functions for Intel NICs behave in the same way. My guess is that it was this unexpected discrepancy that gave rise to this bug in the first place. Fixes: 5bb0c4b5eb61 ("ice, xsk: Move Rx allocation out of while-loop") Reported-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Tested-by: Kiran Bhandare <kiranx.bhandare@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
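The convention this fix settles on, where the allocator returns true only when every buffer was allocated, can be sketched as follows. The functions `alloc_rx_bufs()` and `rx_poll()` and their parameters are hypothetical stand-ins; the real poll routine does considerably more.

```c
#include <stdbool.h>

/* Returns true when all `count` buffers could be allocated, false otherwise.
 * `available` models how many buffers the pool can currently provide.
 */
static bool alloc_rx_bufs(unsigned int available, unsigned int count)
{
	return available >= count;
}

/* On allocation failure report the whole budget as consumed so the poll is
 * rescheduled; otherwise report the packets actually processed.
 */
static int rx_poll(int budget, int packets, unsigned int available,
		   unsigned int needed)
{
	bool failure = !alloc_rx_bufs(available, needed);

	return failure ? budget : packets;
}
```

If the allocator instead returned true on failure, the negation in `rx_poll()` would report the full budget after every successful refill, which is the kind of misreporting the message describes.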
2020-12-01  xsk: Propagate napi_id to XDP socket Rx path  Björn Töpel
Add napi_id to the xdp_rxq_info structure, and make sure the XDP socket pick up the napi_id in the Rx path. The napi_id is used to find the corresponding NAPI structure for socket busy polling. Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Tariq Toukan <tariqt@nvidia.com> Link: https://lore.kernel.org/bpf/20201130185205.196029-7-bjorn.topel@gmail.com
2020-08-31  xsk: i40e: ice: ixgbe: mlx5: Rename xsk zero-copy driver interfaces  Magnus Karlsson
Rename the AF_XDP zero-copy driver interface functions to better reflect what they do after the replacement of umems with buffer pools in the previous commit. Mostly it is about replacing the umem name in the function names with xsk_buff and also having them take a buffer pool pointer instead of a umem. The various ring functions have also been renamed in the process so that they have the same naming convention as the internal functions in xsk_queue.h. This is so that it will be clearer what they do and also for consistency. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-3-git-send-email-magnus.karlsson@intel.com
2020-08-31  xsk: i40e: ice: ixgbe: mlx5: Pass buffer pool to driver instead of umem  Magnus Karlsson
Replace the explicit umem reference passed to the driver in AF_XDP zero-copy mode with the buffer pool instead. This is in preparation for extending the functionality of the zero-copy mode so that umems can be shared between queues on the same netdev and also between netdevs. In this commit, only a umem reference has been added to the buffer pool struct. But later commits will add other entities to it. These are going to be entities that are different between different queue ids and netdevs even though the umem is shared between them. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Björn Töpel <bjorn.topel@intel.com> Link: https://lore.kernel.org/bpf/1598603189-32145-2-git-send-email-magnus.karlsson@intel.com
2020-07-01  ice: replace single-element array used for C struct hack  Bruce Allan
Convert the pre-C90-extension "C struct hack" method (using a single-element array at the end of a structure for implementing variable-length types) to the preferred use of C99 flexible array member. Additional code cleanups were done near areas affected by this change. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
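The difference between the two idioms can be shown with a small user-space example. The struct and function names (`struct msg_old`, `struct msg_new`, `msg_alloc()`) are hypothetical, and `calloc()` stands in for the kernel allocator.

```c
#include <stdint.h>
#include <stdlib.h>

/* Old "struct hack": a one-element trailing array that callers over-allocate.
 * sizeof() already counts one element, so size math needs a "count - 1".
 */
struct msg_old {
	uint16_t count;
	uint32_t items[1];
};

/* C99 flexible array member: the trailing array adds no elements to sizeof(). */
struct msg_new {
	uint16_t count;
	uint32_t items[];
};

/* Allocate a zeroed message with room for `count` trailing items.
 * Returns NULL on allocation failure.
 */
static struct msg_new *msg_alloc(uint16_t count)
{
	struct msg_new *m = calloc(1, sizeof(*m) + count * sizeof(m->items[0]));

	if (m)
		m->count = count;
	return m;
}
```

With the flexible array member the allocation size is simply the header plus `count` elements, which removes the off-by-one adjustment the one-element form needs.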
2020-05-31  ice: Always clear QRXFLXP_CNTXT before writing new value  Brett Creeley
Always clear the previous value in QRXFLXP_CNTXT before writing a new value. This will make it so re-used queues will not accidentally take the previously configured settings. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
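Clearing a register's fields before writing new values is a read-modify-write with a mask. The sketch below uses made-up field positions and a hypothetical helper `set_rxdid()`; it does not reflect the real QRXFLXP_CNTXT layout.

```c
#include <stdint.h>

/* Assumed field layout, for illustration only. */
#define RXDID_IDX_SHIFT  0
#define RXDID_IDX_MASK   (0x3Fu << RXDID_IDX_SHIFT)
#define RXDID_PRIO_SHIFT 8
#define RXDID_PRIO_MASK  (0x7u << RXDID_PRIO_SHIFT)

/* Return the register value with both fields replaced. The fields are
 * cleared first so bits left over from a previous user of the queue cannot
 * survive the OR.
 */
static uint32_t set_rxdid(uint32_t reg, uint32_t idx, uint32_t prio)
{
	reg &= ~(RXDID_IDX_MASK | RXDID_PRIO_MASK);
	reg |= (idx << RXDID_IDX_SHIFT) & RXDID_IDX_MASK;
	reg |= (prio << RXDID_PRIO_SHIFT) & RXDID_PRIO_MASK;
	return reg;
}
```

Without the clearing step, OR-ing an index of 2 into a stale field value of 0x3F would leave 0x3F in place, which is how a re-used queue could inherit the previously configured settings.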
2020-05-27  ice: Check UMEM FQ size when allocating bufs  Krzysztof Kazimierczak
If a UMEM is present on a queue when an interface/queue pair is being enabled, the driver will try to prepare the Rx buffers in advance to improve performance. However, if fill queue is shorter than HW Rx ring, the driver will report failure after getting the last address from the fill queue. This still lets the driver process the packets correctly during the NAPI poll, but leads to a constant NAPI rescheduling. Not allocating the buffers in advance would result in a potential performance decrease. Commit d57d76428ae9 ("xsk: Add API to check for available entries in FQ") provides an API that lets drivers check the number of addresses that the fill queue holds. Notify the user if fill queue is not long enough to prepare all buffers before packet processing starts, and allocate the buffers during the NAPI poll. If the fill queue size is sufficient, prepare Rx buffers in advance. Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-27  ice: avoid undefined behavior  Bruce Allan
When writing the driver's struct ice_tlan_ctx structure, do not write the 8-bit element int_q_state with the associated internal-to-hardware field which is 122-bits, otherwise the helper function ice_write_byte() will use undefined behavior when setting the mask used for that write. This should not cause any functional change and will avoid use of undefined behavior. Also, update a comment to highlight this structure element is not written. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
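The hazard here is a mask computed by shifting by a width larger than the type being written; in C, shifting by the type's width or more is undefined. The helper below, a hypothetical `mask8()`, shows one defensive way to build such a mask. Note that this is not the change the commit makes: the commit avoids the problem by not writing the oversized field at all.

```c
#include <stdint.h>

/* Build a mask covering the low `width` bits of an 8-bit element.
 * A width of 8 or more cannot be expressed with a plain shift without
 * risking undefined behavior, so it is handled explicitly.
 */
static uint8_t mask8(unsigned int width)
{
	if (width >= 8)
		return 0xFF;
	return (uint8_t)((1u << width) - 1);
}
```

A naive `(1 << width) - 1` with `width` set to 122 has no defined result, which is why an 8-bit element paired with a 122-bit field cannot safely go through such a helper.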
2020-05-27  ice: fix signed vs unsigned comparisons  Jesse Brandeburg
Fix the remaining signed vs unsigned issues, which appear when compiling with -Werror=sign-compare. Many of these are because there is an external interface that is passing an int to us (which we can't change) but that we (rightfully) store and compare against as an unsigned in our data structures. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
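The situation described, an `int` arriving from an external interface and being compared against an unsigned stored value, can be handled by checking the sign before converting. The function `within_limit()` is a hypothetical example, not code from the driver.

```c
#include <stdbool.h>

/* `requested` arrives as int from an interface we cannot change;
 * `limit` is stored unsigned. Reject negative input first, then compare
 * in the unsigned domain so no implicit sign conversion takes place.
 */
static bool within_limit(int requested, unsigned int limit)
{
	if (requested < 0)
		return false;
	return (unsigned int)requested <= limit;
}
```

Comparing the raw `int` against the unsigned value directly would convert a negative input to a very large unsigned number, which is the class of mistake that -Werror=sign-compare flags.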
2020-05-23  Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue  David S. Miller
Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2020-05-22 This series contains updates to virtchnl and the ice driver. Geert Uytterhoeven fixes a data structure alignment issue in the virtchnl structures. Henry adds Flow Director support which allows for the redirection on ntuple rules over six patches. Initially Henry adds the initial infrastructure for Flow Director, and then later adds IPv4 and IPv6 support, as well as being able to display the ntuple rules. Brett adds Accelerated Receive Flow Steering (aRFS) support which is used to steer receive flows to a specific queue. Fixes a transmit timeout when the VF link transitions from up/down/up because the transmit and receive queue interrupts are not enabled as part of VF's link up. Fixed an issue when the default VF LAN address is changed and after reset the PF will attempt to add the new MAC, which fails because it already exists. This causes the VF to be disabled completely until it is removed and enabled via sysfs. Anirudh (Ani) makes a fix where the ice driver needs to call set_mac_cfg to enable jumbo frames, so ensure it gets called during initialization and after reset. Fix bad register reads during a register dump in ethtool by removing the bad registers. Paul fixes an issue where the receive Malicious Driver Detection (MDD) auto reset message was not being logged because it occurred after the VF reset. Victor adds a check for compatibility between the Dynamic Device Personalization (DDP) package and the NIC firmware to ensure that everything aligns. Jesse fixes an administrative queue string call with the appropriate error reporting variable. Also fixed the loop variables that are comparing or assigning signed against unsigned values. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-22  ice: Initialize Flow Director resources  Henry Tieman
Flow Director allows for redirection based on ntuple rules. Rules are programmed using the ethtool set-ntuple interface. Supported actions are redirect to queue and drop. Setup the initial framework to process Flow Director filters. Create and allocate resources to manage and program filters to the hardware. Filters are processed via a sideband interface; a control VSI is created to manage communication and process requests through the sideband. Upon allocation of resources, update the hardware tables to accept perfect filters. Signed-off-by: Henry Tieman <henry.w.tieman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-22  Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next  David S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2020-05-23 The following pull-request contains BPF updates for your *net-next* tree. We've added 50 non-merge commits during the last 8 day(s) which contain a total of 109 files changed, 2776 insertions(+), 2887 deletions(-). The main changes are: 1) Add a new AF_XDP buffer allocation API to the core in order to help lowering the bar for drivers adopting AF_XDP support. i40e, ice, ixgbe as well as mlx5 have been moved over to the new API and also gained a small improvement in performance, from Björn Töpel and Magnus Karlsson. 2) Add getpeername()/getsockname() attach types for BPF sock_addr programs in order to allow for e.g. reverse translation of load-balancer backend to service address/port tuple from a connected peer, from Daniel Borkmann. 3) Improve the BPF verifier is_branch_taken() logic to evaluate pointers being non-NULL, e.g. if after an initial test another non-NULL test on that pointer follows in a given path, then it can be pruned right away, from John Fastabend. 4) Larger rework of BPF sockmap selftests to make output easier to understand and to reduce overall runtime as well as adding new BPF kTLS selftests that run in combination with sockmap, also from John Fastabend. 5) Batch of misc updates to BPF selftests including fixing up test_align to match verifier output again and moving it under test_progs, allowing bpf_iter selftest to compile on machines with older vmlinux.h, and updating config options for lirc and v6 segment routing helpers, from Stanislav Fomichev, Andrii Nakryiko and Alan Maguire. 6) Conversion of BPF tracing samples outdated internal BPF loader to use libbpf API instead, from Daniel T. Lee. 7) Follow-up to BPF kernel test infrastructure in order to fix a flake in the XDP selftests, from Jesper Dangaard Brouer. 8) Minor improvements to libbpf's internal hashmap implementation, from Ian Rogers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-05-21  ice: Fix casting issues  Karol Kolacinski
Change min() macros to min_t() which has compare type specified and it helps avoid precision loss. In some cases there was precision loss during calls or assignments. Some fields in structs were unnecessarily large and gave multiple warnings. There were also some minor type differences which are now fixed as well as some cases where a simple cast was needed. Callers were passing data that is a u16 to ice_sched_cfg_node_bw_alloc() but the function was truncating that to a u8. Fix that by changing the function to take a u16. Signed-off-by: Karol Kolacinski <karol.kolacinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
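Both points in this message, comparing in an explicitly chosen type and not narrowing a 16-bit argument to 8 bits, can be illustrated in user-space C. `MIN_T`, `narrow_bw()` and `keep_bw()` are hypothetical; in particular `MIN_T` is a simplified stand-in that evaluates its arguments twice, unlike the kernel's min_t().

```c
#include <stdint.h>

/* Simplified stand-in for a typed minimum: both operands are cast to
 * `type` before the comparison. Arguments are evaluated more than once,
 * so do not pass expressions with side effects.
 */
#define MIN_T(type, a, b) ((type)(a) < (type)(b) ? (type)(a) : (type)(b))

/* Parameter narrower than the caller's data: the value is truncated. */
static uint8_t narrow_bw(uint8_t bw)
{
	return bw;
}

/* Parameter widened to match the caller's data: the value is preserved. */
static uint16_t keep_bw(uint16_t bw)
{
	return bw;
}
```

Passing a u16 value of 300 to the 8-bit version yields 44 (300 modulo 256), while the 16-bit version returns 300 unchanged, which is the effect of widening the function's parameter.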
2020-05-21  ice: Provide more meaningful error message  Lihong Yang
When printing the ice status or AQ error codes, instead of printing out the numerical value, provide the description of the error code. This provides more info about the issue than a number. Signed-off-by: Lihong Yang <lihong.yang@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-05-21  ice, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL  Björn Töpel
Remove MEM_TYPE_ZERO_COPY in favor of the new MEM_TYPE_XSK_BUFF_POOL APIs. v4->v5: Fixed "warning: Excess function parameter 'alloc' description in 'ice_alloc_rx_bufs_zc'" and "warning: Excess function parameter 'xdp' description in 'ice_construct_skb_zc'". (Jakub) Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Cc: intel-wired-lan@lists.osuosl.org Link: https://lore.kernel.org/bpf/20200520192103.355233-10-bjorn.topel@gmail.com
2020-02-19  ice: add backslash-n to strings  Jesse Brandeburg
There were several strings found without line feeds, fix them by adding a line feed, as is typical. Without this lotsofmessagescanbejumbledtogether. This patch has known checkpatch warnings from long lines for the NL_* messages, because checkpatch doesn't know how to ignore them. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-19  ice: Always clear the QRXFLXP_CNTXT register for VF Rx queues  Brett Creeley
Currently when the PF reduces its number of channels via ethtool and then VFs are created there may be stale data for some of the Rx queues belonging to VFs. This happens when a VF reuses an Rx queue that was previously used by the PF. Specifically, the QRXFLXP_CNTXT register will have incorrect values. Fix this by always clearing the relevant values in the QRXFLXP_CNTXT register for VF queues. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-15  ice: remove unnecessary fallthrough comments  Bruce Allan
Fallthrough comments are used to explicitly indicate the code is intended to flow from one case statement to the next in a switch statement rather than break out of the switch statement. They are only needed when a case has one or more statements to execute before falling through to the next case, not when there is a list of cases for which the same statement(s) should be executed. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-15  ice: Add support to enable/disable all Rx queues before waiting  Brett Creeley
Currently when we enable/disable all Rx queues we do the following sequence for each Rx queue and then move to the next queue. 1. Enable/Disable the Rx queue via register write. 2. Read the configuration register to determine if the Rx queue was enabled/disabled successfully. In some cases enabling/disabling queue 0 fails because of step 2 above. Fix this by doing step 1 for all of the Rx queues and then step 2 for all of the Rx queues. Also, there are cases where we enable/disable a single queue (i.e. SR-IOV and XDP) so add a new function that does step 1 and 2 above with a read flush in between. This change also required a single Rx queue to be enabled/disabled with and without waiting for the change to propagate through hardware. Fix this by adding a boolean wait flag to the necessary functions. Also, add the keywords "one" and "all" to distinguish between enabling/disabling a single Rx queue and all Rx queues respectively. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
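The reordering described above, issuing the control write for every queue first and only then checking each queue's status, can be sketched against a fake hardware model. Everything here is hypothetical (`struct fake_hw`, `rxq_ctrl_write()`, `rxq_ctrl_wait()`, `rxq_ctrl_all()`, the queue count), and the fake status simply follows the request when polled.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_Q 4 /* assumed queue count for the illustration */

struct fake_hw {
	bool req[NUM_Q];  /* requested state, written in step 1 */
	bool stat[NUM_Q]; /* state reported by the fake hardware, read in step 2 */
};

/* Step 1: request that a queue be enabled or disabled. */
static void rxq_ctrl_write(struct fake_hw *hw, size_t q, bool ena)
{
	hw->req[q] = ena;
}

/* Step 2: check that the queue reached the requested state. The fake
 * hardware applies the request at the moment it is polled.
 */
static bool rxq_ctrl_wait(struct fake_hw *hw, size_t q, bool ena)
{
	hw->stat[q] = hw->req[q];
	return hw->stat[q] == ena;
}

/* Do step 1 for all queues, then step 2 for all queues. */
static bool rxq_ctrl_all(struct fake_hw *hw, bool ena)
{
	for (size_t q = 0; q < NUM_Q; q++)
		rxq_ctrl_write(hw, q, ena);

	for (size_t q = 0; q < NUM_Q; q++)
		if (!rxq_ctrl_wait(hw, q, ena))
			return false;

	return true;
}
```

Splitting the two loops means every queue has its request outstanding before the first status check, rather than each queue being written and checked one at a time.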
2020-02-12ice: Cleanup ice_vsi_alloc_q_vectorsAnirudh Venkataramanan
1. Remove local variable num_q_vectors and use vsi->num_q_vectors instead 2. Remove local variable pf and pass vsi->back to ice_pf_to_dev Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12ice: Make print statements more compactAnirudh Venkataramanan
Formatting strings in print function calls (like dev_info, dev_err, etc.) can exceed 80 columns without making checkpatch unhappy. So remove newlines where applicable and make print statements more compact. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-02-12ice: Use ice_pf_to_devAnirudh Venkataramanan
Use ice_pf_to_dev(pf) instead of &pf->pdev->dev Use ice_pf_to_dev(vsi->back) instead of &vsi->back->pdev->dev When a pointer to the pf instance is available, use ice_pf_to_dev instead of ice_hw_to_dev Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-03ice: Suppress Coverity warnings for xdp_rxq_info_regKrzysztof Kazimierczak
Coverity reports some of the calls to xdp_rxq_info_reg() as potential issues, because the driver does not check its return value. However, those calls are wrapped with "if (!xdp_rxq_info_is_reg(&ring->xdp_rxq))" and this check alone is enough to be sure that the function will never fail. All possible states of xdp_rxq_info are: - NEW, - REGISTERED, - UNREGISTERED, - UNUSED. The driver won't mark a queue as UNUSED under any circumstances, so the return value can be ignored safely. Add comments for Coverity right above calls to xdp_rxq_info_reg() to suppress the warnings. Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-03ice: Add a boundary check in ice_xsk_umem()Krzysztof Kazimierczak
In ice_xsk_umem(), the variable qid, which is later used as an array index, is not checked against the array bounds. Because of that, a calling function might receive an invalid address, which causes a general protection fault when dereferenced. To address this, add a boundary check to see if qid falls outside the UMEM array. Also, don't let the user change vsi->num_xsk_umems just by trying to set up a second UMEM if its value is already set up (i.e. UMEM region has already been allocated for this VSI). While at it, make sure that ring->zca.free pointer is always zeroed out if there is no UMEM on a specified ring. Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
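A bounds-checked lookup of this kind might look like the following sketch. The structure and function names (struct umem, struct vsi_sketch, xsk_umem_lookup) are hypothetical simplifications, not the driver's definitions; only the field names xsk_umems and num_xsk_umems echo the commit message.

```c
#include <stddef.h>

/* Stand-in for the kernel's UMEM object. */
struct umem {
	int id;
};

/* Simplified stand-in for the VSI: only the two fields the check needs. */
struct vsi_sketch {
	struct umem **xsk_umems;	/* per-queue UMEM pointers, may be NULL */
	unsigned short num_xsk_umems;	/* number of entries in xsk_umems */
};

/* Return the UMEM bound to queue qid, or NULL if the array is missing,
 * qid is out of range, or no UMEM is attached to that queue.
 */
static struct umem *xsk_umem_lookup(const struct vsi_sketch *vsi,
				    unsigned short qid)
{
	if (!vsi->xsk_umems || qid >= vsi->num_xsk_umems)
		return NULL;

	return vsi->xsk_umems[qid];
}
```

The comparison uses >= because an index equal to the array size is already one past the last valid slot; callers then only have to handle a NULL result.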
2020-01-03ice: Set default value for ITR in alloc functionMichal Swiatkowski
When the user sets itr_setting to zero from ethtool -C, the driver changes this value to the default in ice_cfg_itr (for example after changing ring param). Remove the code that sets the default value in ice_cfg_itr and move it to the place where the driver allocates q_vectors. Signed-off-by: Michal Swiatkowski <michal.swiatkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-22ice: Add ice_pf_to_dev(pf) macroBrett Creeley
We use &pf->pdev->dev all over the code. Add a simple macro to do this for us. When multiple dereferences like this are being done, add a local struct device variable. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
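A minimal sketch of such a helper follows, using mock structures so that it compiles outside the kernel. The three struct definitions and pf_dev_name() are illustrative assumptions; only the macro name comes from the commit subject, and its body is written here as the pointer chain from the PF through its PCI device to the embedded struct device.

```c
/* Minimal stand-ins for the kernel structures; only the fields needed here. */
struct device {
	const char *name;
};

struct pci_dev {
	struct device dev;
};

struct ice_pf {
	struct pci_dev *pdev;
};

/* Resolve a PF pointer to the struct device embedded in its PCI device. */
#define ice_pf_to_dev(pf) (&((pf)->pdev->dev))

/* Hypothetical caller: look the device up once and reuse the local variable
 * instead of repeating the pointer chain at every use.
 */
static const char *pf_dev_name(struct ice_pf *pf)
{
	struct device *dev = ice_pf_to_dev(pf);

	return dev->name;
}
```

Wrapping the argument in parentheses inside the macro keeps it safe for expressions such as ice_pf_to_dev(vsi->back).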
2019-11-04ice: introduce frame padding computation logicMaciej Fijalkowski
Take into account the underlying architecture specific settings and based on that calculate the possible padding that can be supplied. Typically, for x86 and standard MTU size we will end up with 192 bytes of headroom. This is the same behavior as our other drivers have and we can dedicate it for XDP purposes. Furthermore, introduce the Rx ring flag for indicating whether build_skb is used on a particular ring. Based on that invoke the routines for padding calculation. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
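One plausible way to arrive at the 192-byte figure is sketched below, modeled on the half-page padding calculation used by other Intel Ethernet drivers. All constants are assumptions for x86-64 and are not stated in the commit message: 4 KiB pages, 64-byte cache lines, a 1536-byte Rx buffer for standard MTU, and 320 bytes for the skb shared info. The names ALIGN_UP and compute_pad are hypothetical.

```c
/* Assumed values for x86-64; none of these come from the commit message. */
#define PAGE_SIZE_ASSUMED	4096
#define CACHE_LINE_ASSUMED	64
#define SHINFO_SIZE_ASSUMED	320	/* assumed size of the skb shared info */
#define RX_BUF_LEN_ASSUMED	1536	/* assumed Rx buffer for standard MTU */

/* Round x up to the next multiple of a. */
#define ALIGN_UP(x, a)	(((x) + (a) - 1) / (a) * (a))

/* Headroom left in a half page once the cache-line-aligned shared info at the
 * tail and the Rx buffer itself have been accounted for.
 */
static int compute_pad(int rx_buf_len)
{
	int half_page = ALIGN_UP(rx_buf_len, PAGE_SIZE_ASSUMED / 2);
	int usable = half_page -
		     ALIGN_UP(SHINFO_SIZE_ASSUMED, CACHE_LINE_ASSUMED);

	return usable - rx_buf_len;
}
```

Under these assumptions the arithmetic is 2048 - 320 - 1536 = 192, which matches the headroom quoted above; other page sizes or structure sizes would give a different result.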
2019-11-04ice: Add support for AF_XDPKrzysztof Kazimierczak
Add zero copy AF_XDP support. This patch adds zero copy support for Tx and Rx; code for zero copy is added to ice_xsk.h and ice_xsk.c. For Tx, implement ndo_xsk_wakeup. As with other drivers, reuse existing XDP Tx queues for this task, since XDP_REDIRECT guarantees mutual exclusion between different NAPI contexts based on CPU ID. In turn, a netdev can XDP_REDIRECT to another netdev with a different NAPI context, since the operation is bound to a specific core and each core has its own hardware ring. For Rx, allocate frames as MEM_TYPE_ZERO_COPY on queues where AF_XDP is enabled. Signed-off-by: Krzysztof Kazimierczak <krzysztof.kazimierczak@intel.com> Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04ice: Add support for XDPMaciej Fijalkowski
Add support for XDP. Implement ndo_bpf and ndo_xdp_xmit. Upon load of an XDP program, allocate additional Tx rings for dedicated XDP use. The following actions are supported: XDP_TX, XDP_DROP, XDP_REDIRECT, XDP_PASS, and XDP_ABORTED. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04ice: get rid of per-tc flow in Tx queue configuration routinesMaciej Fijalkowski
There's no reason for treating DCB as a first-class citizen when configuring the Tx queues and going through TCs. Reverse the logic and base the configuration logic on rings, which are the objects of interest anyway. Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2019-11-04ice: Introduce ice_base.cAnirudh Venkataramanan
Remove a few uses of kernel configuration flags from ice_lib.c by introducing a new source file ice_base.c. Also move corresponding function prototypes from ice_lib.h to ice_base.h and include ice_base.h where required. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>