summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/mellanox/mlx4
AgeCommit message (Collapse)Author
2014-09-26net/mlx4_core: Allow not to specify probe_vf in SRIOV IB modeMatan Barak
When the HCA is configured in SRIOV IB mode (that is, at least one of the ports is IB) and the probe_vf module param isn't specified, mlx4_init_one() failed because of the following condition: if (ib_ports && (num_vfs_argc > 1 || probe_vfs_argc > 1)) { ..... } The root cause for that is a mistake in the initialization of num_vfs_argc and probe_vfs_argc. When num_vfs / probe_vf aren't given, their argument count counterpart should be 0, fix that. Fixes: dd41cc3bb90e ('net/mlx4: Adapt num_vfs/probed_vf params for single port VF') Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-23Merge tag 'rdma-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband/rdma fixes from Roland Dreier: "Last late set of InfiniBand/RDMA fixes for 3.17: - fixes for the new memory region re-registration support - iSER initiator error path fixes - grab bag of small fixes for the qib and ocrdma hardware drivers - larger set of fixes for mlx4, especially in RoCE mode" * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (26 commits) IB/mlx4: Fix VF mac handling in RoCE IB/mlx4: Do not allow APM under RoCE IB/mlx4: Don't update QP1 in native mode IB/mlx4: Avoid accessing netdevice when building RoCE qp1 header mlx4: Fix mlx4 reg/unreg mac to work properly with 0-mac addresses IB/core: When marshaling uverbs path, clear unused fields IB/mlx4: Avoid executing gid task when device is being removed IB/mlx4: Fix lockdep splat for the iboe lock IB/mlx4: Get upper dev addresses as RoCE GIDs when port comes up IB/mlx4: Reorder steps in RoCE GID table initialization IB/mlx4: Don't duplicate the default RoCE GID IB/mlx4: Avoid null pointer dereference in mlx4_ib_scan_netdevs() IB/iser: Bump version to 1.4.1 IB/iser: Allow bind only when connection state is UP IB/iser: Fix RX/TX CQ resource leak on error flow RDMA/ocrdma: Use right macro in query AH RDMA/ocrdma: Resolve L2 address when creating user AH mlx4: Correct error flows in rereg_mr IB/qib: Correct reference counting in debugfs qp_stats IPoIB: Remove unnecessary port query ...
2014-09-22mlx4: Fix mlx4 reg/unreg mac to work properly with 0-mac addressesJack Morgenstein
There is a chance that the VF mlx4 RoCE driver (mlx4_ib) may see a 0-mac as the current default MAC address when a RoCE interface first comes up. In this case, the RoCE driver registers the 0-mac to get its MAC index -- used in the INIT2RTR transition when it creates its proxy Q1 qp's. If we do not allow QP1 to be created, the RoCE driver will not come up. If we do not register the 0-mac, but simply use a random mac-index, QP1 will attempt to send packets with an someone's else source MAC which will get the system into more troubled. Since a 0-mac was previously used to indicate a free slot, this leads to errors, both when the 0-mac is registered and when it is unregistered. The required fix is to check in addition that the slot containing the 0-mac has a reference count of zero. Additionally, when comparing MAC addresses, need to mask out the 2 MSBs of the u64 mac on both sides of the comparison. Note that when the EN driver (mlx4_en) comes up, it set itself a proper mac --> the RoCE driver gets to be notified on that and further handing is done with the update qp command, as was added by commit 9433c188915c ("IB/mlx4: Invoke UPDATE_QP for proxy QP1 on MAC changes"). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-22mlx4: Correct error flows in rereg_mrMatan Barak
This patch addresses feedback from Sagi Grimberg on the rereg_mr implementation of mlx4. The following are fixed: 1. Set the correct pd_flags 2. Make sure we change the iova and size MR fields only after successful write and allocation of the MTTs. 3. Make the error checking more robust Fixes: e630664c8383 ("mlx4_core: Add helper functions to support MR re-registration") Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-09-10net/mlx4: Set vlan stripping policy by the right commandMatan Barak
Changing the vlan stripping policy of the QP isn't supported by older firmware versions for the INIT2RTR command. Nevertheless, we've used it. Fix that by doing this policy change using INIT2RTR only if the firmware supports it, otherwise, we call UPDATE_QP command to do the task. Fixes: 7677fc9 ('net/mlx4: Strengthen VLAN tags/priorities enforcement in VST mode') Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-10net/mlx4: Avoid dealing with MAC index in UPDATE_QP wrapper if not neededMatan Barak
The current wrapper implementation of the UPDATE_QP command tries to get the MAC index, even if MAC wasn't set by the VF. Fix it up to only handle the MAC field if it's valid. Fixes: ce8d9e0 ('net/mlx4_core: Add UPDATE_QP SRIOV wrapper support') Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-10net/mlx4: Use the correct VSD mask in UPDATE_QPMatan Barak
When doing VGT->VST->VGT state changes, we used an incorrect mask for the vlan-stripping-disable (VSD) flag, hence the vlan related policy for user-space Raw Ethernet QPs open by VFs wasn't really applied. Fix that, by using the correct mask. Fixes: f0f829b ('net/mlx4_core: Add immediate activate for VGT->VST->VGT') Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-10net/mlx4: Correctly configure single ported VFs from the hostMatan Barak
Single port VFs are seen PCI wise on both ports of the PF (we don't have single port PFs with ConnectX). With this in mind, it's possible for virtualization tools to try and configure a single ported VF through the "wrong" PF port. To handle that, we use the PF driver mapping of single port VFs to NIC ports and adjust the port value before calling into the low level code that does the actual VF configuration Fixes: 449fc48 ('net/mlx4: Adapt code for N-Port VF') Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-09-08net/mlx4_en: do not ignore autoneg in mlx4_en_set_pauseparam()Ivan Vecera
The driver does not support pause autonegotiation so it should return -EINVAL when the function is called with non-zero autoneg. Cc: Amir Vadai <amirv@mellanox.com> Signed-off-by: Ivan Vecera <ivecera@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-29net/mlx4: Move the tunnel steering helper function to mlx4_coreOr Gerlitz
Move the function which we use to set VXLAN DMFS (flow-steering) rules from mlx4_en to mlx4_core. This refactoring will allow the mlx4_ib driver to call the helper for the use case of user-space RAW Ethernet QPs, such that they can serve VXLAN traffic too. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-08-14Merge tag 'pci-v3.17-changes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull DEFINE_PCI_DEVICE_TABLE removal from Bjorn Helgaas: "Part two of the PCI changes for v3.17: - Remove DEFINE_PCI_DEVICE_TABLE macro use (Benoit Taine) It's a mechanical change that removes uses of the DEFINE_PCI_DEVICE_TABLE macro. I waited until later in the merge window to reduce conflicts, but it's possible you'll still see a few" * tag 'pci-v3.17-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Remove DEFINE_PCI_DEVICE_TABLE macro use
2014-08-14Merge tag 'rdma-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull infiniband/rdma updates from Roland Dreier: "Main set of InfiniBand/RDMA updates for 3.17 merge window: - MR reregistration support - MAD support for RMPP in userspace - iSER and SRP initiator updates - ocrdma hardware driver updates - other fixes..." * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (52 commits) IB/srp: Fix return value check in srp_init_module() RDMA/ocrdma: report asic-id in query device RDMA/ocrdma: Update sli data structure for endianness RDMA/ocrdma: Obtain SL from device structure RDMA/uapi: Include socket.h in rdma_user_cm.h IB/srpt: Handle GID change events IB/mlx5: Use ARRAY_SIZE instead of sizeof/sizeof[0] IB/mlx4: Use ARRAY_SIZE instead of sizeof/sizeof[0] RDMA/amso1100: Check for integer overflow in c2_alloc_cq_buf() IPoIB: Remove unnecessary test for NULL before debugfs_remove() IB/mad: Add user space RMPP support IB/mad: add new ioctl to ABI to support new registration options IB/mad: Add dev_notice messages for various umad/mad registration failures IB/mad: Update module to [pr|dev]_* style print messages IB/ipoib: Avoid multicast join attempts with invalid P_key IB/umad: Update module to [pr|dev]_* style print messages IB/ipoib: Avoid flushing the workqueue from worker context IB/ipoib: Use P_Key change event instead of P_Key polling mechanism IB/ipath: Add P_Key change event support mlx4_core: Add support for secure-host and SMP firewall ...
2014-08-14Merge branches 'core', 'cxgb4', 'ipoib', 'iser', 'iwcm', 'mad', 'misc', ↵Roland Dreier
'mlx4', 'mlx5', 'ocrdma' and 'srp' into for-next
2014-08-12PCI: Remove DEFINE_PCI_DEVICE_TABLE macro useBenoit Taine
We should prefer `struct pci_device_id` over `DEFINE_PCI_DEVICE_TABLE` to meet kernel coding style guidelines. This issue was reported by checkpatch. A simplified version of the semantic patch that makes this change is as follows (http://coccinelle.lip6.fr/): // <smpl> @@ identifier i; declarer name DEFINE_PCI_DEVICE_TABLE; initializer z; @@ - DEFINE_PCI_DEVICE_TABLE(i) + const struct pci_device_id i[] = z; // </smpl> [bhelgaas: add semantic patch] Signed-off-by: Benoit Taine <benoit.taine@lip6.fr> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-08-05mlx4_core: Add support for secure-host and SMP firewallJack Morgenstein
Secure-host is the general term for the capability of a device to protect itself and the subnet from malicious host software. This is achieved by: 1. Not allowing un-trusted entities to access device configuration registers, directly (through pci_cr or pci_conf) and indirectly (through MADs). 2. Hiding M_Key from untrusted entities. 3. Preventing the modification of GUID0 by un-trusted entities 4. Not allowing drivers on untrusted hosts to receive nor to transmit packets over QP0 (SMP Firewall). The secure-host capability depends on firmware handling all QP0 packets, and not passing these packets up to the driver. Any information required by the driver for proper operation (e.g., SM lid) is passed via events generated by the firmware while processing QP0 MADs. Driver support mainly requires using the MAD_DEMUX FW command at startup, where the feature is enabled/disabled through a procedure described in the Mellanox HCA tools package. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> [ Fix error path in mlx4_setup_hca to go to err_mcg_table_free. - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-08-01mlx4_core: Add helper functions to support MR re-registrationMatan Barak
Add few helper functions to support a mechanism of getting an MPT, modifying it and updating the HCA with the modified object. The code takes 2 paths, one for directly changing the MPT (and sometimes its related MTTs) and another one which queries the MPT and updates the HCA via fw command SW2HW_MPT. The first path is used in native mode; the second path is slower and is used only in SRIOV. Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-07-24net/mlx4_en: mlx4_en_[gs]et_priv_flags() can be staticFengguang Wu
Fixes sparse warning intrduced by commit 0fef9d0 ("net/mlx4_en: Disable blueflame using ethtool private flags") Signed-off-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-22net/mlx4_en: Reduce memory consumption on kdump kernelAmir Vadai
When memory is limited, reduce number of rx and tx rings. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-22net/mlx4_core: Use low memory profile on kdump kernelAmir Vadai
When running in kdump kernel, reduce number of resources allocated for the hardware. This will enable the NIC to operate in this low memory environment at the expense of performance and some features not related to the basic NIC functionality. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-22net/mlx4_en: Disable blueflame using ethtool private flagsAmir Vadai
Enable the user to turn off the hardware feature called BlueFlame. Since it is something specific to mlx4_en hardware, we control the feature via ethtool private flags. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-22net/mlx4_en: current_mac isn't updated in port upEyal Perry
When port is down dev_addr is changed (e.g. by bonding) but current_mac is not touched. When port is up again, hash_mac is updated to dev_addr, but current_mac isn't. This leads to inconsistency between current_mac and mac_hash. Because of that, mlx4_en_replace_mac() fails to find current_mac in mac_hash. Fix is to reset current_mac to dev_addr when port is up - as we do for mac_hash. Signed-off-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/infiniband/hw/cxgb4/device.c The cxgb4 conflict was simply overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16net/mlx4_en: cq->irq_desc wasn't set in legacy EQ'sAmir Vadai
Fix a regression introduced by commit 35f6f45 ("net/mlx4_en: Don't use irq_affinity_notifier to track changes in IRQ affinity map"). When core is started in legacy EQ's (number of IRQ's < rx rings), cq->irq_desc was NULL. This caused a kernel crash under heavy traffic - when having more than rx NAPI budget completions. Fixed to have it set for both EQ modes. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16net/mlx4_core: Remove MCG in case it is attached to promiscuous QPs onlySaeed Mahameed
In B0 steering mode if promiscuous QP asks to be detached from MCG entry, and it is the only one in this entry then the entry will never be deleted. This is a wrong behavior since we don't want to keep those entries after the promiscuous QP becomes non-promiscuous. Therefore remove steering entry containing only promiscuous QP. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16net/mlx4_core: In SR-IOV mode host should add promisc QP to default entry onlyEugenia Emantayev
In current situation host is adding the promiscuous QP to all steering entries and the default entry as well. In this case when having PV and SR-IOV on the same setup bridge will receive all traffic that is targeted to the other VMs. This is bad. Solution: In SR-IOV mode host can add promiscuous QP to default entry only. The above problem and fix are relevant for B0 steering mode only. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16net/mlx4_core: Make sure the max number of QPs per MCG isn't exceededAlexander Guller
In B0 steering mode when adding QPs to the default MCG entry need to check that maximal number of QPs per MCG entry was not exceeded. Signed-off-by: Alexander Guller <alexg@mellanox.com> Reviewed-by: Aviad Yehezkel <aviadye@mellanox.co.il> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16net/mlx4_core: Make sure that negative array index isn't usedDotan Barak
To make sure that the array index isn't used in the code with negative value, we stop using the for loop integer iterator outside of it. >From now on use members count to swap the last QP with removed one. Fix also the second occurrence of this flow in mlx4_qp_detach_common(). In mlx4_qp_detach_common() use members_count instead of loop iterator outside of the for loop. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Reviewed-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16net/mlx4_core: Fix leakage of SW multicast entriesYevgeny Petrilin
When removing multicast address in B0 steering mode there is a bug in cases where there is a single QP registered for the address, and this QP is also promiscuous. In such cases the entry wouldn't be deleted from the SW structure representing all Ethernet MCG entries, but would be removed in HW. This way when driver goes to remove it from SW and HW structures the HW deletion fails. Moreover the same index could later be used for registering different address, which can be Infiniband. Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-14mlx4: mark napi id for gro_skbJason Wang
Napi id was not marked for gro_skb, this will lead rx busy loop won't work correctly since they stack never try to call low latency receive method because of a zero socket napi id. Fix this by marking napi id for gro_skb. The transaction rate of 1 byte netperf tcp_rr gets about 50% increased (from 20531.68 to 30610.88). Cc: Amir Vadai <amirv@mellanox.com> Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08net/mlx4_en: Ignore budget on TX napi pollingAmir Vadai
It is recommended that TX work not count against the quota. The cost of TX packet liberation is a minute percentage of what it costs to process an RX frame. Furthermore, that SKB freeing makes memory available for other paths in the stack. Give the TX a larger budget and be more aggressive about cleaning up the Tx descriptors this budget could be changed using ethtool: $ ethtool -C eth1 tx-frames-irq <budget> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08net/mlx4_en: Fix mac_hash database inconsistencyNoa Osherovich
Using a local copy of dev_addr in mlx4_en_set_mac() to prevent dev_addr from being modified during error flow or when dev_addr is modified in another context (which is another problem that is being discussed over the mailing list [1]). Also fixing bad naming of priv->prev_mac into priv->current_mac. [1] - http://patchwork.ozlabs.org/patch/351489/ Reviewed-by: Eyal Perry <eyalpe@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08net/mlx4_en: Do not count LLC/SNAP in MTU calculationYishai Hadas
LLC/SNAP 8 bytes should not be added as part of header calculation. If used, payload will be decreased accordingly. For MTU of 1500 we'll set 1522 instead of 1523. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Liran Liss <liranl@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08net/mlx4_en: Do not disable vlan filter during promiscuous modeEugenia Emantayev
Promiscous mode is only for MACs. Should not disable/enable VLAN filter when entering/leaving promisuous mode. Signed-off-by: Aviad Yehezkel <aviadye@mellanox.co.il> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08net/mlx4: Verify port number in __mlx4_unregister_macEugenia Emantayev
Verify port number to avoid crashes if port number is outside the range. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08net/mlx4_en: Run loopback test only when port is upEugenia Emantayev
Loopback can't work when port is down. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-08net/mlx4_en: Fix set port ratelimit for 40GEEugenia Emantayev
In 40GE we can't use the default bw units for set ratelimit (100 Mbps) since the max is 255*100 Mbps = 25 Gbps (not suited for 40GE), thus we need 1 Gbps units. But for 10GE 1 Gbps units might be too bruit so we use the following solution. For user set ratelimit <= 25 Gbps: use 100 Mbps units * user_ratelimit (* 10). For user set ratelimit > 25 Gbps: use 1 Gbps units * user_ratelimit. For user set unlimited ratelimit (0 Gbps): use 1 Gbps units * MAX_RATELIMIT_DEFAULT (57) Note: any value > 58 will damage the FW ratelimit computation, so we allow a max and any higher value will be pulled down to 57. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-07net/mlx4_en: Don't configure the HW vxlan parser when vxlan offloading isn't setOr Gerlitz
The add_vxlan_port ndo driver code was wrongly testing whether HW vxlan offloads are supported by the device instead of checking if they are currently enabled. This causes the driver to configure the HW parser to conduct matching for vxlan packets but since no steering rules were set, vxlan packets are dropped on RX. Fix that by doing the right test, as done in the del_vxlan_port ndo handler. Fixes: 1b136de ('net/mlx4: Implement vxlan ndo calls') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-02net/mlx4_en: IRQ affinity hint is not cleared on port downAmir Vadai
Need to remove affinity hint at mlx4_en_deactivate_cq() and not at mlx4_en_destroy_cq() - since affinity_mask might be free'd while still being used by procfs. Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-07-02net/mlx4_en: Don't use irq_affinity_notifier to track changes in IRQ ↵Amir Vadai
affinity map IRQ affinity notifier can only have a single notifier - cpu_rmap notifier. Can't use it to track changes in IRQ affinity map. Detect IRQ affinity changes by comparing CPU to current IRQ affinity map during NAPI poll thread. CC: Thomas Gleixner <tglx@linutronix.de> CC: Ben Hutchings <ben@decadent.org.uk> Fixes: 2eacc23 ("net/mlx4_core: Enforce irq affinity changes immediatly") Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-22net/mlx4_core: Fix the error flow when probing with invalid VF configurationOr Gerlitz
Single ported VF are currently not supported on configurations where one or both ports are IB. When we hit this case, the relevant flow in the driver didn't return error and jumped to the wrong label. Fix that. Fixes: dd41cc3 ('net/mlx4: Adapt num_vfs/probed_vf params for single port VF') Reported-by: Shirley Ma <shirley.ma@oracle.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-12Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) Seccomp BPF filters can now be JIT'd, from Alexei Starovoitov. 2) Multiqueue support in xen-netback and xen-netfront, from Andrew J Benniston. 3) Allow tweaking of aggregation settings in cdc_ncm driver, from Bjørn Mork. 4) BPF now has a "random" opcode, from Chema Gonzalez. 5) Add more BPF documentation and improve test framework, from Daniel Borkmann. 6) Support TCP fastopen over ipv6, from Daniel Lee. 7) Add software TSO helper functions and use them to support software TSO in mvneta and mv643xx_eth drivers. From Ezequiel Garcia. 8) Support software TSO in fec driver too, from Nimrod Andy. 9) Add Broadcom SYSTEMPORT driver, from Florian Fainelli. 10) Handle broadcasts more gracefully over macvlan when there are large numbers of interfaces configured, from Herbert Xu. 11) Allow more control over fwmark used for non-socket based responses, from Lorenzo Colitti. 12) Do TCP congestion window limiting based upon measurements, from Neal Cardwell. 13) Support busy polling in SCTP, from Neal Horman. 14) Allow RSS key to be configured via ethtool, from Venkata Duvvuru. 15) Bridge promisc mode handling improvements from Vlad Yasevich. 16) Don't use inetpeer entries to implement ID generation any more, it performs poorly, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1522 commits) rtnetlink: fix userspace API breakage for iproute2 < v3.9.0 tcp: fixing TLP's FIN recovery net: fec: Add software TSO support net: fec: Add Scatter/gather support net: fec: Increase buffer descriptor entry number net: fec: Factorize feature setting net: fec: Enable IP header hardware checksum net: fec: Factorize the .xmit transmit function bridge: fix compile error when compiling without IPv6 support bridge: fix smatch warning / potential null pointer dereference via-rhine: fix full-duplex with autoneg disable bnx2x: Enlarge the dorq threshold for VFs bnx2x: Check for UNDI in uncommon branch bnx2x: Fix 1G-baseT link bnx2x: Fix link for KR with swapped polarity lane sctp: Fix sk_ack_backlog wrap-around problem net/core: Add VF link state control policy net/fsl: xgmac_mdio is dependent on OF_MDIO net/fsl: Make xgmac_mdio read error message useful net_sched: drr: warn when qdisc is not work conserving ...
2014-06-11net/mlx4_en: Use affinity hintYuval Atias
The “affinity hint” mechanism is used by the user space daemon, irqbalancer, to indicate a preferred CPU mask for irqs. Irqbalancer can use this hint to balance the irqs between the cpus indicated by the mask. We wish the HCA to preferentially map the IRQs it uses to numa cores close to it. To accomplish this, we use cpumask_set_cpu_local_first(), that sets the affinity hint according the following policy: First it maps IRQs to “close” numa cores. If these are exhausted, the remaining IRQs are mapped to “far” numa cores. Signed-off-by: Yuval Atias <yuvala@mellanox.com> Signed-off-by: Amir Vadai <amirv@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11net/mlx4_core: Keep only one driver entry release mlx4_privWei Yang
Following commit befdf89 "net/mlx4_core: Preserve pci_dev_data after __mlx4_remove_one()", there are two mlx4 pci callbacks which will attempt to release the mlx4_priv object -- .shutdown and .remove. This leads to a use-after-free access to the already freed mlx4_priv instance and trigger a "Kernel access of bad area" crash when both .shutdown and .remove are called. During reboot or kexec, .shutdown is called, with the VFs probed to the host going through shutdown first and then the PF. Later, the PF will trigger VFs' .remove since VFs still have driver attached. Fix that by keeping only one driver entry which releases mlx4_priv. Fixes: befdf89 ('net/mlx4_core: Preserve pci_dev_data after __mlx4_remove_one()') CC: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Wei Yang <weiyang@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-11net/mlx4_core: Fix SRIOV free-pool management when enforcing resource quotasJack Morgenstein
The Hypervisor driver tracks free slots and reserved slots at the global level and tracks allocated slots and guaranteed slots per VF. Guaranteed slots are treated as reserved by the driver, so the total reserved slots is the sum of all guaranteed slots over all the VFs. As VFs allocate resources, free (global) is decremented and allocated (per VF) is incremented for those resources. However, reserved (global) is never changed. This means that effectively, when a VF allocates a resource from its guaranteed pool, it is actually reducing that resource's free pool (since the global reserved count was not also reduced). The fix for this problem is the following: For each resource, as long as a VF's allocated count is <= its guaranteed number, when allocating for that VF, the reserved count (global) should be reduced by the allocation as well. When the global reserved count reaches zero, the remaining global free count is still accessible as the free pool for that resource. When the VF frees resources, the reverse happens: the global reserved count for a resource is incremented only once the VFs allocated number falls below its guaranteed number. This fix was developed by Rick Kready <kready@us.ibm.com> Reported-by: Rick Kready <kready@us.ibm.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-10Merge tag 'rdma-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband Pull main InfiniBand/RDMA updates from Roland Dreier: - add iWARP port mapper to avoid conflicts between RDMA and normal stack TCP connections. - fixes for i386 / x86-64 structure padding differences (ABI compatibility for 32-on-64) from Yann Droneaud. - a pile of SRP initiator fixes from Bart Van Assche. - fixes for a writeback / memory allocation deadlock with NFS over IPoIB connected mode from Jiri Kosina. - the usual fixes and cleanups to mlx4, mlx5, cxgb4 and other low-level drivers. * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (61 commits) RDMA/cxgb4: Add support for iWARP Port Mapper user space service RDMA/nes: Add support for iWARP Port Mapper user space service RDMA/core: Add support for iWARP Port Mapper user space service IB/mlx4: Fix gfp passing in create_qp_common() IB/umad: Fix use-after-free on close IB/core: Fix kobject leak on device register error flow RDMA/cxgb4: add missing padding at end of struct c4iw_alloc_ucontext_resp mlx4_core: Fix GFP flags parameters to be gfp_t IB/core: Fix port kobject deletion during error flow IB/core: Remove unneeded kobject_get/put calls IB/core: Fix sparse warnings about redeclared functions IB/mad: Fix sparse warning about gfp_t use IB/mlx4: Implement IB_QP_CREATE_USE_GFP_NOIO IB: Add a QP creation flag to use GFP_NOIO allocations IB: Return error for unsupported QP creation flags IB: Allow build of hw/ and ulp/ subdirectories independently mlx4_core: Move handling of MLX4_QP_ST_MLX to proper switch statement RDMA/cxgb4: Add missing padding at end of struct c4iw_create_cq_resp IB/srp: Avoid problems if a header uses pr_fmt IB/umad: Fix error handling ...
2014-06-10Merge branches 'core', 'cxgb3', 'cxgb4', 'iser', 'iwpm', 'misc', 'mlx4', ↵Roland Dreier
'mlx5', 'noio', 'ocrdma', 'qib', 'srp' and 'usnic' into for-next
2014-06-06net: use SPEED_UNKNOWN and DUPLEX_UNKNOWN when appropriateJiri Pirko
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-06-04mlx4_core: Fix GFP flags parameters to be gfp_tRoland Dreier
Otherwise sparse gives a bunch of warnings like drivers/net/ethernet/mellanox/mlx4/srq.c:110:66: sparse: incorrect type in argument 4 (different base types) drivers/net/ethernet/mellanox/mlx4/srq.c:110:66: expected int [signed] gfp drivers/net/ethernet/mellanox/mlx4/srq.c:110:66: got restricted gfp_t Signed-off-by: Roland Dreier <roland@purestorage.com>
2014-06-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: include/net/inetpeer.h net/ipv6/output_core.c Changes in net were fixing bugs in code removed in net-next. Signed-off-by: David S. Miller <davem@davemloft.net>