path: root/include
Age | Commit message | Author
2015-08-30 | IB/core: Add RoCE GID table management | Matan Barak
RoCE GIDs are based on IP addresses configured on Ethernet net-devices which relate to the RDMA (RoCE) device port. Currently, each of the low-level drivers that support RoCE (ocrdma, mlx4) manages its own RoCE port GID table. As there's nothing which is essentially vendor specific, we generalize that, and enhance the RDMA core GID cache to do this job.
In order to populate the GID table, we listen for events:
(a) netdev up/down/change_addr events - if a netdev is built onto our RoCE device, we need to add/delete its IPs. This involves adding all GIDs related to this ndev, adding default GIDs, etc.
(b) inet events - add new GIDs (according to the IP addresses) to the table.
For programming the port RoCE GID table, providers must implement the add_gid and del_gid callbacks.
RoCE GID management requires us to state the associated net_device alongside the GID. This information is necessary in order to manage the GID table. For example, when a net_device is removed, its associated GIDs need to be removed as well.
RoCE mandates generating a default GID for each port, based on the related net-device's IPv6 link local. In contrast to the GID based on the regular IPv6 link-local (as we generate GID per IP address), the default GID is also available when the net device is down (in order to support loopback).
Locking is done as follows: The patch modifies the GID table code both for new RoCE drivers implementing the add_gid/del_gid callbacks and for current RoCE and IB drivers that do not. The flows for updating the table are different, so the locking requirements are too. While updating the RoCE GID table, protection against multiple writers is achieved via mutex_lock(&table->lock). Since writing to a table requires us to find an entry (possibly a free entry) in the table and then modify it, this mutex protects both the find_gid and write_gid, ensuring the atomicity of the action. Each entry in the GID cache is protected by an rwlock. In RoCE, writing (usually resulting from a netdev notifier) involves invoking the vendor's add_gid and del_gid callbacks, which could sleep. Therefore, an invalid flag is added for each entry. Updates for RoCE are done via a workqueue, thus sleeping is permitted. In IB, updates are done in write_lock_irq(&device->cache.lock), thus write_gid isn't allowed to sleep and add_gid/del_gid are not called.
When passing a net-device into/out of the GID cache, the device is always passed held (dev_hold).
The code uses a single work item for updating all RDMA devices, following a netdev or inet notifier.
The patch moves the cache from being a client (which was incorrect, as the cache is part of the IB infrastructure) to being explicitly initialized/freed when a device is registered/removed.
Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 | net/bonding: Export bond_option_active_slave_get_rcu | Matan Barak
Some consumers of the netdev events API would like to know which slave is active when a NETDEV_CHANGEUPPER or NETDEV_BONDING_FAILOVER event occurs. For example, when managing RoCE GIDs, GIDs based on the bond's IPs should only be set on the port which corresponds to the active slave netdevice. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 | net: Add info for NETDEV_CHANGEUPPER event | Matan Barak
Some consumers of the NETDEV_CHANGEUPPER event would like to know which upper device was linked/unlinked and what operation was carried out. Add information in the notifier info block for that purpose. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 | net/ipv6: Export addrconf_ifid_eui48 | Matan Barak
For loopback purposes, RoCE devices should have a default GID in the port GID table, even when the interface is down. In order to do so, we use as a default GID the IPv6 link-local address which would have been generated for the related Ethernet netdevice when it goes up. addrconf_ifid_eui48 is used to generate this address, so export it. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 | IB/core: Drop ib_alloc_fast_reg_mr | Sagi Grimberg
Fully replaced by a more generic and suitable ib_alloc_mr. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 | IB: Modify ib_create_mr API | Sagi Grimberg
Use ib_alloc_mr with specific parameters. Change the existing callers. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 | IB/core: Get rid of redundant verb ib_destroy_mr | Sagi Grimberg
This was added with the idea of uniting all MR allocation and deallocation routines, but in fact we already have a single deallocation routine, ib_dereg_mr. Also move the mlx5_ib_destroy_mr-specific logic into mlx5_ib_dereg_mr (only signature handling for now), and fix up the only callers (iser/isert) accordingly. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 | IB/cm: Remove compare_data checks | Haggai Eran
Now that there are no ib_cm clients using the compare_data feature for matching IB CM requests' private data, remove the compare_data parameter of ib_cm_listen and remove the code implementing the feature. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 | IB/cm: Expose BTH P_Key in CM and SIDR request events | Haggai Eran
The rdma_cm module will later use the P_Key from the BTH to de-mux requests. See discussion at: http://www.spinics.net/lists/netdev/msg336067.html Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Liran Liss <liranl@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 | IB/cm: Share listening CM IDs | Haggai Eran
Enabling network namespaces for RDMA CM will allow processes on different namespaces to listen on the same port. In order to leave namespace support out of the CM layer, this requires that multiple RDMA CM IDs will be able to share a single CM ID. This patch adds infrastructure to retrieve an existing listening ib_cm_id, based on its device and service ID, or create a new one if one does not already exist. It also adds a reference count for such instances (cm_id_private.listen_sharecount), and prevents cm_destroy_id from destroying a CM if it is still shared. See the relevant discussion [1]. [1] Re: [PATCH v3 for-next 05/13] IB/cm: Reference count ib_cm_ids http://www.spinics.net/lists/netdev/msg328860.html Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
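The get-or-create and share-count behaviour described above can be sketched in userspace C roughly as follows. This is only an illustration under stated assumptions, not the ib_cm code: `struct listen_id`, `listen_get`, `listen_put`, and the fixed-size table are hypothetical names, devices are reduced to integers, and all locking is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_LISTENERS 8

/* Hypothetical stand-in for a listening CM ID keyed by device and service ID. */
struct listen_id {
    bool     in_use;
    int      device;
    uint64_t service_id;
    unsigned sharecount;   /* number of users sharing this listener */
};

static struct listen_id listen_table[MAX_LISTENERS];

/* Return the existing listener for (device, service_id), or create one. */
struct listen_id *listen_get(int device, uint64_t service_id)
{
    struct listen_id *free_slot = NULL;

    for (size_t i = 0; i < MAX_LISTENERS; i++) {
        struct listen_id *id = &listen_table[i];

        if (id->in_use && id->device == device &&
            id->service_id == service_id) {
            id->sharecount++;          /* share the existing listener */
            return id;
        }
        if (!id->in_use && !free_slot)
            free_slot = id;            /* remember the first free slot */
    }
    if (!free_slot)
        return NULL;                   /* table full */

    *free_slot = (struct listen_id){ true, device, service_id, 1 };
    return free_slot;
}

/* Drop one user; returns true only when the listener was actually destroyed. */
bool listen_put(struct listen_id *id)
{
    assert(id && id->in_use && id->sharecount > 0);
    if (--id->sharecount > 0)
        return false;                  /* still shared, keep it alive */
    id->in_use = false;
    return true;
}
```

Two callers asking for the same device and service ID receive the same object, and only the last `listen_put` releases it, which mirrors the idea of preventing destruction while the ID is still shared.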
2015-08-30 | IB/cm: Expose service ID in request events | Haggai Eran
Expose the service ID on an incoming CM or SIDR request to the event handler. This will allow the RDMA CM module to de-multiplex connection requests based on the information encoded in the service ID. Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 | IB/core: Find the network device matching connection parameters | Yotam Kenneth
In the case of IPoIB, and maybe in other cases, the network device is managed by an upper-layer protocol (ULP). In order to expose this network device to other users of the IB device, let ULPs implement a callback that returns a network device according to connection parameters. The IB device and port, together with the P_Key and the GID, should be enough to uniquely identify the ULP net device. However, in current kernels there can be multiple IPoIB interfaces created with the same GID. Furthermore, such a configuration may be desirable to support ipvlan-like configurations for RDMA CM with IPoIB. To resolve the device in these cases the code will also take the IP address as an additional input. Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Yotam Kenneth <yotamke@mellanox.com> Signed-off-by: Shachar Raindel <raindel@mellanox.com> Signed-off-by: Guy Shapiro <guysh@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-30 | IB/core: lock client data with lists_rwsem | Haggai Eran
An ib_client callback that is called with the lists_rwsem locked only for read is protected from changes to the IB client lists, but not from ib_unregister_device() freeing its client data. This is because ib_unregister_device() will remove the device from the device list with lists_rwsem locked for write, but perform the rest of the cleanup, including the call to remove() without that lock. Mark client data that is undergoing de-registration with a new going_down flag in the client data context. Lock the client data list with lists_rwsem for write in addition to using the spinlock, so that functions calling the callback would be able to lock only lists_rwsem for read and let callbacks sleep. Since ib_unregister_client() now marks the client data context, no need for remove() to search the context again, so pass the client data directly to remove() callbacks. Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-31 | drm/exynos: add macro to get the address of START_S reg | Gustavo Padovan
This macro is needed to get the value of the START shadow register, which tells whether a framebuffer is currently displayed on the screen or not. Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2015-08-30 | Merge remote-tracking branches 'asoc/topic/wm0010', 'asoc/topic/wm5100', 'asoc/topic/wm5110', 'asoc/topic/wm8004' and 'asoc/topic/wm8731' into asoc-next | Mark Brown
2015-08-30 | Merge remote-tracking branches 'asoc/topic/tas2552', 'asoc/topic/tas5086', 'asoc/topic/tegra', 'asoc/topic/tlv' and 'asoc/topic/topology' into asoc-next | Mark Brown
2015-08-30 | Merge remote-tracking branches 'asoc/topic/rcar', 'asoc/topic/reg-default', 'asoc/topic/rl6231', 'asoc/topic/rockchip' and 'asoc/topic/rt286' into asoc-next | Mark Brown
2015-08-30 | Merge remote-tracking branches 'asoc/topic/mediatek', 'asoc/topic/mtk', 'asoc/topic/nuc900', 'asoc/topic/of-name' and 'asoc/topic/omap' into asoc-next | Mark Brown
2015-08-30 | Merge remote-tracking branches 'asoc/topic/davinci', 'asoc/topic/davinci-vcif', 'asoc/topic/doc' and 'asoc/topic/dpcm' into asoc-next | Mark Brown
2015-08-30 | Merge remote-tracking branches 'asoc/topic/88pm860x', 'asoc/topic/ac97', 'asoc/topic/ak4542', 'asoc/topic/arizona' and 'asoc/topic/atmel' into asoc-next | Mark Brown
2015-08-30 | Merge remote-tracking branch 'asoc/topic/ssm4567' into asoc-next | Mark Brown
2015-08-30 | Merge remote-tracking branch 'asoc/topic/rt5645' into asoc-next | Mark Brown
2015-08-30 | Merge remote-tracking branch 'asoc/topic/dapm' into asoc-next | Mark Brown
2015-08-30 | Merge remote-tracking branch 'asoc/topic/core' into asoc-next | Mark Brown
2015-08-30 | Merge remote-tracking branches 'regulator/topic/qcom-smd', 'regulator/topic/qcom-spmi', 'regulator/topic/rk808', 'regulator/topic/stub' and 'regulator/topic/tol' into regulator-next | Mark Brown
2015-08-30 | Merge remote-tracking branches 'regulator/topic/mt6311', 'regulator/topic/ocp', 'regulator/topic/owner', 'regulator/topic/pfuze100' and 'regulator/topic/pwm' into regulator-next | Mark Brown
2015-08-30 | Merge remote-tracking branches 'regulator/topic/lp872x', 'regulator/topic/ltc3589', 'regulator/topic/max77693' and 'regulator/topic/max8973' into regulator-next | Mark Brown
2015-08-30 | Merge remote-tracking branches 'regulator/topic/da9210', 'regulator/topic/da9211', 'regulator/topic/fan53555', 'regulator/topic/isl9305' and 'regulator/topic/list' into regulator-next | Mark Brown
2015-08-30 | regmap: regmap max_raw_read/write getter functions | Markus Pargmann
Add functions to access the maximum size we can read/write using regmap_raw_read/write(). This helps drivers that need to know how much they can write with the raw functions without problems. There are some devices (e.g. bmc150) that have fifos as registers which need to be read in specific chunks otherwise samples are dropped. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Mark Brown <broonie@kernel.org>
2015-08-30 | regmap: Introduce max_raw_read/write for regmap_bulk_read/write | Markus Pargmann
There are some buses which have a limit on the maximum number of bytes that can be sent/received. An example of this is I2C_FUNC_SMBUS_I2C_BLOCK, which does not support any reads/writes of more than 32 bytes. The regmap_bulk operations should still be able to utilize the full 32 bytes in this case. Signed-off-by: Markus Pargmann <mpa@pengutronix.de> Signed-off-by: Mark Brown <broonie@kernel.org>
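The chunking idea behind a per-bus transfer limit might look roughly like the userspace sketch below. It is not the regmap implementation: `chunked_read`, `raw_read_fn`, and `fake_bus_read` are hypothetical names, and the 32-byte limit simply mirrors the I2C_FUNC_SMBUS_I2C_BLOCK example from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bus callback: read len bytes starting at offset off. */
typedef int (*raw_read_fn)(void *ctx, size_t off, unsigned char *buf, size_t len);

/*
 * Split one large read into pieces of at most max_raw_read bytes.
 * max_raw_read == 0 means "no limit". Returns 0 on success or the
 * first non-zero error returned by the callback.
 */
int chunked_read(raw_read_fn read, void *ctx, size_t off,
                 unsigned char *buf, size_t len, size_t max_raw_read)
{
    size_t chunk = (max_raw_read && max_raw_read < len) ? max_raw_read : len;

    assert(read != NULL);
    while (len > 0) {
        size_t n = len < chunk ? len : chunk;
        int ret = read(ctx, off, buf, n);

        if (ret)
            return ret;
        off += n;
        buf += n;
        len -= n;
    }
    return 0;
}

/* Fake bus for illustration: rejects reads above 32 bytes and counts calls. */
struct fake_bus {
    const unsigned char *mem;
    int calls;
};

int fake_bus_read(void *ctx, size_t off, unsigned char *buf, size_t len)
{
    struct fake_bus *bus = ctx;

    if (len > 32)
        return -1;
    memcpy(buf, bus->mem + off, len);
    bus->calls++;
    return 0;
}
```

With an 80-byte request and a 32-byte limit, the fake bus is called three times (32, 32, and 16 bytes), so each full-sized transfer uses the whole 32 bytes the bus allows.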
2015-08-30 | Merge branches 'fix/raw', 'topic/core', 'topic/i2c', 'topic/raw' and 'topic/doc' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap into regmap-smbus-block | Mark Brown
2015-08-29 | vxlan: do not receive IPv4 packets on IPv6 socket | Jiri Benc
By default (subject to the sysctl settings), IPv6 sockets listen also for IPv4 traffic. Vxlan is not prepared for that and expects an IPv6 header in packets received through an IPv6 socket. In addition, it's currently not possible to have both IPv4 and IPv6 vxlan tunnels on the same port (unless the bindv6only sysctl is enabled), as it's not possible to create and bind both IPv4 and IPv6 vxlan interfaces and there's no way to specify both IPv4 and IPv6 remote/group IP addresses. Set IPV6_V6ONLY on vxlan sockets to fix both of these issues. This is not done globally in udp_tunnel, as l2tp and tipc seem to work okay when receiving IPv4 packets on an IPv6 socket and people may rely on this behavior. The other tunnels (geneve and fou) do not support IPv6. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-29 | ip_tunnels: record IP version in tunnel info | Jiri Benc
There's currently nothing preventing directing packets with IPv6 encapsulation data to IPv4 tunnels (and vice versa). If this happens, IPv6 addresses are incorrectly interpreted as IPv4 ones. Track whether the given ip_tunnel_key contains IPv4 or IPv6 data. Store this in ip_tunnel_info. Reject packets at appropriate places if they are supposed to be encapsulated into an incompatible protocol. Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-29 | ip_tunnels: convert the mode field of ip_tunnel_info to flags | Jiri Benc
The mode field holds a single bit of information only (whether the ip_tunnel_info struct is for rx or tx). Change the mode field to bit flags. This allows more mode flags to be added. Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Thomas Graf <tgraf@suug.ch> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
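As a rough illustration of turning a single-valued mode field into bit flags, consider the sketch below. The names `TUN_INFO_TX`, `TUN_INFO_IPV6`, `struct tun_info`, and the helper functions are hypothetical and are not the kernel's definitions; the second flag is only an example of the kind of additional mode bit this layout allows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag names; each one occupies a single bit of the mode field. */
#define TUN_INFO_TX   0x01u  /* set: metadata is for transmit; clear: receive */
#define TUN_INFO_IPV6 0x02u  /* example of a later flag: key holds IPv6 data */
#define TUN_INFO_ALL  (TUN_INFO_TX | TUN_INFO_IPV6)

struct tun_info {
    uint8_t mode;  /* bitwise OR of TUN_INFO_* flags */
};

/* Set one or more flags without disturbing the others. */
void tun_info_set(struct tun_info *info, uint8_t flags)
{
    assert((flags & ~TUN_INFO_ALL) == 0);  /* only known flags are accepted */
    info->mode |= flags;
}

/* Direction is now a bit test rather than an equality comparison. */
bool tun_info_is_tx(const struct tun_info *info)
{
    return (info->mode & TUN_INFO_TX) != 0;
}

/* True when the recorded address family matches what the caller expects. */
bool tun_info_family_ok(const struct tun_info *info, bool want_ipv6)
{
    return ((info->mode & TUN_INFO_IPV6) != 0) == want_ipv6;
}
```

Because callers test individual bits, adding a new flag does not change the meaning of existing checks such as `tun_info_is_tx`.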
2015-08-29 | net: FIB tracepoints | David Ahern
A few tracepoints that are useful when developing the VRF driver. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-28 | netlink: add NETLINK_CAP_ACK socket option | Christophe Ricard
Since commit c05cdb1b864f ("netlink: allow large data transfers from user-space"), the kernel may fail to allocate the necessary room for the acknowledgment message back to userspace. This patch introduces a new socket option that trims off the payload of the original netlink message. The netlink message header is still included, so the user can guess from the sequence number what is the message that has triggered the acknowledgment. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-28 | libnvdimm, pmem: direct map legacy pmem by default | Dan Williams
The expectation is that the legacy / non-standard pmem discovery method (e820 type-12) will only ever be used to describe small quantities of persistent memory. Larger capacities will be described via the ACPI NFIT. When "allocate struct page from pmem" support is added this default policy can be overridden by assigning a legacy pmem namespace to a pfn device; however, this would only be necessary if a platform used the legacy mechanism to define a very large range. Cc: Christoph Hellwig <hch@lst.de> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2015-08-28 | RDMA/Core: remove rdma_cap_read_multi_sge() helper | Steve Wise
This functionality already exists via the max_sge_rd device capability. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28 | svcrdma: Use max_sge_rd for destination read depths | Steve Wise
Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28 | IB/core: Add core header changes needed for OPA | Dennis Dalessandro
This patch adds the value of the CNP opcode to the existing list of enumerated opcodes in ib_pack.h.
Add common OPA header definitions for driver build:
- opa_port_info.h
- opa_smi.h
- hfi1_user.h
Additionally, ib_mad.h has additional definitions that are common to ib_drivers, including:
- trap support
- cca support
The qib driver has the duplication removed in favor of those in ib_mad.h.
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: John, Jubin <jubin.john@intel.com> Signed-off-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-28 | mlx5: Fix missing device local_dma_lkey | Sagi Grimberg
The mlx5 driver exposes device capability IB_DEVICE_LOCAL_DMA_LKEY but does not set the device local_dma_lkey. This breaks rpcrdma drivers. Query and set this lkey when creating the device resources. Signed-off-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-08-29 | PM / Domains: Remove unusable governor dummies | Geert Uytterhoeven
The governor dummies for the !CONFIG_PM_GENERIC_DOMAINS case are unusable, as a governor is always referred to by taking its address, which you can't do with a literal NULL pointer. I.e.
    pm_genpd_init(genpd, &simple_qos_governor, false);
fails to compile with:
    error: lvalue required as unary '&' operand
Hence just remove the governor dummies.
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-08-28 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next | David S. Miller
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter/IPVS updates for your net-next tree. In sum, patches to address fallout from the previous round plus updates from the IPVS folks via Simon Horman, they are:
1) Add a new scheduler to IPVS: The weighted overflow scheduling algorithm directs network connections to the server with the highest weight that is currently available and overflows to the next when active connections exceed the node's weight. From Raducu Deaconu.
2) Fix locking ordering in IPVS, always take rtnl_lock in first place. Patch from Julian Anastasov.
3) Allow to indicate the MTU to the IPVS in-kernel state sync daemon. From Julian Anastasov.
4) Enhance multicast configuration for the IPVS state sync daemon. Also from Julian.
5) Resolve sparse warnings in the nf_dup modules.
6) Fix a linking problem when CONFIG_NF_DUP_IPV6 is not set.
7) Add ICMP codes 5 and 6 to IPv6 REJECT target, they are more informative subsets of code 1. From Andreas Herz.
8) Revert the jumpstack size calculation from mark_source_chains due to chain depth miscalculations, from Florian Westphal.
9) Calm down more sparse warnings around the Netfilter tree, again from Florian Westphal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
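The weighted overflow selection rule from item 1 can be sketched in userspace C as below. This is an assumption-laden illustration of the rule as worded in the message, not the IPVS scheduler: `struct ovf_server` and `ovf_pick` are hypothetical names, and "available" is reduced to a boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of a real server as seen by the scheduler. */
struct ovf_server {
    int  weight;     /* configured weight; 0 means "do not schedule" */
    int  active;     /* current number of active connections */
    bool available;  /* false when the server is down or overloaded */
};

/*
 * Pick the available server with the highest weight whose active
 * connection count does not exceed that weight. Returns the index of
 * the chosen server, or -1 when every server is unavailable or full.
 */
int ovf_pick(const struct ovf_server *s, size_t n)
{
    int best = -1;

    assert(s != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!s[i].available || s[i].weight <= 0 ||
            s[i].active > s[i].weight)
            continue;                  /* overflow to the next candidate */
        if (best < 0 || s[i].weight > s[best].weight)
            best = (int)i;
    }
    return best;
}
```

Traffic stays on the heaviest server until its active connections exceed its weight, and only then spills over to the next heaviest one.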
2015-08-28 | lib: introduce strncpy_from_unsafe() | Alexei Starovoitov
generalize FETCH_FUNC_NAME(memory, string) into strncpy_from_unsafe() and fix sparse warnings that were present in original implementation. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-28 | Merge branches 'pci/enumeration' and 'pci/misc' into next | Bjorn Helgaas
* pci/enumeration:
  PCI: Set MPS to match upstream bridge
  PCI: Move MPS configuration check to pci_configure_device()
  PCI: Drop references acquired by of_parse_phandle()
  PCI/MSI: Remove unused pcibios_msi_controller() hook
  ARM/PCI: Remove msi_controller from struct pci_sys_data
  ARM/PCI, designware, xilinx: Use pci_scan_root_bus_msi()
  PCI: Add pci_scan_root_bus_msi()
  ARM/PCI: Replace panic with WARN messages on failures
  PCI: generic: Add arm64 support
  PCI: Build setup-irq.o for arm64
  PCI: generic: Remove dependency on ARM-specific struct hw_pci
  ARM/PCI: Set MPS before pci_bus_add_devices()
* pci/misc:
  PCI: Disable async suspend/resume for JMicron multi-function SATA/AHCI
2015-08-28 | net: Add support for VRFs to inetpeer cache | David Ahern
inetpeer caches based on address only, so duplicate IP addresses within a namespace return the same cached entry. Enhance the ipv4 address key to contain both the IPv4 address and VRF device index. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-28 | net: Refactor inetpeer address struct | David Ahern
Move the inetpeer_addr_base union to inetpeer_addr and drop inetpeer_addr_base. Both the a6 and in6_addr overlays are not needed; drop the __be32 version and rename in6 to a6 for consistency with ipv4. Add a new u32 array to the union which removes the need for the typecast in the compare function and the use of a consistent arg for both ipv4 and ipv6 addresses which makes the compare function more readable. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
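To show why a word-array overlay removes the need for casts in the compare function, here is a small userspace sketch. The names `struct peer_addr`, `PEER_V4`, `PEER_V6`, and `peer_addr_cmp` are hypothetical and the layout is only assumed to resemble the one described above; it is not the kernel's inetpeer code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PEER_MAXKEY 4   /* four 32-bit words cover a 128-bit IPv6 address */

enum peer_family { PEER_V4 = 4, PEER_V6 = 6 };

/* Hypothetical address key: every view shares storage with key[]. */
struct peer_addr {
    union {
        uint32_t a4;                /* IPv4 address */
        uint8_t  a6[16];            /* IPv6 address bytes */
        uint32_t key[PEER_MAXKEY];  /* generic view used for comparison */
    };
    enum peer_family family;
};

/*
 * Compare two addresses of the same family word by word.
 * Returns -1, 0 or 1, so it can order entries in a search tree.
 */
int peer_addr_cmp(const struct peer_addr *a, const struct peer_addr *b)
{
    size_t n = (a->family == PEER_V4) ? 1 : PEER_MAXKEY;

    assert(a->family == b->family);
    for (size_t i = 0; i < n; i++) {
        if (a->key[i] == b->key[i])
            continue;
        return a->key[i] < b->key[i] ? -1 : 1;
    }
    return 0;
}
```

The same loop serves both families; only the number of words compared changes, which is what keeps the compare function short and free of typecasts.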
2015-08-28 | net: Add helper function to compare inetpeer addresses | David Ahern
tcp_metrics and inetpeer both have functions to compare inetpeer addresses. Consolidate into 1 version. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-28 | net: Add set,get helpers for inetpeer addresses | David Ahern
Use inetpeer set,get helpers in tcp_metrics rather than peeking into the inetpeer_addr struct. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-08-28 | net: Introduce ipv4_addr_hash and use it for tcp metrics | David Ahern
Refactors a common line into helper function. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>