summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-07-17ipvlan: Stop advertising NETIF_F_UFO support.David S. Miller
It is going away. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17macb: Remove bogus reference to NETIF_F_UFO.David S. Miller
This driver doesn't actually support UFO explicitly yet it advertises this in netdev->features. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17s2io: Remove UFO support.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17Merge branch 'xdp-redirect'David S. Miller
John Fastabend says: ==================== Implement XDP bpf_redirect This series adds two new XDP helper routines bpf_redirect() and bpf_redirect_map(). The first variant bpf_redirect() is meant to be used the same way it is currently being used by the cls_bpf classifier. An xdp packet will be redirected immediately when this is called. The other variant bpf_redirect_map(map, key, flags) uses a new map type called devmap. A devmap uses integers as keys and net_devices as values. The user provies key/ifindex pairs to update the map with new net_devices. This provides two benefits over the normal variant 'bpf_redirect()'. First the datapath bpf program is abstracted away from using hard-coded ifindex values. Allowing a single bpf program to be run any many different environments. Second, and perhaps more important, the map enables batching packet transmits. The map plus small driver changes allows for batching all send requests across a NAPI poll loop. This allows driver writers to optimize the driver xmit path and only call expensive operations once for a batch of xdp_buffs. The devmap was designed to support possible future work for multicast and broadcast as follow-up patches. To see, in more detail, how to leverage the new helpers and map from the userspace side please review these two patches, xdp: sample program for new bpf_redirect helper xdp: bpf redirect with map sample program Performance numbers provided by Jesper are the following, tested using the ixgbe driver with CPU E5-1650 v4 @ 3.60GHz: 13,939,674 pkt/s = XDP_DROP without touching memory 14,290,650 pkt/s = xdp1: XDP_DROP with reading packet data 13,221,812 pkt/s = xdp2: XDP_TX with swap mac (writes into pkt) 7,596,576 pkt/s = xdp_redirect: XDP_REDIRECT with swap mac (like XDP_TX) 13,058,435 pkt/s = xdp_redirect_map:XDP_REDIRECT with swap mac + devmap A big thanks to everyone who helped with this series. Jesper provided fixes, debugging, code review, performance benchmarks! Daniel provided lots of useful feedback and code review. And last but not least Andy provided useful feedback related to supporting additional drivers, generic xdp implementation, testing, etc. Any other feedback is welcome but I believe at this point these are ready to be merged! Whats left... get the rest of the drivers developers to implement this in all the drivers. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17xdp: bpf redirect with map sample programJohn Fastabend
Signed-off-by: John Fastabend <john.fastabend@gmail.com> Tested-by: Andy Gospodarek <andy@greyhouse.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17net: add notifier hooks for devmap bpf mapJohn Fastabend
The BPF map devmap holds a refcnt on the net_device structure when it is in the map. We need to do this to ensure on driver unload we don't lose a dev reference. However, its not very convenient to have to manually unload the map when destroying a net device so add notifier handlers to do the cleanup automatically. But this creates a race between update/destroy BPF syscall and programs and the unregister netdev hook. Unfortunately, the best I could come up with is either to live with requiring manual removal of net devices from the map before removing the net device OR to add a mutex in devmap to ensure the map is not modified while we are removing a device. The fallout also requires that BPF programs no longer update/delete the map from the BPF program side because the mutex may sleep and this can not be done from inside an rcu critical section. This is not a real problem though because I have not come up with any use cases where this is actually useful in practice. If/when we come up with a compelling user for this we may need to revisit this. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17xdp: Add batching support to redirect mapJohn Fastabend
For performance reasons we want to avoid updating the tail pointer in the driver tx ring as much as possible. To accomplish this we add batching support to the redirect path in XDP. This adds another ndo op "xdp_flush" that is used to inform the driver that it should bump the tail pointer on the TX ring. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17bpf: add bpf_redirect_map helper routineJohn Fastabend
BPF programs can use the devmap with a bpf_redirect_map() helper routine to forward packets to netdevice in map. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17bpf: add devmap, a map for storing net device referencesJohn Fastabend
Device map (devmap) is a BPF map, primarily useful for networking applications, that uses a key to lookup a reference to a netdevice. The map provides a clean way for BPF programs to build virtual port to physical port maps. Additionally, it provides a scoping function for the redirect action itself allowing multiple optimizations. Future patches will leverage the map to provide batching at the XDP layer. Another optimization/feature, that is not yet implemented, would be to support multiple netdevices per key to support efficient multicast and broadcast support. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17xdp: add trace event for xdp redirectJohn Fastabend
This adds a trace event for xdp redirect which may help when debugging XDP programs that use redirect bpf commands. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17ixgbe: add initial support for xdp redirectJohn Fastabend
There are optimizations we can add after the basic feature is enabled. But, for now keep the patch simple. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17net: implement XDP_REDIRECT for xdp genericJohn Fastabend
Add support for redirect to xdp generic creating a fall back for devices that do not yet have support and allowing test infrastructure using veth pairs to be built. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Tested-by: Andy Gospodarek <andy@greyhouse.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17xdp: sample program for new bpf_redirect helperJohn Fastabend
This implements a sample program for testing bpf_redirect. It reports the number of packets redirected per second and as input takes the ifindex of the device to run the xdp program on and the ifindex of the interface to redirect packets to. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Tested-by: Andy Gospodarek <andy@greyhouse.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17xdp: add bpf_redirect helper functionJohn Fastabend
This adds support for a bpf_redirect helper function to the XDP infrastructure. For now this only supports redirecting to the egress path of a port. In order to support drivers handling a xdp_buff natively this patches uses a new ndo operation ndo_xdp_xmit() that takes pushes a xdp_buff to the specified device. If the program specifies either (a) an unknown device or (b) a device that does not support the operation a BPF warning is thrown and the XDP_ABORTED error code is returned. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17net: xdp: support xdp generic on virtual devicesJohn Fastabend
XDP generic allows users to test XDP programs and/or run them with degraded performance on devices that do not yet support XDP. For testing I typically test eBPF programs using a set of veth devices. This allows testing topologies that would otherwise be difficult to setup especially in the early stages of development. This patch adds a xdp generic hook to the netif_rx_internal() function which is called from dev_forward_skb(). With this addition attaching XDP programs to veth devices works as expected! Also I noticed multiple drivers using netif_rx(). These devices will also benefit and generic XDP will work for them as well. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Tested-by: Andy Gospodarek <andy@greyhouse.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17ixgbe: NULL xdp_tx rings on resource cleanupJohn Fastabend
tx_rings and rx_rings are cleaned up on close paths in ixgbe driver however, xdp_rings are not. Set the xdp_rings to NULL here so that we can use the pointer to indicate if the XDP rings are initialized. Signed-off-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17jhash: fix -Wimplicit-fallthrough warningsJakub Kicinski
GCC 7 added a new -Wimplicit-fallthrough warning. It's only enabled with W=1, but since linux/jhash.h is included in over hundred places (including other global headers) it seems worthwhile fixing this warning. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17Merge branch 'mlxsw-traps'David S. Miller
Jiri Pirko says: ==================== mlxsw: Traps enhancements Ido says: The first patch makes sure the driver marks packets that were trapped in the router and might have already been flooded by the bridge, so that the bridge driver won't flood them again. This isn't critical at this time point, but will be when Neighbour Discovery traps are introduced as these are multicast packets that are trapped in the router. The second and third patches add new traps - for MLD and Router Alert packets. The last patch takes advantage of that and floods IPv6 unregistered multicast packets only to mrouter ports instead of all ports. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17mlxsw: spectrum: Improve IPv6 unregistered multicast floodingArkadi Sharshevsky
Up until now IPv6 unregistered multicast traffic would be flooded like broadcast, even when MLD snooping was enabled on the bridge. This was intentional as MLD packet traps were missing, preventing the bridge driver from programming MDB entries to the device. Previous patch added these traps, so we can now finally flood IPv6 unregistered multicast packets to specific ports via the multicast table instead of flooding them to all ports via the broadcast table. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17mlxsw: spectrum: Add support for IPv6 MLDv1/2 trapsArkadi Sharshevsky
Add support for IPv6 MLDv1/2 packet trapping. Signed-off-by: Arkadi Sharshevsky <arkadis@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17mlxsw: spectrum: Trap IPv4 packets with Router Alert optionIdo Schimmel
In case local sockets have the IP_ROUTER_ALERT socket option set, then they expect to get packets with the Router Alert option. Trap such packets, so that the kernel could inspect them and potentially send them to interested sockets. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17mlxsw: spectrum: Mark packets trapped in routerIdo Schimmel
In commit 1c6c6d221e2b ("mlxsw: spectrum: Mirror certain packets to CPU") we marked packets that were mirrored to the CPU, so that they won't be flooded again by the bridge driver. However, certain packets are trapped in the device's router block, after passing through the bridge block where they were potentially flooded. Mark all packets coming from L3 traps, so that they won't be potentially flooded again by the bridge driver. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17Merge branch 'mlxsw-ttl-tos'David S. Miller
Jiri Pirko says: ==================== mlxsw: offloading matches on ip ttl and tos Or says: Support offloading matches on ip ttl and tos ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17mlxsw: spectrum_flower: Add support for ip tosOr Gerlitz
Support offloading rules that match on ip tos. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17mlxsw: spectrum: Add tos to the ipv4 acl blockOr Gerlitz
Add ecn and dscp fields to the ipv4 acl block. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17mlxsw: acl: Add ip tos acl elementOr Gerlitz
Define new element for ip tos (ecn, dscp) and place it into scratch area. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17mlxsw: spectrum_flower: Add support for ip ttlOr Gerlitz
Support offloading rules that match on ip ttl. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17mlxsw: spectrum: Add ttl to the ipv4 acl blockOr Gerlitz
Add ttl field to the ipv4 acl block. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17mlxsw: acl: Add ip ttl acl elementOr Gerlitz
Define new element for ip ttl and place it into scratch area. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17inetpeer: remove AVL implementation in favor of RB treeEric Dumazet
As discussed in Faro during Netfilter Workshop 2017, RB trees can be used with RCU, using a seqlock. Note that net/rxrpc/conn_service.c is already using this. This patch converts inetpeer from AVL tree to RB tree, since it allows to remove private AVL implementation in favor of shared RB code. $ size net/ipv4/inetpeer.before net/ipv4/inetpeer.after text data bss dec hex filename 3195 40 128 3363 d23 net/ipv4/inetpeer.before 1562 24 0 1586 632 net/ipv4/inetpeer.after The same technique can be used to speed up net/netfilter/nft_set_rbtree.c (removing rwlock contention in fast path) Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17net/unix: drop obsolete fd-recursion limitsDavid Herrmann
All unix sockets now account inflight FDs to the respective sender. This was introduced in: commit 712f4aad406bb1ed67f3f98d04c044191f0ff593 Author: willy tarreau <w@1wt.eu> Date: Sun Jan 10 07:54:56 2016 +0100 unix: properly account for FDs passed over unix sockets and further refined in: commit 415e3d3e90ce9e18727e8843ae343eda5a58fad6 Author: Hannes Frederic Sowa <hannes@stressinduktion.org> Date: Wed Feb 3 02:11:03 2016 +0100 unix: correctly track in-flight fds in sending process user_struct Hence, regardless of the stacking depth of FDs, the total number of inflight FDs is limited, and accounted. There is no known way for a local user to exceed those limits or exploit the accounting. Furthermore, the GC logic is independent of the recursion/stacking depth as well. It solely depends on the total number of inflight FDs, regardless of their layout. Lastly, the current `recursion_level' suffers a TOCTOU race, since it checks and inherits depths only at queue time. If we consider `A<-B' to mean `queue-B-on-A', the following sequence circumvents the recursion level easily: A<-B B<-C C<-D ... Y<-Z resulting in: A<-B<-C<-...<-Z With all of this in mind, lets drop the recursion limit. It has no additional security value, anymore. On the contrary, it randomly confuses message brokers that try to forward file-descriptors, since any sendmsg(2) call can fail spuriously with ETOOMANYREFS if a client maliciously modifies the FD while inflight. Cc: Alban Crequy <alban.crequy@collabora.co.uk> Cc: Simon McVittie <simon.mcvittie@collabora.co.uk> Signed-off-by: David Herrmann <dh.herrmann@gmail.com> Reviewed-by: Tom Gundersen <teg@jklm.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17skbuff: optimize the pull_pages code in __pskb_pull_tail()linzhang
In the pull_pages code block, if the first frag size > eat, we can end the loop in advance to avoid extra copy. Signed-off-by: Lin Zhang <xiaolou4617@gmail.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17dt-bindings: net: ravb : Add support for r8a7743 SoCBiju Das
Add a new compatible string for the RZ/G1M (R8A7743) SoC. Signed-off-by: Biju Das <biju.das@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17net: axienet: add support for standard phy-mode bindingAlvaro G. M
Keep supporting proprietary "xlnx,phy-type" attribute and add support for MII connectivity to the PHY. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Alvaro Gamez Machado <alvaro.gamez@hazent.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-17IB/core: Don't resolve IP address to the loopback deviceMoni Shoua
When resolving an IP address that is on the host of the caller the result from querying the routing table is the loopback device. This is not a valid response, because it doesn't represent the RDMA device and the port. Therefore, callers need to check the resolved device and if it is a loopback device find an alternative way to resolve it. To avoid this we make sure that the response from rdma_resolve_ip() will not be the loopback device. While that, we fix an static checker warning about dereferencing an unintitialized pointer using the same solution as in commit abeffce90c7f ("net/mlx5e: Fix a -Wmaybe-uninitialized warning") as a reference. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17IB/core: Namespace is mandatory input for address resolutionMoni Shoua
In function addr_resolve() the namespace is a required input parameter and not an output. It is passed later for searching the routing table and device addresses. Also, it shouldn't be copied back to the caller. Fixes: 565edd1d5555 ('IB/addr: Pass network namespace as a parameter') Cc: <stable@vger.kernel.org> # v4.3+ Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17IB/iser: Fix connection teardown race conditionVladimir Neyelov
Under heavy iser target(scst) start/stop stress during login/logout on iser intitiator side happened trace call provided below. The function iscsi_iser_slave_alloc iser_conn pointer could be NULL, due to the fact that function iscsi_iser_conn_stop can be called before and free iser connection. Let's protect that flow by introducing global mutex. BUG: unable to handle kernel paging request at 0000000000001018 IP: [<ffffffffc0426f7e>] iscsi_iser_slave_alloc+0x1e/0x50 [ib_iser] Call Trace: ? scsi_alloc_sdev+0x242/0x300 scsi_probe_and_add_lun+0x9e1/0xea0 ? kfree_const+0x21/0x30 ? kobject_set_name_vargs+0x76/0x90 ? __pm_runtime_resume+0x5b/0x70 __scsi_scan_target+0xf6/0x250 scsi_scan_target+0xea/0x100 iscsi_user_scan_session.part.13+0x101/0x130 [scsi_transport_iscsi] ? iscsi_user_scan_session.part.13+0x130/0x130 [scsi_transport_iscsi] iscsi_user_scan_session+0x1e/0x30 [scsi_transport_iscsi] device_for_each_child+0x50/0x90 iscsi_user_scan+0x44/0x60 [scsi_transport_iscsi] store_scan+0xa8/0x100 ? common_file_perm+0x5d/0x1c0 dev_attr_store+0x18/0x30 sysfs_kf_write+0x37/0x40 kernfs_fop_write+0x12c/0x1c0 __vfs_write+0x18/0x40 vfs_write+0xb5/0x1a0 SyS_write+0x55/0xc0 Fixes: 318d311e8f01 ("iser: Accept arbitrary sg lists mapping if the device supports it") Cc: <stable@vger.kernel.org> # v4.5+ Signed-off-by: Vladimir Neyelov <vladimirn@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Sagi Grimberg <sagi@grimbeg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17RDMA/core: Document confusing codeGustavo A. R. Silva
While looking into Coverity ID 1351047 I ran into the following piece of code at drivers/infiniband/core/verbs.c:496: ret = rdma_addr_find_l2_eth_by_grh(&dgid, &sgid,                                    ah_attr->dmac,                                    wc->wc_flags & IB_WC_WITH_VLAN ?                                    NULL : &vlan_id,                                    &if_index, &hoplimit); The issue here is that the position of arguments in the call to rdma_addr_find_l2_eth_by_grh() function do not match the order of the parameters: &dgid is passed to sgid &sgid is passed to dgid This is the function prototype: int rdma_addr_find_l2_eth_by_grh(const union ib_gid *sgid,  const union ib_gid *dgid,  u8 *dmac, u16 *vlan_id, int *if_index,  int *hoplimit) My question here is if this is intentional? Answer: Yes. ib_init_ah_from_wc() creates ah from the incoming packet. Incoming packet has dgid of the receiver node on which this code is getting executed and sgid contains the GID of the sender. When resolving mac address of destination, you use arrived dgid as sgid and use sgid as dgid because sgid contains destinations GID whom to respond to. Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17mlx5: Avoid that mlx5_ib_sg_to_klms() overflows the klms[] arrayBart Van Assche
ib_map_mr_sg() can pass an SG-list to .map_mr_sg() that is larger than what fits into a single MR. .map_mr_sg() must not attempt to map more SG-list elements than what fits into a single MR. Hence make sure that mlx5_ib_sg_to_klms() does not write outside the MR klms[] array. Fixes: b005d3164713 ("mlx5: Add arbitrary sg list support") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Leon Romanovsky <leonro@mellanox.com> Cc: Israel Rukshin <israelr@mellanox.com> Cc: <stable@vger.kernel.org> Acked-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17IB/hfi1: Ensure dd->gi_mask can not be overflowedDennis Dalessandro
As the code stands today the array access in remap_intr() is OK. To future proof the code though we should explicitly check to ensure the index value is not outside of the valid range. This is not a straight forward calculation so err on the side of caution. Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17Merge tag 'v4.13-rc1' into k.o/for-4.13-rcDoug Ledford
Linux v4.13-rc1
2017-07-17netfilter: expect: fix crash when putting uninited expectationFlorian Westphal
We crash in __nf_ct_expect_check, it calls nf_ct_remove_expect on the uninitialised expectation instead of existing one, so del_timer chokes on random memory address. Fixes: ec0e3f01114ad32711243 ("netfilter: nf_ct_expect: Add nf_ct_remove_expect()") Reported-by: Sergey Kvachonok <ravenexp@gmail.com> Tested-by: Sergey Kvachonok <ravenexp@gmail.com> Cc: Gao Feng <fgao@ikuai8.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-17netfilter: nf_tables: only allow in/output for arp packetsFlorian Westphal
arp packets cannot be forwarded. They can be bridged, but then they can be filtered using either ebtables or nftables bridge family. The bridge netfilter exposes a "call-arptables" switch which pushes packets into arptables, but lets not expose this for nftables, so better close this asap. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-17netfilter: nat: fix src map lookupFlorian Westphal
When doing initial conversion to rhashtable I replaced the bucket walk with a single rhashtable_lookup_fast(). When moving to rhlist I failed to properly walk the list of identical tuples, but that is what is needed for this to work correctly. The table contains the original tuples, so the reply tuples are all distinct. We currently decide that mapping is (not) in range only based on the first entry, but in case its not we need to try the reply tuple of the next entry until we either find an in-range mapping or we checked all the entries. This bug makes nat core attempt collision resolution while it might be able to use the mapping as-is. Fixes: 870190a9ec90 ("netfilter: nat: convert nat bysrc hash to rhashtable") Reported-by: Jaco Kroon <jaco@uls.co.za> Tested-by: Jaco Kroon <jaco@uls.co.za> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-17netfilter: remove old pre-netns era hook apiFlorian Westphal
no more users in the tree, remove this. The old api is racy wrt. module removal, all users have been converted to the netns-aware api. The old api pretended we still have global hooks but that has not been true for a long time. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-17regmap: regmap-w1: Fix build troublesminimumlaw@rambler.ru
Fixes: cc5d0db390b0 ("regmap: Add 1-Wire bus support") Commit de0d6dbdbdb2 ("w1: Add subsystem kernel public interface") Fix place off w1.h header file Cosmetic: Fix company name (local to international) Signed-off-by: Alex A. Mihaylov <minimumlaw@rambler.ru> Signed-off-by: Mark Brown <broonie@kernel.org>
2017-07-17libceph: potential NULL dereference in ceph_msg_data_create()Dan Carpenter
If kmem_cache_zalloc() returns NULL then the INIT_LIST_HEAD(&data->links); will Oops. The callers aren't really prepared for NULL returns so it doesn't make a lot of difference in real life. Fixes: 5240d9f95dfe ("libceph: replace message data pointer with list") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-17ceph: fix race in concurrent readdirYan, Zheng
For a large directory, program needs to issue multiple readdir syscalls to get all dentries. When there are multiple programs read the directory concurrently. Following sequence of events can happen. - program calls readdir with pos = 2. ceph sends readdir request to mds. The reply contains N1 entries. ceph adds these N1 entries to readdir cache. - program calls readdir with pos = N1+2. The readdir is satisfied by the readdir cache, N2 entries are returned. (Other program calls readdir in the middle, which fills the cache) - program calls readdir with pos = N1+N2+2. ceph sends readdir request to mds. The reply contains N3 entries and it reaches directory end. ceph adds these N3 entries to the readdir cache and marks directory complete. The second readdir call does not update fi->readdir_cache_idx. ceph add the last N3 entries to wrong places. Cc: stable@vger.kernel.org # v4.3+ Signed-off-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-17libceph: don't call encode_request_finish() on MOSDBackoff messagesIlya Dryomov
encode_request_finish() is for MOSDOp messages. Calling it on MOSDBackoff ack-block messages corrupts them. Fixes: a02a946dfe96 ("libceph: respect RADOS_BACKOFF backoffs") Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2017-07-17libceph: use alloc_pg_mapping() in __decode_pg_upmap_items()Ilya Dryomov
... otherwise we die in insert_pg_mapping(), which wants pg->node to be empty, i.e. initialized with RB_CLEAR_NODE. Fixes: 6f428df47dae ("libceph: pg_upmap[_items] infrastructure") Signed-off-by: Ilya Dryomov <idryomov@gmail.com>