summaryrefslogtreecommitdiff
path: root/include/net
AgeCommit message (Collapse)Author
2015-09-21s390/iucv: do not use arrays as argumentUrsula Braun
The iucv code uses arrays as arguments. Even though this does not really cause a problem, it could be misleading, since the compiler turns array arguments into just a pointer argument. To be more precise this patch changes the array arguments into pointers. Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-21Merge branch 'for-upstream' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Johan Hedberg says: ==================== pull request: bluetooth-next 2015-09-18 Here's the first bluetooth-next pull request for the 4.4 kernel: - ieee802154 cleanups & fixes - debugfs support for the at86rf230 driver - Support for quirky (seemingly counterfeit) CSR Bluetooth controllers - Power management and device config improvements for Intel controllers - Fix for devices with incorrect advertising data length - Fix for closing HCI user channel socket Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-20ip6tunnel: make rx/tx bytes counters consistentNicolas Dichtel
Like the previous patch, which fixes ipv4 tunnels, here is the ipv6 part. Before the patch, the external ipv6 header + gre header were included on tx. After the patch: $ ping -c1 192.168.6.121 ; ip -s l ls dev ip6gre1 PING 192.168.6.121 (192.168.6.121) 56(84) bytes of data. 64 bytes from 192.168.6.121: icmp_req=1 ttl=64 time=1.92 ms --- 192.168.6.121 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 1.923/1.923/1.923/0.000 ms 7: ip6gre1@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1440 qdisc noqueue state UNKNOWN mode DEFAULT group default link/gre6 20:01:06:60:30:08:c1:c3:00:00:00:00:00:00:01:23 peer 20:01:06:60:30:08:c1:c3:00:00:00:00:00:00:01:21 RX: bytes packets errors dropped overrun mcast 84 1 0 0 0 0 TX: bytes packets errors dropped carrier collsns 84 1 0 0 0 0 $ ping -c1 192.168.1.121 ; ip -s l ls dev ip6tnl1 PING 192.168.1.121 (192.168.1.121) 56(84) bytes of data. 64 bytes from 192.168.1.121: icmp_req=1 ttl=64 time=2.28 ms --- 192.168.1.121 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 2.288/2.288/2.288/0.000 ms 8: ip6tnl1@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1452 qdisc noqueue state UNKNOWN mode DEFAULT group default link/tunnel6 2001:660:3008:c1c3::123 peer 2001:660:3008:c1c3::121 RX: bytes packets errors dropped overrun mcast 84 1 0 0 0 0 TX: bytes packets errors dropped carrier collsns 84 1 0 0 0 0 Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-20net: Fix behaviour of unreachable, blackhole and prohibit routesNikola Forró
Man page of ip-route(8) says following about route types: unreachable - these destinations are unreachable. Packets are dis‐ carded and the ICMP message host unreachable is generated. The local senders get an EHOSTUNREACH error. blackhole - these destinations are unreachable. Packets are dis‐ carded silently. The local senders get an EINVAL error. prohibit - these destinations are unreachable. Packets are discarded and the ICMP message communication administratively prohibited is generated. The local senders get an EACCES error. In the inet6 address family, this was correct, except the local senders got ENETUNREACH error instead of EHOSTUNREACH in case of unreachable route. In the inet address family, all three route types generated ICMP message net unreachable, and the local senders got ENETUNREACH error. In both address families all three route types now behave consistently with documentation. Signed-off-by: Nikola Forró <nforro@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-18netfilter: Pass net into nf_xfrm_me_harderEric W. Biederman
Instead of calling dev_net on a likley looking network device pass state->net into nf_xfrm_me_harder. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-18netfilter: Pass priv instead of nf_hook_ops to netfilter hooksEric W. Biederman
Only pass the void *priv parameter out of the nf_hook_ops. That is all any of the functions are interested now, and by limiting what is passed it becomes simpler to change implementation details. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-18netfilter: nf_conntrack: Add a struct net parameter to l4_pkt_to_tupleEric W. Biederman
As gre does not have the srckey in the packet gre_pkt_to_tuple needs to perform a lookup in it's per network namespace tables. Pass in the proper network namespace to all pkt_to_tuple implementations to ensure gre (and any similar protocols) can get this right. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-18act_connmark: Remember the struct net instead of guessing it.Eric W. Biederman
Stop guessing the struct net instead of remember it. Guessing is just silly and will be problematic in the future when I implement routes between network namespaces. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-18netfilter: Pass net to nf_dup_ipv4 and nf_dup_ipv6Eric W. Biederman
This allows them to stop guessing the network namespace with pick_net. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-18netfilter: nf_tables: Pass struct net in nft_pktinfoEric W. Biederman
nft_pktinfo is passed on the stack so this does not bloat any in core data structures. By centrally computing this information this makes maintence of the code simpler, and understading of the code easier. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-18netfilter: x_tables: Pass struct net in xt_action_paramEric W. Biederman
As xt_action_param lives on the stack this does not bloat any persistent data structures. This is a first step in making netfilter code that needs to know which network namespace it is executing in simpler. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-18netfilter: nf_tables: kill nft_pktinfo.opsEric W. Biederman
- Add nft_pktinfo.pf to replace ops->pf - Add nft_pktinfo.hook to replace ops->hooknum This simplifies the code, makes it more readable, and likely reduces cache line misses. Maintainability is enhanced as the details of nft_hook_ops are of no concern to the recpients of nft_pktinfo. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-18Merge tag 'ipvs-for-v4.4' of ↵Pablo Neira Ayuso
https://git.kernel.org/pub/scm/linux/kernel/git/horms/ipvs-next Simon Horman says: ==================== IPVS Updates for v4.4 please consider these IPVS Updates for v4.4. The updates include the following from Alex Gartrell: * Scheduling of ICMP * Sysctl to ignore tunneled packets; and hence some packet-looping scenarios ==================== Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-09-18Bluetooth: Add BT_ERR_RATELIMITEDSzymon Janc
This patch adds ratelimited version of the BT_ERR macro. Signed-off-by: Szymon Janc <ext.szymon.janc@tieto.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-09-17bpf: add bpf_redirect() helperAlexei Starovoitov
Existing bpf_clone_redirect() helper clones skb before redirecting it to RX or TX of destination netdev. Introduce bpf_redirect() helper that does that without cloning. Benchmarked with two hosts using 10G ixgbe NICs. One host is doing line rate pktgen. Another host is configured as: $ tc qdisc add dev $dev ingress $ tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \ action bpf run object-file tcbpf1_kern.o section clone_redirect_xmit drop so it receives the packet on $dev and immediately xmits it on $dev + 1 The section 'clone_redirect_xmit' in tcbpf1_kern.o file has the program that does bpf_clone_redirect() and performance is 2.0 Mpps $ tc filter add dev $dev root pref 10 u32 match u32 0 0 flowid 1:2 \ action bpf run object-file tcbpf1_kern.o section redirect_xmit drop which is using bpf_redirect() - 2.4 Mpps and using cls_bpf with integrated actions as: $ tc filter add dev $dev root pref 10 \ bpf run object-file tcbpf1_kern.o section redirect_xmit integ_act classid 1 performance is 2.5 Mpps To summarize: u32+act_bpf using clone_redirect - 2.0 Mpps u32+act_bpf using redirect - 2.4 Mpps cls_bpf using redirect - 2.5 Mpps For comparison linux bridge in this setup is doing 2.1 Mpps and ixgbe rx + drop in ip_rcv - 7.8 Mpps Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17cls_bpf: introduce integrated actionsDaniel Borkmann
Often cls_bpf classifier is used with single action drop attached. Optimize this use case and let cls_bpf return both classid and action. For backwards compatibility reasons enable this feature under TCA_BPF_FLAG_ACT_DIRECT flag. Then more interesting programs like the following are easier to write: int cls_bpf_prog(struct __sk_buff *skb) { /* classify arp, ip, ipv6 into different traffic classes * and drop all other packets */ switch (skb->protocol) { case htons(ETH_P_ARP): skb->tc_classid = 1; break; case htons(ETH_P_IP): skb->tc_classid = 2; break; case htons(ETH_P_IPV6): skb->tc_classid = 3; break; default: return TC_ACT_SHOT; } return TC_ACT_OK; } Joint work with Daniel Borkmann. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17tcp: provide skb->hash to synack packetsEric Dumazet
In commit b73c3d0e4f0e ("net: Save TX flow hash in sock and set in skbuf on xmit"), Tom provided a l4 hash to most outgoing TCP packets. We'd like to provide one as well for SYNACK packets, so that all packets of a given flow share same txhash, to later enable bonding driver to also use skb->hash to perform slave selection. Note that a SYNACK retransmit shuffles the tx hash, as Tom did in commit 265f94ff54d62 ("net: Recompute sk_txhash on negative routing advice") for established sockets. This has nice effect making TCP flows resilient to some kind of black holes, even at connection establish phase. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Tom Herbert <tom@herbertland.com> Cc: Mahesh Bandewar <maheshb@google.com> Acked-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17netfilter: Pass net into okfnEric W. Biederman
This is immediately motivated by the bridge code that chains functions that call into netfilter. Without passing net into the okfns the bridge code would need to guess about the best expression for the network namespace to process packets in. As net is frequently one of the first things computed in continuation functions after netfilter has done it's job passing in the desired network namespace is in many cases a code simplification. To support this change the function dst_output_okfn is introduced to simplify passing dst_output as an okfn. For the moment dst_output_okfn just silently drops the struct net. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17net: Merge dst_output and dst_output_skEric W. Biederman
Add a sock paramter to dst_output making dst_output_sk superfluous. Add a skb->sk parameter to all of the callers of dst_output Have the callers of dst_output_sk call dst_output. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17xfrm: Remove unused afinfo method init_dstEric W. Biederman
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17net: Fix vti use case with oif in dst lookupsDavid Ahern
Steffen reported that the recent change to add oif to dst lookups breaks the VTI use case. The problem is that with the oif set in the flow struct the comparison to the nh_oif is triggered. Fix by splitting the FLOWI_FLAG_VRFSRC into 2 flags -- one that triggers the vrf device cache bypass (FLOWI_FLAG_VRFSRC) and another telling the lookup to not compare nh oif (FLOWI_FLAG_SKIP_NH_OIF). Fixes: 42a7b32b73d6 ("xfrm: Add oif to dst lookups") Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Acked-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17ipv6: include NLM_F_REPLACE in route replace notificationsRoopa Prabhu
This patch adds NLM_F_REPLACE flag to ipv6 route replace notifications. This makes nlm_flags in ipv6 replace notifications consistent with ipv4. Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Reviewed-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-17ieee802154: af_ieee802154: fix typo in comment.Stefan Schmidt
Signed-off-by: Stefan Schmidt <stefan@osg.samsung.com> Acked-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-09-17ieee802154: 6lowpan: remove tx full-size calc workaroundAlexander Aring
This patch removes a workaround for datagram_size calculation while doing fragmentation on transmit. Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-09-17ieee802154: 6lowpan: check on valid 802.15.4 frameAlexander Aring
This patch adds frame control checks to check if the received frame is something which could contain a 6LoWPAN packet. Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-09-17ieee820154: 6lowpan: dispatch evaluation reworkAlexander Aring
This patch complete reworks the evaluation of 6lowpan dispatch value by introducing a receive handler mechanism for each dispatch value. A list of changes: - Doing uncompression on-the-fly when FRAG1 is received, this require some special handling for 802.15.4 lltype in generic 6lowpan branch for setting the payload length correct. - Fix dispatch mask for fragmentation. - Add IPv6 dispatch evaluation for FRAG1. - Add skb_unshare for dispatch which might manipulate the skb data buffer. Cc: Jukka Rissanen <jukka.rissanen@linux.intel.com> Reviewed-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-09-17Bluetooth: close HCI device when user channel socket gets closedSimon Fels
With 9380f9eacfbbee701daa416edd6625efcd3e29e1 the order of unsetting the HCI_USER_CHANNEL flag of the HCI device was reverted to ensure the device is first closed before making it available again. Due to hci_dev_close checking for HCI_USER_CHANNEL being set on the device it was never really closed and was kept opened. We're now calling hci_dev_do_close directly to make sure the device is correctly closed and we keep the correct order to unset the flag on our device object. Signed-off-by: Simon Fels <simon.fels@canonical.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-09-17Bluetooth: Add bt_dev logging macrosLoic Poulain
Add specific bluetooth device logging macros since hci device name is repeatedly referred in bluetooth subsystem logs. Signed-off-by: Loic Poulain <loic.poulain@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-09-17ipvs: add sysctl to ignore tunneled packetsAlex Gartrell
This is a way to avoid nasty routing loops when multiple ipvs instances can forward to eachother. Signed-off-by: Alex Gartrell <agartrell@fb.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2015-09-15rtnetlink: RTEXT_FILTER_SKIP_STATS support to avoid dumping inet/inet6 statsSowmini Varadhan
Many commonly used functions like getifaddrs() invoke RTM_GETLINK to dump the interface information, and do not need the the AF_INET6 statististics that are always returned by default from rtnl_fill_ifinfo(). Computing the statistics can be an expensive operation that impacts scaling, so it is desirable to avoid this if the information is not needed. This patch adds a the RTEXT_FILTER_SKIP_STATS extended info flag that can be passed with netlink_request() to avoid statistics computation for the ifinfo path. Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15ipv6: Replace spinlock with seqlock and rcu in ip6_tunnelMartin KaFai Lau
This patch uses a seqlock to ensure consistency between idst->dst and idst->cookie. It also makes dst freeing from fib tree to undergo a rcu grace period. Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15ipv6: Fix dst_entry refcnt bugs in ip6_tunnelMartin KaFai Lau
Problems in the current dst_entry cache in the ip6_tunnel: 1. ip6_tnl_dst_set is racy. There is no lock to protect it: - One major problem is that the dst refcnt gets messed up. F.e. the same dst_cache can be released multiple times and then triggering the infamous dst refcnt < 0 warning message. - Another issue is the inconsistency between dst_cache and dst_cookie. It can be reproduced by adding and removing the ip6gre tunnel while running a super_netperf TCP_CRR test. 2. ip6_tnl_dst_get does not take the dst refcnt before returning the dst. This patch: 1. Create a percpu dst_entry cache in ip6_tnl 2. Use a spinlock to protect the dst_cache operations 3. ip6_tnl_dst_get always takes the dst refcnt before returning Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15ipv6: Rename the dst_cache helper functions in ip6_tunnelMartin KaFai Lau
It is a prep work to fix the dst_entry refcnt bugs in ip6_tunnel. This patch rename: 1. ip6_tnl_dst_check() to ip6_tnl_dst_get() to better reflect that it will take a dst refcnt in the next patch. 2. ip6_tnl_dst_store() to ip6_tnl_dst_set() to have a more conventional name matching with ip6_tnl_dst_get(). Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-15net: Add FIB table id to rtableDavid Ahern
Add the FIB table id to rtable to make the information available for IPv4 as it is for IPv6. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Fix out-of-bounds array access in netfilter ipset, from Jozsef Kadlecsik. 2) Use correct free operation on netfilter conntrack templates, from Daniel Borkmann. 3) Fix route leak in SCTP, from Marcelo Ricardo Leitner. 4) Fix sizeof(pointer) in mac80211, from Thierry Reding. 5) Fix cache pointer comparison in ip6mr leading to missed unlock of mrt_lock. From Richard Laing. 6) rds_conn_lookup() needs to consider network namespace in key comparison, from Sowmini Varadhan. 7) Fix deadlock in TIPC code wrt broadcast link wakeups, from Kolmakov Dmitriy. 8) Fix fd leaks in bpf syscall, from Daniel Borkmann. 9) Fix error recovery when installing ipv6 multipath routes, we would delete the old route before we would know if we could fully commit to the new set of nexthops. Fix from Roopa Prabhu. 10) Fix run-time suspend problems in r8152, from Hayes Wang. 11) In fec, don't program the MAC address into the chip when the clocks are gated off. From Fugang Duan. 12) Fix poll behavior for netlink sockets when using rx ring mmap, from Daniel Borkmann. 13) Don't allocate memory with GFP_KERNEL from get_stats64 in r8169 driver, from Corinna Vinschen. 14) In TCP Cubic congestion control, handle idle periods better where we are application limited, in order to keep cwnd from growing out of control. From Eric Dumzet. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (65 commits) tcp_cubic: better follow cubic curve after idle period tcp: generate CA_EVENT_TX_START on data frames xen-netfront: respect user provided max_queues xen-netback: respect user provided max_queues r8169: Fix sleeping function called during get_stats64, v2 ether: add IEEE 1722 ethertype - TSN netlink, mmap: fix edge-case leakages in nf queue zero-copy netlink, mmap: don't walk rx ring on poll if receive queue non-empty cxgb4: changes for new firmware 1.14.4.0 net: fec: add netif status check before set mac address r8152: fix the runtime suspend issues r8152: split DRIVER_VERSION ipv6: fix ifnullfree.cocci warnings add microchip LAN88xx phy driver stmmac: fix check for phydev being open net: qlcnic: delete redundant memsets net: mv643xx_eth: use kzalloc net: jme: use kzalloc() instead of kmalloc+memset net: cavium: liquidio: use kzalloc in setup_glist() net: ipv6: use common fib_default_rule_pref ...
2015-09-09net: ipv6: use common fib_default_rule_prefPhil Sutter
This switches IPv6 policy routing to use the shared fib_default_rule_pref() function of IPv4 and DECnet. It is also used in multicast routing for IPv4 as well as IPv6. The motivation for this patch is a complaint about iproute2 behaving inconsistent between IPv4 and IPv6 when adding policy rules: Formerly, IPv6 rules were assigned a fixed priority of 0x3FFF whereas for IPv4 the assigned priority value was decreased with each rule added. Since then all users of the default_pref field have been converted to assign the generic function fib_default_rule_pref(), fib_nl_newrule() may just use it directly instead. Therefore get rid of the function pointer altogether and make fib_default_rule_pref() static, as it's not used outside fib_rules.c anymore. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-09Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull inifiniband/rdma updates from Doug Ledford: "This is a fairly sizeable set of changes. I've put them through a decent amount of testing prior to sending the pull request due to that. There are still a few fixups that I know are coming, but I wanted to go ahead and get the big, sizable chunk into your hands sooner rather than waiting for those last few fixups. Of note is the fact that this creates what is intended to be a temporary area in the drivers/staging tree specifically for some cleanups and additions that are coming for the RDMA stack. We deprecated two drivers (ipath and amso1100) and are waiting to hear back if we can deprecate another one (ehca). We also put Intel's new hfi1 driver into this area because it needs to be refactored and a transfer library created out of the factored out code, and then it and the qib driver and the soft-roce driver should all be modified to use that library. I expect drivers/staging/rdma to be around for three or four kernel releases and then to go away as all of the work is completed and final deletions of deprecated drivers are done. Summary of changes for 4.3: - Create drivers/staging/rdma - Move amso1100 driver to staging/rdma and schedule for deletion - Move ipath driver to staging/rdma and schedule for deletion - Add hfi1 driver to staging/rdma and set TODO for move to regular tree - Initial support for namespaces to be used on RDMA devices - Add RoCE GID table handling to the RDMA core caching code - Infrastructure to support handling of devices with differing read and write scatter gather capabilities - Various iSER updates - Kill off unsafe usage of global mr registrations - Update SRP driver - Misc mlx4 driver updates - Support for the mr_alloc verb - Support for a netlink interface between kernel and user space cache daemon to speed path record queries and route resolution - Ininitial support for safe hot removal of verbs devices" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (136 commits) IB/ipoib: Suppress warning for send only join failures IB/ipoib: Clean up send-only multicast joins IB/srp: Fix possible protection fault IB/core: Move SM class defines from ib_mad.h to ib_smi.h IB/core: Remove unnecessary defines from ib_mad.h IB/hfi1: Add PSM2 user space header to header_install IB/hfi1: Add CSRs for CONFIG_SDMA_VERBOSITY mlx5: Fix incorrect wc pkey_index assignment for GSI messages IB/mlx5: avoid destroying a NULL mr in reg_user_mr error flow IB/uverbs: reject invalid or unknown opcodes IB/cxgb4: Fix if statement in pick_local_ip6adddrs IB/sa: Fix rdma netlink message flags IB/ucma: HW Device hot-removal support IB/mlx4_ib: Disassociate support IB/uverbs: Enable device removal when there are active user space applications IB/uverbs: Explicitly pass ib_dev to uverbs commands IB/uverbs: Fix race between ib_uverbs_open and remove_one IB/uverbs: Fix reference counting usage of event files IB/core: Make ib_dealloc_pd return void IB/srp: Create an insecure all physical rkey only if needed ...
2015-09-08memcg: move memcg_proto_active from sock.hMichal Hocko
The only user is sock_update_memcg which is living in memcontrol.c so it doesn't make much sense to pollute sock.h by this inline helper. Move it to memcontrol.c and open code it into its only caller. Signed-off-by: Michal Hocko <mhocko@suse.com> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-08memcg: export struct mem_cgroupMichal Hocko
mem_cgroup structure is defined in mm/memcontrol.c currently which means that the code outside of this file has to use external API even for trivial access stuff. This patch exports mm_struct with its dependencies and makes some of the exported functions inlines. This even helps to reduce the code size a bit (make defconfig + CONFIG_MEMCG=y) text data bss dec hex filename 12355346 1823792 1089536 15268674 e8fb42 vmlinux.before 12354970 1823792 1089536 15268298 e8f9ca vmlinux.after This is not much (370B) but better than nothing. We also save a function call in some hot paths like callers of mem_cgroup_count_vm_event which is used for accounting. The patch doesn't introduce any functional changes. [vdavykov@parallels.com: inline memcg_kmem_is_active] [vdavykov@parallels.com: do not expose type outside of CONFIG_MEMCG] [akpm@linux-foundation.org: memcontrol.h needs eventfd.h for eventfd_ctx] [akpm@linux-foundation.org: export mem_cgroup_from_task() to modules] Signed-off-by: Michal Hocko <mhocko@suse.cz> Reviewed-by: Vladimir Davydov <vdavydov@parallels.com> Suggested-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-09-06Merge tag 'mac80211-for-davem-2015-09-04' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== For the first round of fixes, we have this: * fix for the sizeof() pointer type issue * a fix for regulatory getting into a restore loop * a fix for rfkill global 'all' state, it needs to be stored everywhere to apply correctly to new rfkill instances * properly refuse CQM RSSI when it cannot actually be used * protect HT TDLS traffic properly in non-HT networks * don't incorrectly advertise 80 MHz support when not allowed ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Conflicts: include/net/netfilter/nf_conntrack.h The conflict was an overlap between changing the type of the zone argument to nf_ct_tmpl_alloc() whilst exporting nf_ct_tmpl_free. Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net, they are: 1) Oneliner to restore maps in nf_tables since we support addressing registers at 32 bits level. 2) Restore previous default behaviour in bridge netfilter when CONFIG_IPV6=n, oneliner from Bernhard Thaler. 3) Out of bound access in ipset hash:net* set types, reported by Dave Jones' KASan utility, patch from Jozsef Kadlecsik. 4) Fix ipset compilation with gcc 4.4.7 related to C99 initialization of unnamed unions, patch from Elad Raz. 5) Add a workaround to address inconsistent endianess in the res_id field of nfnetlink batch messages, reported by Florian Westphal. 6) Fix error paths of CT/synproxy since the conntrack template was moved to use kmalloc, patch from Daniel Borkmann. All of them look good to me to reach 4.2, I can route this to -stable myself too, just let me know what you prefer. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-04mac80211: protect non-HT BSS when HT TDLS traffic existsAvri Altman
HT TDLS traffic should be protected in a non-HT BSS to avoid collisions. Therefore, when TDLS peers join/leave, check if protection is (now) needed and set the ht_operation_mode of the virtual interface according to the HT capabilities of the TDLS peer(s). This works because a non-HT BSS connection never sets (or otherwise uses) the ht_operation_mode; it just means that drivers must be aware that this field applies to all HT traffic for this virtual interface, not just the traffic within the BSS. Document that. Signed-off-by: Avri Altman <avri.altman@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2015-09-02netfilter: nf_conntrack: make nf_ct_zone_dflt built-inDaniel Borkmann
Fengguang reported, that some randconfig generated the following linker issue with nf_ct_zone_dflt object involved: [...] CC init/version.o LD init/built-in.o net/built-in.o: In function `ipv4_conntrack_defrag': nf_defrag_ipv4.c:(.text+0x93e95): undefined reference to `nf_ct_zone_dflt' net/built-in.o: In function `ipv6_defrag': nf_defrag_ipv6_hooks.c:(.text+0xe3ffe): undefined reference to `nf_ct_zone_dflt' make: *** [vmlinux] Error 1 Given that configurations exist where we have a built-in part, which is accessing nf_ct_zone_dflt such as the two handlers nf_ct_defrag_user() and nf_ct6_defrag_user(), and a part that configures nf_conntrack as a module, we must move nf_ct_zone_dflt into a fixed, guaranteed built-in area when netfilter is configured in general. Therefore, split the more generic parts into a common header under include/linux/netfilter/ and move nf_ct_zone_dflt into the built-in section that already holds parts related to CONFIG_NF_CONNTRACK in the netfilter core. This fixes the issue on my side. Fixes: 308ac9143ee2 ("netfilter: nf_conntrack: push zone object into functions") Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-01flow_dissector: Use 'const' where possible.David S. Miller
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-01flow_dissector: Don't use bit fields.David S. Miller
Just have a flags member instead. In file included from include/linux/linkage.h:4:0, from include/linux/kernel.h:6, from net/core/flow_dissector.c:1: In function 'flow_keys_hash_start', inlined from 'flow_hash_from_keys' at net/core/flow_dissector.c:553:34: >> include/linux/compiler.h:447:38: error: call to '__compiletime_assert_459' declared with attribute error: BUILD_BUG_ON failed: FLOW_KEYS_HASH_OFFSET % sizeof(u32) Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-01flow_dissector: Add control/reporting of encapsulationTom Herbert
Add an input flag to flow dissector on rather dissection should stop when encapsulation is detected (IP/IP or GRE). Also, add a key_control flag that indicates encapsulation was encountered during the dissection. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-01flow_dissector: Add flag to stop parsing when an IPv6 flow label is seenTom Herbert
Add an input flag to flow dissector on rather dissection should be stopped when a flow label is encountered. Presumably, the flow label is derived from a sufficient hash of an inner transport packet so further dissection is not needed (that is ports are not included in the flow hash). Using the flow label instead of ports has the additional benefit that packet fragments should hash to same value as non-fragments for a flow (assuming that the same flow label is used). We set this flag by default in for skb_get_hash. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-01flow_dissector: Add flag to stop parsing at L3Tom Herbert
Add an input flag to flow dissector on rather dissection should be stopped when an L3 packet is encountered. This would be useful if a caller just wanted to get IP addresses of the outermost header (e.g. to do an L3 hash). Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-01flow_dissector: Add control/reporting of fragmentationTom Herbert
Add an input flag to flow dissector on rather dissection should be attempted on a first fragment. Also add key_control flags to indicate that a packet is a fragment or first fragment. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-09-01flowi: Abstract out functions to get flow hash based on flowiTom Herbert
Create __get_hash_from_flowi6 and __get_hash_from_flowi4 to get the flow keys and hash based on flowi structures. These are called by __skb_get_hash_flowi6 and __skb_get_hash_flowi4. Also, created get_hash_from_flowi6 and get_hash_from_flowi4 which can be called when just the hash value for a flowi is needed. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>