summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2015-05-14tipc: simplify resetting and disabling of bearersJon Paul Maloy
Since commit 4b475e3f2f8e4e241de101c8240f1d74d0470494 ("tipc: eliminate delayed link deletion at link failover") the extra boolean parameter "shutting_down" is not any longer needed for the functions bearer_disable() and tipc_link_delete_list(). Furhermore, the function tipc_link_reset_links(), called from bearer_reset() is now unnecessary. We can just as well delete all the links, as we do in bearer_disable(), and start over with creating new links. This commit introduces those changes. Reviewed-by: Erik Hugne <erik.hugne@ericsson.com> Reviewed-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-14netfilter: xt_MARK: Add ARP supportZhang Chunyu
Add arpt_MARK to xt_mark. The corresponding userspace update is available at: http://git.netfilter.org/arptables/commit/?id=4bb2f8340783fd3a3f70aa6f8807428a280f8474 Signed-off-by: Zhang Chunyu <zhangcy@cn.fujitsu.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-14netfilter: ipset: deinline ip_set_put_extensions()Denys Vlasenko
On x86 allyesconfig build: The function compiles to 489 bytes of machine code. It has 25 callsites. text data bss dec hex filename 82441375 22255384 20627456 125324215 7784bb7 vmlinux.before 82434909 22255384 20627456 125317749 7783275 vmlinux Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> CC: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> CC: Eric W. Biederman <ebiederm@xmission.com> CC: David S. Miller <davem@davemloft.net> CC: Jan Engelhardt <jengelh@medozas.de> CC: Jiri Pirko <jpirko@redhat.com> CC: linux-kernel@vger.kernel.org CC: netdev@vger.kernel.org CC: netfilter-devel@vger.kernel.org Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-14netfilter: bridge: free nf_bridge info on xmitFlorian Westphal
nf_bridge information is only needed for -m physdev, so we can always free it after POST_ROUTING. This has the advantage that allocation and free will typically happen on the same cpu. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-14netfilter: bridge: neigh_head and physoutdev can't be used at same timeFlorian Westphal
The neigh_header is only needed when we detect DNAT after prerouting and neigh cache didn't have a mac address for us. The output port has not been chosen yet so we can re-use the storage area, bringing struct size down to 32 bytes on x86_64. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-14Bluetooth: Fix remote name event return directly.Wesley Kuo
This patch fixes hci_remote_name_evt dose not resolve name during discovery status is RESOLVING. Before simultaneous dual mode scan enabled, hci_check_pending_name will set discovery status to STOPPED eventually. Signed-off-by: Wesley Kuo <wesley.kuo@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2015-05-14netfilter: add netfilter ingress hook after handle_ing() under unique static keyPablo Neira
This patch adds the Netfilter ingress hook just after the existing tc ingress hook, that seems to be the consensus solution for this. Note that the Netfilter hook resides under the global static key that enables ingress filtering. Nonetheless, Netfilter still also has its own static key for minimal impact on the existing handle_ing(). * Without this patch: Result: OK: 6216490(c6216338+d152) usec, 100000000 (60byte,0frags) 16086246pps 7721Mb/sec (7721398080bps) errors: 100000000 42.46% kpktgend_0 [kernel.kallsyms] [k] __netif_receive_skb_core 25.92% kpktgend_0 [kernel.kallsyms] [k] kfree_skb 7.81% kpktgend_0 [pktgen] [k] pktgen_thread_worker 5.62% kpktgend_0 [kernel.kallsyms] [k] ip_rcv 2.70% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_internal 2.34% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_sk 1.44% kpktgend_0 [kernel.kallsyms] [k] __build_skb * With this patch: Result: OK: 6214833(c6214731+d101) usec, 100000000 (60byte,0frags) 16090536pps 7723Mb/sec (7723457280bps) errors: 100000000 41.23% kpktgend_0 [kernel.kallsyms] [k] __netif_receive_skb_core 26.57% kpktgend_0 [kernel.kallsyms] [k] kfree_skb 7.72% kpktgend_0 [pktgen] [k] pktgen_thread_worker 5.55% kpktgend_0 [kernel.kallsyms] [k] ip_rcv 2.78% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_internal 2.06% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_sk 1.43% kpktgend_0 [kernel.kallsyms] [k] __build_skb * Without this patch + tc ingress: tc filter add dev eth4 parent ffff: protocol ip prio 1 \ u32 match ip dst 4.3.2.1/32 Result: OK: 9269001(c9268821+d179) usec, 100000000 (60byte,0frags) 10788648pps 5178Mb/sec (5178551040bps) errors: 100000000 40.99% kpktgend_0 [kernel.kallsyms] [k] __netif_receive_skb_core 17.50% kpktgend_0 [kernel.kallsyms] [k] kfree_skb 11.77% kpktgend_0 [cls_u32] [k] u32_classify 5.62% kpktgend_0 [kernel.kallsyms] [k] tc_classify_compat 5.18% kpktgend_0 [pktgen] [k] pktgen_thread_worker 3.23% kpktgend_0 [kernel.kallsyms] [k] tc_classify 2.97% kpktgend_0 [kernel.kallsyms] [k] ip_rcv 1.83% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_internal 1.50% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_sk 0.99% kpktgend_0 [kernel.kallsyms] [k] __build_skb * With this patch + tc ingress: tc filter add dev eth4 parent ffff: protocol ip prio 1 \ u32 match ip dst 4.3.2.1/32 Result: OK: 9308218(c9308091+d126) usec, 100000000 (60byte,0frags) 10743194pps 5156Mb/sec (5156733120bps) errors: 100000000 42.01% kpktgend_0 [kernel.kallsyms] [k] __netif_receive_skb_core 17.78% kpktgend_0 [kernel.kallsyms] [k] kfree_skb 11.70% kpktgend_0 [cls_u32] [k] u32_classify 5.46% kpktgend_0 [kernel.kallsyms] [k] tc_classify_compat 5.16% kpktgend_0 [pktgen] [k] pktgen_thread_worker 2.98% kpktgend_0 [kernel.kallsyms] [k] ip_rcv 2.84% kpktgend_0 [kernel.kallsyms] [k] tc_classify 1.96% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_internal 1.57% kpktgend_0 [kernel.kallsyms] [k] netif_receive_skb_sk Note that the results are very similar before and after. I can see gcc gets the code under the ingress static key out of the hot path. Then, on that cold branch, it generates the code to accomodate the netfilter ingress static key. My explanation for this is that this reduces the pressure on the instruction cache for non-users as the new code is out of the hot path, and it comes with minimal impact for tc ingress users. Using gcc version 4.8.4 on: Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Byte Order: Little Endian CPU(s): 8 [...] L1d cache: 16K L1i cache: 64K L2 cache: 2048K L3 cache: 8192K Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-14net: add CONFIG_NET_INGRESS to enable ingress filteringPablo Neira
This new config switch enables the ingress filtering infrastructure that is controlled through the ingress_needed static key. This prepares the introduction of the Netfilter ingress hook that resides under this unique static key. Note that CONFIG_SCH_INGRESS automatically selects this, that should be no problem since this also depends on CONFIG_NET_CLS_ACT. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-14netfilter: add hook list to nf_hook_statePablo Neira
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-14vlan: Correctly propagate promisc|allmulti flags in notifier.Vlad Yasevich
Currently vlan notifier handler will try to update all vlans for a device when that device comes up. A problem occurs, however, when the vlan device was set to promiscuous, but not by the user (ex: a bridge). In that case, dev->gflags are not updated. What results is that the lower device ends up with an extra promiscuity count. Here are the backtraces that prove this: [62852.052179] [<ffffffff814fe248>] __dev_set_promiscuity+0x38/0x1e0 [62852.052186] [<ffffffff8160bcbb>] ? _raw_spin_unlock_bh+0x1b/0x40 [62852.052188] [<ffffffff814fe4be>] ? dev_set_rx_mode+0x2e/0x40 [62852.052190] [<ffffffff814fe694>] dev_set_promiscuity+0x24/0x50 [62852.052194] [<ffffffffa0324795>] vlan_dev_open+0xd5/0x1f0 [8021q] [62852.052196] [<ffffffff814fe58f>] __dev_open+0xbf/0x140 [62852.052198] [<ffffffff814fe88d>] __dev_change_flags+0x9d/0x170 [62852.052200] [<ffffffff814fe989>] dev_change_flags+0x29/0x60 The above comes from the setting the vlan device to IFF_UP state. [62852.053569] [<ffffffff814fe248>] __dev_set_promiscuity+0x38/0x1e0 [62852.053571] [<ffffffffa032459b>] ? vlan_dev_set_rx_mode+0x2b/0x30 [8021q] [62852.053573] [<ffffffff814fe8d5>] __dev_change_flags+0xe5/0x170 [62852.053645] [<ffffffff814fe989>] dev_change_flags+0x29/0x60 [62852.053647] [<ffffffffa032334a>] vlan_device_event+0x18a/0x690 [8021q] [62852.053649] [<ffffffff8161036c>] notifier_call_chain+0x4c/0x70 [62852.053651] [<ffffffff8109d456>] raw_notifier_call_chain+0x16/0x20 [62852.053653] [<ffffffff814f744d>] call_netdevice_notifiers+0x2d/0x60 [62852.053654] [<ffffffff814fe1a3>] __dev_notify_flags+0x33/0xa0 [62852.053656] [<ffffffff814fe9b2>] dev_change_flags+0x52/0x60 [62852.053657] [<ffffffff8150cd57>] do_setlink+0x397/0xa40 And this one comes from the notification code. What we end up with is a vlan with promiscuity count of 1 and and a physical device with a promiscuity count of 2. They should both have a count 1. To resolve this issue, vlan code can use dev_get_flags() api which correctly masks promiscuity and allmulti flags. Signed-off-by: Vlad Yasevich <vyasevic@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13net: Reserve skb headroom and set skb->dev even if using __alloc_skbAlexander Duyck
When I had inlined __alloc_rx_skb into __netdev_alloc_skb and __napi_alloc_skb I had overlooked the fact that there was a return in the __alloc_rx_skb. As a result we weren't reserving headroom or setting the skb->dev in certain cases. This change corrects that by adding a couple of jump labels to jump to depending on __alloc_skb either succeeding or failing. Fixes: 9451980a6646 ("net: Use cached copy of pfmemalloc to avoid accessing page") Reported-by: Felipe Balbi <balbi@ti.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com> Tested-by: Kevin Hilman <khilman@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13geneve_core: identify as driver library in modules descriptionJohn W. Linville
Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13geneve: Rename support library as geneve_coreJohn W. Linville
net/ipv4/geneve.c -> net/ipv4/geneve_core.c This name better reflects the purpose of the module. Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13geneve: move definition of geneve_hdr() to geneve.hJohn W. Linville
This is a static inline with identical definitions in multiple places... Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13geneve: remove MODULE_ALIAS_RTNL_LINK from net/ipv4/geneve.cJohn W. Linville
This file is essentially a library for implementing the geneve encapsulation protocol. The file does not register any rtnl_link_ops, so the MODULE_ALIAS_RTNL_LINK macro is inappropriate here. Signed-off-by: John W. Linville <linville@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13packet: rollover statisticsWillem de Bruijn
Rollover indicates exceptional conditions. Export a counter to inform socket owners of this state. If no socket with sufficient room is found, rollover fails. Also count these events. Finally, also count when flows are rolled over early thanks to huge flow detection, to validate its correctness. Tested: Read counters in bench_rollover on all other tests in the patchset Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13packet: rollover huge flows before small flowsWillem de Bruijn
Migrate flows from a socket to another socket in the fanout group not only when the socket is full. Start migrating huge flows early, to divert possible 4-tuple attacks without affecting normal traffic. Introduce fanout_flow_is_huge(). This detects huge flows, which are defined as taking up more than half the load. It does so cheaply, by storing the rxhashes of the N most recent packets. If over half of these are the same rxhash as the current packet, then drop it. This only protects against 4-tuple attacks. N is chosen to fit all data in a single cache line. Tested: Ran bench_rollover for 10 sec with 1.5 Mpps of single flow input. lpbb5:/export/hda3/willemb# ./bench_rollover -l 1000 -r -s cpu rx rx.k drop.k rollover r.huge r.failed 0 14 14 0 0 0 0 1 20 20 0 0 0 0 2 16 16 0 0 0 0 3 6168824 6168824 0 4867721 4867721 0 4 4867741 4867741 0 0 0 0 5 12 12 0 0 0 0 6 15 15 0 0 0 0 7 17 17 0 0 0 0 Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13packet: rollover lock contention avoidanceWillem de Bruijn
Rollover has to call packet_rcv_has_room on sockets in the fanout group to find a socket to migrate to. This operation is expensive especially if the packet sockets use rings, when a lock has to be acquired. Avoid pounding on the lock by all sockets by temporarily marking a socket as "under memory pressure" when such pressure is detected. While set, only the socket owner may call packet_rcv_has_room on the socket. Once it detects normal conditions, it clears the flag. The socket is not used as a victim by any other socket in the meantime. Under reasonably balanced load, each socket writer frequently calls packet_rcv_has_room and clears its own pressure field. As a backup for when the socket is rarely written to, also clear the flag on reading (packet_recvmsg, packet_poll) if this can be done cheaply (i.e., without calling packet_rcv_has_room). This is only for edge cases. Tested: Ran bench_rollover: a process with 8 sockets in a single fanout group, each pinned to a single cpu that receives one nic recv interrupt. RPS and RFS are disabled. The benchmark uses packet rx_ring, which has to take a lock when determining whether a socket has room. Sent 3.5 Mpps of UDP traffic with sufficient entropy to spread uniformly across the packet sockets (and inserted an iptables rule to drop in PREROUTING to avoid protocol stack processing). Without this patch, all sockets try to migrate traffic to neighbors, causing lock contention when searching for a non- empty neighbor. The lock is the top 9 entries. perf record -a -g sleep 5 - 17.82% bench_rollover [kernel.kallsyms] [k] _raw_spin_lock - _raw_spin_lock - 99.00% spin_lock + 81.77% packet_rcv_has_room.isra.41 + 18.23% tpacket_rcv + 0.84% packet_rcv_has_room.isra.41 + 5.20% ksoftirqd/6 [kernel.kallsyms] [k] _raw_spin_lock + 5.15% ksoftirqd/1 [kernel.kallsyms] [k] _raw_spin_lock + 5.14% ksoftirqd/2 [kernel.kallsyms] [k] _raw_spin_lock + 5.12% ksoftirqd/7 [kernel.kallsyms] [k] _raw_spin_lock + 5.12% ksoftirqd/5 [kernel.kallsyms] [k] _raw_spin_lock + 5.10% ksoftirqd/4 [kernel.kallsyms] [k] _raw_spin_lock + 4.66% ksoftirqd/0 [kernel.kallsyms] [k] _raw_spin_lock + 4.45% ksoftirqd/3 [kernel.kallsyms] [k] _raw_spin_lock + 1.55% bench_rollover [kernel.kallsyms] [k] packet_rcv_has_room.isra.41 On net-next with this patch, this lock contention is no longer a top entry. Most time is spent in the actual read function. Next up are other locks: + 15.52% bench_rollover bench_rollover [.] reader + 4.68% swapper [kernel.kallsyms] [k] memcpy_erms + 2.77% swapper [kernel.kallsyms] [k] packet_lookup_frame.isra.51 + 2.56% ksoftirqd/1 [kernel.kallsyms] [k] memcpy_erms + 2.16% swapper [kernel.kallsyms] [k] tpacket_rcv + 1.93% swapper [kernel.kallsyms] [k] mlx4_en_process_rx_cq Looking closer at the remaining _raw_spin_lock, the cost of probing in rollover is now comparable to the cost of taking the lock later in tpacket_rcv. - 1.51% swapper [kernel.kallsyms] [k] _raw_spin_lock - _raw_spin_lock + 33.41% packet_rcv_has_room + 28.15% tpacket_rcv + 19.54% enqueue_to_backlog + 6.45% __free_pages_ok + 2.78% packet_rcv_fanout + 2.13% fanout_demux_rollover + 2.01% netif_receive_skb_internal Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13packet: rollover only to socket with headroomWillem de Bruijn
Only migrate flows to sockets that have sufficient headroom, where sufficient is defined as having at least 25% empty space. The kernel has three different buffer types: a regular socket, a ring with frames (TPACKET_V[12]) or a ring with blocks (TPACKET_V3). The latter two do not expose a read pointer to the kernel, so headroom is not computed easily. All three needs a different implementation to estimate free space. Tested: Ran bench_rollover for 10 sec with 1.5 Mpps of single flow input. bench_rollover has as many sockets as there are NIC receive queues in the system. Each socket is owned by a process that is pinned to one of the receive cpus. RFS is disabled. RPS is enabled with an identity mapping (cpu x -> cpu x), to count drops with softnettop. lpbb5:/export/hda3/willemb# ./bench_rollover -r -l 1000 -s Press [Enter] to exit cpu rx rx.k drop.k rollover r.huge r.failed 0 16 16 0 0 0 0 1 21 21 0 0 0 0 2 5227502 5227502 0 0 0 0 3 18 18 0 0 0 0 4 6083289 6083289 0 5227496 0 0 5 22 22 0 0 0 0 6 21 21 0 0 0 0 7 9 9 0 0 0 0 Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13packet: rollover prepare: per-socket stateWillem de Bruijn
Replace rollover state per fanout group with state per socket. Future patches will add fields to the new structure. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13packet: rollover prepare: move code out of callsitesWillem de Bruijn
packet_rcv_fanout calls fanout_demux_rollover twice. Move all rollover logic into the callee to simplify these callsites, especially with upcoming changes. The main differences between the two callsites is that the FLAG variant tests whether the socket previously selected by another mode (RR, RND, HASH, ..) has room before migrating flows, whereas the rollover mode has no original socket to test. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13ipv4: __ip_local_out_sk() is staticEric Dumazet
__ip_local_out_sk() is only used from net/ipv4/ip_output.c net/ipv4/ip_output.c:94:5: warning: symbol '__ip_local_out_sk' was not declared. Should it be static? Fixes: 7026b1ddb6b8 ("netfilter: Pass socket pointer down through okfn().") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13tcp/dccp: tw_timer_handler() is staticEric Dumazet
tw_timer_handler() is only used from net/ipv4/inet_timewait_sock.c Fixes: 789f558cfb36 ("tcp/dccp: get rid of central timewait timer") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13tc: introduce Flower classifierJiri Pirko
This patch introduces a flow-based filter. So far, the very essential packet fields are supported. This patch is only the first step. There is a lot of potential performance improvements possible to implement. Also a lot of features are missing now. They will be addressed in follow-up patches. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13flow_dissector: change port array into src, dst tupleJiri Pirko
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13flow_dissector: introduce support for Ethernet addressesJiri Pirko
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13flow_dissector: introduce support for ipv6 addresssesJiri Pirko
So far, only hashes made out of ipv6 addresses could be dissected. This patch introduces support for dissection of full ipv6 addresses. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13flow_dissect: use programable dissector in skb_flow_dissect and friendsJiri Pirko
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13flow_dissector: introduce programable flow_dissectorJiri Pirko
Introduce dissector infrastructure which allows user to specify which parts of skb he wants to dissect. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13flow_dissector: fix doc for skb_get_poffJiri Pirko
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13net: move netdev_pick_tx and dependencies to net/core/dev.cJiri Pirko
next to its user. No relation to flow_dissector so it makes no sense to have it in flow_dissector.c Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13net: move __skb_tx_hash to dev.cJiri Pirko
__skb_tx_hash function has no relation to flow_dissect so just move it to dev.c Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13flow_dissector: fix doc for __skb_get_hash and remove couple of empty linesJiri Pirko
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13net: move *skb_get_poff declarations into correct headerJiri Pirko
Since these functions are defined in flow_dissector.c, move header declarations from skbuff.h into flow_dissector.h Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13net: change name of flow_dissector header to match the .c file nameJiri Pirko
add couple of empty lines on the way. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13net: sched: use counter to break reclassify loopsFlorian Westphal
Seems all we want here is to avoid endless 'goto reclassify' loop. tc_classify_compat even resets this counter when something other than TC_ACT_RECLASSIFY is returned, so this skb-counter doesn't break hypothetical loops induced by something other than perpetual TC_ACT_RECLASSIFY return values. skb_act_clone is now identical to skb_clone, so just use that. Tested with following (bogus) filter: tc filter add dev eth0 parent ffff: \ protocol ip u32 match u32 0 0 police rate 10Kbit burst \ 64000 mtu 1500 action reclassify Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Four minor merge conflicts: 1) qca_spi.c renamed the local variable used for the SPI device from spi_device to spi, meanwhile the spi_set_drvdata() call got moved further up in the probe function. 2) Two changes were both adding new members to codel params structure, and thus we had overlapping changes to the initializer function. 3) 'net' was making a fix to sk_release_kernel() which is completely removed in 'net-next'. 4) In net_namespace.c, the rtnl_net_fill() call for GET operations had the command value fixed, meanwhile 'net-next' adjusted the argument signature a bit. This also matches example merge resolutions posted by Stephen Rothwell over the past two days. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13switchdev: don't use anonymous union on switchdev attr/obj structsScott Feldman
Older gcc versions (e.g. gcc version 4.4.6) don't like anonymous unions which was causing build issues on the newly added switchdev attr/obj structs. Fix this by using named union on structs. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Reported-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13switchdev: sparse warning: pass ipv4 fib dst as network-byte orderScott Feldman
And let driver convert it to host-byte order as needed. Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13switchdev: sparse warning: make __switchdev_port_obj_add staticScott Feldman
Signed-off-by: Scott Feldman <sfeldma@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-05-13netfilter: ipset: Use better include files in xt_set.cJozsef Kadlecsik
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13netfilter: ipset: Improve preprocessor macros checksSergey Popovich
Check if mandatory MTYPE, HTYPE and HOST_MASK macros defined. Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13netfilter: ipset: Fix hashing for ipv6 setsSergey Popovich
HKEY_DATALEN remains defined after first inclusion of ip_set_hash_gen.h, so it is incorrectly reused for IPv6 code. Undefine HKEY_DATALEN in ip_set_hash_gen.h at the end. Also remove some useless defines of HKEY_DATALEN in ip_set_hash_{ip{,mark,port},netiface}.c as ip_set_hash_gen.h defines it correctly for such set types anyway. Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13netfilter: ipset: Check for comment netlink attribute lengthSergey Popovich
Ensure userspace supplies string not longer than IPSET_MAX_COMMENT_SIZE. Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13netfilter: ipset: Return bool values instead of intSergey Popovich
Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13netfilter: ipset: Use HOST_MASK literal to represent host address CIDR lenSergey Popovich
Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13netfilter: ipset: Check IPSET_ATTR_PORT only onceSergey Popovich
We do not need to check tb[IPSET_ATTR_PORT] != NULL before retrieving port, as this attribute is known to exist due to ip_set_attr_netorder() returning true only when attribute exists and it is in network byte order. Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13netfilter: ipset: Return ipset error instead of boolSergey Popovich
Statement ret = func1() || func2() returns 0 when both func1() and func2() return 0, or 1 if func1() or func2() returns non-zero. However in our case func1() and func2() returns error code on failure, so it seems good to propagate such error codes, rather than returning 1 in case of failure. Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13netfilter: ipset: Preprocessor directices cleanupSergey Popovich
* Undefine mtype_data_reset_elem before defining. * Remove duplicated mtype_gc_init undefine, move mtype_gc_init define closer to mtype_gc define. * Use htype instead of HTYPE in IPSET_TOKEN(HTYPE, _create)(). * Remove PF definition from sets: no more used. Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2015-05-13netfilter: ipset: No need to make nomatch bitfieldSergey Popovich
We do not store cidr packed with no match, so there is no need to make nomatch bitfield. This simplifies mtype_data_reset_flags() a bit. Signed-off-by: Sergey Popovich <popovich_sergei@mail.ua> Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>