summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)Author
2017-08-07ipv6: sr: define core operations for seg6local lightweight tunnelDavid Lebrun
This patch implements a new type of lightweight tunnel named seg6local. A seg6local lwt is defined by a type of action and a set of parameters. The action represents the operation to perform on the packets matching the lwt's route, and is not necessarily an encapsulation. The set of parameters are arguments for the processing function. Each action is defined in a struct seg6_action_desc within seg6_action_table[]. This structure contains the action, mandatory attributes, the processing function, and a static headroom size required by the action. The mandatory attributes are encoded as a bitmask field. The static headroom is set to a non-zero value when the processing function always add a constant number of bytes to the skb (e.g. the header size for encapsulations). To facilitate rtnetlink-related operations such as parsing, fill_encap, and cmp_encap, each type of action parameter is associated to three function pointers, in seg6_action_params[]. All actions defined in seg6_local.h are detailed in [1]. [1] https://tools.ietf.org/html/draft-filsfils-spring-srv6-network-programming-01 Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07ipv6: sr: export SRH insertion functionsDavid Lebrun
This patch exports the seg6_do_srh_encap() and seg6_do_srh_inline() functions. It also removes the CONFIG_IPV6_SEG6_INLINE knob that enabled the compilation of seg6_do_srh_inline(). This function is now built-in. Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07ipv6: sr: allow SRH insertion with arbitrary segments_left valueDavid Lebrun
The seg6_validate_srh() function only allows SRHs whose active segment is the first segment of the path. However, an application may insert an SRH whose active segment is not the first one. Such an application might be for example an SR-aware Virtual Network Function. This patch enables to insert SRHs with an arbitrary active segment. Signed-off-by: David Lebrun <david.lebrun@uclouvain.be> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net_sched: use void pointer for filter handleWANG Cong
Now we use 'unsigned long fh' as a pointer in every place, it is safe to convert it to a void pointer now. This gets rid of many casts to pointer. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net_sched: refactor notification code for RTM_DELTFILTERWANG Cong
It is confusing to use 'unsigned long fh' as both a handle and a pointer, especially commit 9ee7837449b3 ("net sched filters: fix notification of filter delete with proper handle"). This patch introduces tfilter_del_notify() so that we can pass it as a pointer as before, and we don't need to check RTM_DELTFILTER in tcf_fill_node() any more. This prepares for the next patch. Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Jiri Pirko <jiri@resnulli.us> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07lwtunnel: replace EXPORT_SYMBOL with EXPORT_SYMBOL_GPLRoopa Prabhu
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: ipv6: add second dif to raw socket lookupsDavid Ahern
Add a second device index, sdif, to raw socket lookups. sdif is the index for ingress devices enslaved to an l3mdev. It allows the lookups to consider the enslaved device as well as the L3 domain when searching for a socket. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: ipv6: add second dif to inet6 socket lookupsDavid Ahern
Add a second device index, sdif, to inet6 socket lookups. sdif is the index for ingress devices enslaved to an l3mdev. It allows the lookups to consider the enslaved device as well as the L3 domain when searching for a socket. TCP moves the data in the cb. Prior to tcp_v4_rcv (e.g., early demux) the ingress index is obtained from IPCB using inet_sdif and after tcp_v4_rcv tcp_v4_sdif is used. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: ipv6: add second dif to udp socket lookupsDavid Ahern
Add a second device index, sdif, to udp socket lookups. sdif is the index for ingress devices enslaved to an l3mdev. It allows the lookups to consider the enslaved device as well as the L3 domain when searching for a socket. Early demux lookups are handled in the next patch as part of INET_MATCH changes. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: ipv4: add second dif to multicast source filterDavid Ahern
Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: ipv4: add second dif to raw socket lookupsDavid Ahern
Add a second device index, sdif, to raw socket lookups. sdif is the index for ingress devices enslaved to an l3mdev. It allows the lookups to consider the enslaved device as well as the L3 domain when searching for a socket. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: ipv4: add second dif to inet socket lookupsDavid Ahern
Add a second device index, sdif, to inet socket lookups. sdif is the index for ingress devices enslaved to an l3mdev. It allows the lookups to consider the enslaved device as well as the L3 domain when searching for a socket. TCP moves the data in the cb. Prior to tcp_v4_rcv (e.g., early demux) the ingress index is obtained from IPCB using inet_sdif and after the cb move in tcp_v4_rcv the tcp_v4_sdif helper is used. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: ipv4: add second dif to udp socket lookupsDavid Ahern
Add a second device index, sdif, to udp socket lookups. sdif is the index for ingress devices enslaved to an l3mdev. It allows the lookups to consider the enslaved device as well as the L3 domain when searching for a socket. Early demux lookups are handled in the next patch as part of INET_MATCH changes. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07Merge tag 'mlx5-shared-2017-08-07' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5-shared-2017-08-07 This series includes some mlx5 updates for both net-next and rdma trees. From Saeed, Core driver updates to allow selectively building the driver with or without some large driver components, such as - E-Switch (Ethernet SRIOV support). - Multi-Physical Function Switch (MPFs) support. For that we split E-Switch and MPFs functionalities into separate files. From Erez, Delay mlx5_core events when mlx5 interfaces, namely mlx5_ib, registration is taking place and until it completes. From Rabie, Increase the maximum supported flow counters. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: sched: get rid of struct tc_to_netdevJiri Pirko
Get rid of struct tc_to_netdev which is now just unnecessary container and rather pass per-type structures down to drivers directly. Along with that, consolidate the naming of per-type structure variables in cls_*. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: sched: move prio into cls_commonJiri Pirko
prio is not cls_flower specific, but it is meaningful for all classifiers. Seems that only mlxsw cares about the value. Obviously, cls offload in other drivers is broken. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: sched: push cls related args into cls_common structureJiri Pirko
As ndo_setup_tc is generic offload op for whole tc subsystem, does not really make sense to have cls-specific args. So move them under cls_common structurure which is embedded in all cls structs. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07dsa: push cls_matchall setup_tc processing into a separate functionJiri Pirko
Let dsa_slave_setup_tc be a splitter for specific setup_tc types and push out cls_matchall specific code into a separate function. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: sched: make egress_dev flag part of flower offload structJiri Pirko
Since this is specific to flower now, make it part of the flower offload struct. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: sched: rename TC_SETUP_MATCHALL to TC_SETUP_CLSMATCHALLJiri Pirko
In order to be aligned with the rest of the types, rename TC_SETUP_MATCHALL to TC_SETUP_CLSMATCHALL. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07net: sched: make type an argument for ndo_setup_tcJiri Pirko
Since the type is always present, push it to be a separate argument to ndo_setup_tc. On the way, name the type enum and use it for arg type. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07esp: Fix error handling on layer 2 xmit.Steffen Klassert
esp_output_tail() and esp6_output_tail() can return negative and positive error values. We currently treat only negative values as errors, fix this to treat both cases as error. Fixes: fca11ebde3f0 ("esp4: Reorganize esp_output") Fixes: 383d0350f2cc ("esp6: Reorganize esp_output") Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2017-08-06sctp: remove the typedef sctp_subtype_tXin Long
This patch is to remove the typedef sctp_subtype_t, and replace with union sctp_subtype in the places where it's using this typedef. Note that it doesn't fix many indents although it should, as sctp_disposition_t's removal would mess them up again. So better to fix them when removing sctp_disposition_t in later patch. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06sctp: remove the typedef sctp_event_tXin Long
This patch is to remove the typedef sctp_event_t, and replace with enum sctp_event in the places where it's using this typedef. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06sctp: remove the typedef sctp_event_timeout_tXin Long
This patch is to remove the typedef sctp_event_timeout_t, and replace with enum sctp_event_timeout in the places where it's using this typedef. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06sctp: remove the typedef sctp_state_tXin Long
This patch is to remove the typedef sctp_state_t, and replace with enum sctp_state in the places where it's using this typedef. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06sctp: remove the typedef sctp_ierror_tXin Long
This patch is to remove the typedef sctp_ierror_t, and replace with enum sctp_ierror in the places where it's using this typedef. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06sctp: remove the typedef sctp_xmit_tXin Long
This patch is to remove the typedef sctp_xmit_t, and replace with enum sctp_xmit in the places where it's using this typedef. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06sctp: remove the typedef sctp_transport_cmd_tXin Long
This patch is to remove the typedef sctp_transport_cmd_t, and replace with enum sctp_transport_cmd in the places where it's using this typedef. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06sctp: remove the typedef sctp_scope_tXin Long
This patch is to remove the typedef sctp_scope_t, and replace with enum sctp_scope in the places where it's using this typedef. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06sctp: remove the typedef sctp_scope_policy_tXin Long
This patch is to remove the typedef sctp_scope_policy_t and keep it's members as an anonymous enum. It is also to define SCTP_SCOPE_POLICY_MAX to replace the num 3 in sysctl.c to make codes clear. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06sctp: remove the typedef sctp_retransmit_reason_tXin Long
This patch is to remove the typedef sctp_retransmit_reason_t, and replace with enum sctp_retransmit_reason in the places where it's using this typedef. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06sctp: remove the typedef sctp_lower_cwnd_tXin Long
This patch is to remove the typedef sctp_lower_cwnd_t, and replace with enum sctp_lower_cwnd in the places where it's using this typedef. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06net: dsa: User per-cpu 64-bit statisticsFlorian Fainelli
During testing with a background iperf pushing 1Gbit/sec worth of traffic and having both ifconfig and ethtool collect statistics, we could see quite frequent deadlocks. Convert the often accessed DSA slave network devices statistics to per-cpu 64-bit statistics to remove these deadlocks and provide fast efficient statistics updates. Fixes: f613ed665bb3 ("net: dsa: Add support for 64-bit statistics") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06tcp: consolidate congestion control undo functionsYuchung Cheng
Most TCP congestion controls are using identical logic to undo cwnd except BBR. This patch consolidates these similar functions to the one used currently by Reno and others. Suggested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06tcp: fix cwnd undo in Reno and HTCP congestion controlsYuchung Cheng
Using ssthresh to revert cwnd is less reliable when ssthresh is bounded to 2 packets. This patch uses an existing variable in TCP "prior_cwnd" that snapshots the cwnd right before entering fast recovery and RTO recovery in Reno. This fixes the issue discussed in netdev thread: "A buggy behavior for Linux TCP Reno and HTCP" https://www.spinics.net/lists/netdev/msg444955.html Suggested-by: Neal Cardwell <ncardwell@google.com> Reported-by: Wei Sun <unlcsewsun@gmail.com> Signed-off-by: Yuchung Cheng <ncardwell@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06udp: no need to preserve skb->dstPaolo Abeni
__ip_options_echo() does not need anymore skb->dst, so we can avoid explicitly preserving it for its own sake. This is almost a revert of commit 0ddf3fb2c43d ("udp: preserve skb->dst if required for IP options processing") plus some lifting to fit later changes. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06Revert "ipv4: keep skb->dst around in presence of IP options"Paolo Abeni
ip_options_echo() does not use anymore the skb->dst and don't need to keep the dst around for options's sake only. This reverts commit 34b2cef20f19c87999fff3da4071e66937db9644. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06ip/options: explicitly provide net ns to __ip_options_echo()Paolo Abeni
__ip_options_echo() uses the current network namespace, and currently retrives it via skb->dst->dev. This commit adds an explicit 'net' argument to __ip_options_echo() and update all the call sites to provide it, usually via a simpler sock_net(). After this change, __ip_options_echo() no more needs to access skb->dst and we can drop a couple of hack to preserve such info in the rx path. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-06IP: do not modify ingress packet IP option in ip_options_echo()Paolo Abeni
While computing the response option set for LSRR, ip_options_echo() also changes the ingress packet LSRR addresses list, setting the last one to the dst specific address for the ingress packet - via memset(start[ ... The only visible effect of such change - beyond possibly damaging shared/cloned skbs - is modifying the data carried by ICMP replies changing the header information for reported the ingress packet, which violates RFC1122 3.2.2.6. All the others call sites just ignore the ingress packet IP options after calling ip_options_echo() Note that the last element in the LSRR option address list for the reply packet will be properly set later in the ip output path via ip_options_build(). This buggy memset() predates git history and apparently was present into the initial ip_options_echo() implementation in linux 1.3.30 but still looks wrong. The removal of the fib_compute_spec_dst() call will help completely dropping the skb->dst usage by __ip_options_echo() with a later patch. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04net: sched: avoid atomic swap in tcf_exts_changeJiri Pirko
tcf_exts_change is always called on newly created exts, which are not used on fastpath. Therefore, simple struct copy is enough. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04net: sched: cls_u32: no need to call tcf_exts_change for newly allocated structJiri Pirko
As the n struct was allocated right before u32_set_parms call, no need to use tcf_exts_change to do atomic change, and we can just fill-up the unused exts struct directly by tcf_exts_validate. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04net: sched: cls_route: no need to call tcf_exts_change for newly allocated ↵Jiri Pirko
struct As the f struct was allocated right before route4_set_parms call, no need to use tcf_exts_change to do atomic change, and we can just fill-up the unused exts struct directly by tcf_exts_validate. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04net: sched: cls_flow: no need to call tcf_exts_change for newly allocated structJiri Pirko
As the fnew struct just was allocated, so no need to use tcf_exts_change to do atomic change, and we can just fill-up the unused exts struct directly by tcf_exts_validate. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04net: sched: cls_cgroup: no need to call tcf_exts_change for newly allocated ↵Jiri Pirko
struct As the new struct just was allocated, so no need to use tcf_exts_change to do atomic change, and we can just fill-up the unused exts struct directly by tcf_exts_validate. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04net: sched: cls_bpf: no need to call tcf_exts_change for newly allocated structJiri Pirko
As the prog struct was allocated right before cls_bpf_set_parms call, no need to use tcf_exts_change to do atomic change, and we can just fill-up the unused exts struct directly by tcf_exts_validate. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04net: sched: cls_basic: no need to call tcf_exts_change for newly allocated ↵Jiri Pirko
struct As the f struct was allocated right before basic_set_parms call, no need to use tcf_exts_change to do atomic change, and we can just fill-up the unused exts struct directly by tcf_exts_validate. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04net: sched: cls_matchall: no need to call tcf_exts_change for newly ↵Jiri Pirko
allocated struct As the head struct was allocated right before mall_set_parms call, no need to use tcf_exts_change to do atomic change, and we can just fill-up the unused exts struct directly by tcf_exts_validate. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04net: sched: cls_fw: no need to call tcf_exts_change for newly allocated structJiri Pirko
As the f struct was allocated right before fw_set_parms call, no need to use tcf_exts_change to do atomic change, and we can just fill-up the unused exts struct directly by tcf_exts_validate. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04net: sched: cls_flower: no need to call tcf_exts_change for newly allocated ↵Jiri Pirko
struct As the f struct was allocated right before fl_set_parms call, no need to use tcf_exts_change to do atomic change, and we can just fill-up the unused exts struct directly by tcf_exts_validate. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>