summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-01-26net/smc: smc_poll improvementsUrsula Braun
Increase the socket refcount during poll wait. Take the socket lock before checking socket state. For a listening socket return a mask independent of state SMC_ACTIVE and cover errors or closed state as well. Get rid of the accept_q loop in smc_accept_poll(). Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-26net/smc: handle device, port, and QP error eventsUrsula Braun
RoCE device changes cause an IB event, processed in the global event handler for the ROCE device. Problems for a certain Queue Pair cause a QP event, processed in the QP event handler for this QP. Among those events are port errors and other fatal device errors. All link groups using such a port or device must be terminated in those cases. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-26Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2018-01-26 One last patch for this development cycle: 1) Add ESN support for IPSec HW offload. From Yossef Efraim. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-26sfc: add suffix to large constant in ptpBert Kenward
Fixes: 1280c0f8aafc ("sfc: support second + quarter ns time format for receive datapath") Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-26Merge branch 'net-ipv6-Add-support-for-ONLINK-flag'David S. Miller
David Ahern says: ==================== net/ipv6: Add support for ONLINK flag Add support for RTNH_F_ONLINK with ipv6 routes. First patch moves existing gateway validation into helper. The onlink flag requires a different set of checks and the existing validation makes ip6_route_info_create long enough. Second patch makes the table id and lookup flag an option to ip6_nh_lookup_table. onlink check needs to verify the gateway without the RT6_LOOKUP_F_IFACE flag and PBR with VRF means the table id can vary between the table the route is inserted and the VRF the egress device is enslaved to. Third patch adds support for RTNH_F_ONLINK. I have a set of test cases in a format based on the framework Ido and Jiri are working on. Once that goes in I will adapt the script and submit. v2 - removed table id check. Too constraining for PBR with VRF use cases ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-26net/ipv6: Add support for onlink flagDavid Ahern
Similar to IPv4 allow routes to be added with the RTNH_F_ONLINK flag. The onlink option requires a gateway and a nexthop device. Any unicast gateway is allowed (including IPv4 mapped addresses and unresolved ones) as long as the gateway is not a local address and if it resolves it must match the given device. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-26net/ipv6: Add flags and table id to ip6_nh_lookup_tableDavid Ahern
onlink verification needs to do a lookup in potentially different table than the table in fib6_config and without the RT6_LOOKUP_F_IFACE flag. Change ip6_nh_lookup_table to take table id and flags as input arguments. Both verifications want to ignore link state, so add that flag can stay in the lookup helper. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-26net/ipv6: Move gateway validation into helperDavid Ahern
Move existing code to validate nexthop into a helper. Follow on patch adds support for nexthops marked with onlink, and this helper keeps the complexity of ip6_route_info_create in check. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-26dt-bindings: can: rcar_can: document r8a774[35] can supportFabrizio Castro
Document "renesas,can-r8a7743" and "renesas,can-r8a7745" compatible strings. Since the fallback compatible string ("renesas,rcar-gen2-can") activates the right code in the driver, no driver change is needed. Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com> Reviewed-by: Biju Das <biju.das@bp.renesas.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-01-26can: migrate documentation to restructured textRobert Schwebel
The kernel documentation is now restructured text. Convert the SocketCAN documentation and include it in the toplevel kernel documentation. This patch doesn't do any content change. All references to can.txt in the code are converted to can.rst. Signed-off-by: Robert Schwebel <r.schwebel@pengutronix.de> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-01-26Documentation/devicetree: mpc5200.txt: fix pointer to location of ↵Marc Kleine-Budde
fsl,mpc5200-mscan node This patch fixes the pointer to the location of the fsl,mpc5200-mscan device tree node binding documentation. Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2018-01-26mt76: validate rx CCMP PNFelix Fietkau
Apparently hardware does not perform CCMP PN validation in hardware, so we need to take care of this in the driver. This is important for protecting against replay attacks. Since validation of fragmented frames is more complex, the CCMP header for those is preserved. To keep the counter in sync, the first fragment is verified by both mt76 and mac80211, and all other fragments only by mac80211. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-01-26mt76: pass the per-vif wcid to the core for multicast rxFelix Fietkau
Preparation for adding software rx CCMP PN validation Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-01-26mt76: split mt76_rx_completeFelix Fietkau
Add a separate function for processing frames after A-MPDU reordering, reduce code duplication Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-01-26mt76: implement A-MPDU rx reordering in the driver codeFelix Fietkau
This is required for performing CCMP PN validation in software Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-01-26mt76: get station pointer by wcid and pass it to mac80211Felix Fietkau
Avoids the rhashtable lookup based on the MAC address inside mac80211 Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-01-26mt76: add an intermediate struct for rx status informationFelix Fietkau
Preparation for passing in more internal rx data via skb->cb Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-01-26mt76: fix TSF value in probe responsesFelix Fietkau
Like beacons, probe responses need a hardware-generated TSF value. Set the flag that causes the hw to generate it Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-01-26mt76: retry rx polling as long as there is budget leftFelix Fietkau
Sending frames to mac80211 needs time, which could allow for more rx packets to end up in the DMA ring. Retry polling until there are no more frames left. Improves rx latency under load. Signed-off-by: Felix Fietkau <nbd@nbd.name> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2018-01-26Merge branch 'linux-4.15' of git://github.com/skeggsb/linux into drm-fixesDave Airlie
Single irq regression fix * 'linux-4.15' of git://github.com/skeggsb/linux: drm/nouveau: Move irq setup/teardown to pci ctor/dtor
2018-01-25net/ipv4: Allow send to local broadcast from a socket bound to a VRFDavid Ahern
Message sends to the local broadcast address (255.255.255.255) require uc_index or sk_bound_dev_if to be set to an egress device. However, responses or only received if the socket is bound to the device. This is overly constraining for processes running in an L3 domain. This patch allows a socket bound to the VRF device to send to the local broadcast address by using IP_UNICAST_IF to set the egress interface with packet receipt handled by the VRF binding. Similar to IP_MULTICAST_IF, relax the constraint on setting IP_UNICAST_IF if a socket is bound to an L3 master device. In this case allow uc_index to be set to an enslaved if sk_bound_dev_if is an L3 master device and is the master device for the ifindex. In udp and raw sendmsg, allow uc_index to override the oif if uc_index master device is oif (ie., the oif is an L3 master and the index is an L3 slave). Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25net: vrf: Add support for sends to local broadcast addressDavid Ahern
Sukumar reported that sends to the local broadcast address (255.255.255.255) are broken. Check for the address in vrf driver and do not redirect to the VRF device - similar to multicast packets. With this change sockets can use SO_BINDTODEVICE to specify an egress interface and receive responses. Note: the egress interface can not be a VRF device but needs to be the enslaved device. https://bugzilla.kernel.org/show_bug.cgi?id=198521 Reported-by: Sukumar Gopalakrishnan <sukumarg1973@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25Merge branch 'net-erspan-add-support-for-openvswitch'David S. Miller
William Tu says: ==================== net: erspan: add support for openvswitch The first patch refactors the erspan header definitions. Originally, the erspan fields are defined as a group into a __be16 field, and use mask and offset to access each field. This is more costly due to calling ntohs/htons and error-prone. The first patch changes it to use bitfields. The second patch creates erspan.h in UAPI and move the definition 'struct erspan_metadata' to it for later openvswitch to use. The final patch introduces the new OVS tunnel key attribute, OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS, to program both v1 and v2 erspan tunnel for openvswitch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25openvswitch: add erspan version I and II supportWilliam Tu
The patch adds support for openvswitch to configure erspan v1 and v2. The OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS attr is added to uapi as a binary blob to support all ERSPAN v1 and v2's fields. Note that Previous commit "openvswitch: Add erspan tunnel support." was reverted since it does not design properly. Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25net: erspan: create erspan metadata uapi headerWilliam Tu
The patch adds a new uapi header file, erspan.h, and moves the 'struct erspan_metadata' from internal erspan.h to it. Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25net: erspan: use bitfield instead of mask and offsetWilliam Tu
Originally the erspan fields are defined as a group into a __be16 field, and use mask and offset to access each field. This is more costly due to calling ntohs/htons. The patch changes it to use bitfields. Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25r8169: fix memory corruption on retrieval of hardware statistics.Francois Romieu
Hardware statistics retrieval hurts in tight invocation loops. Avoid extraneous write and enforce strict ordering of writes targeted to the tally counters dump area address registers. Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Tested-by: Oliver Freyermuth <o.freyermuth@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25Merge branch 'use-tc_cls_can_offload_and_chain0-throughout-the-drivers'David S. Miller
Jakub Kicinski says: ==================== use tc_cls_can_offload_and_chain0() throughout the drivers This set makes all drivers use a new tc_cls_can_offload_and_chain0() helper which will set extack in case TC hw offload flag is disabled. I chose to keep the new helper which also looks at the chain but renamed it more appropriately. The rationale being that most drivers don't accept chains other than 0 and since we have to pass extack to the helper we can as well pass the entire struct tc_cls_common_offload and perform the most common checks. This code makes the assumption that type_data in the callback can be interpreted as struct tc_cls_common_offload, i.e. the real offload structure has common part as the first member. This allows us to make the check once for all classifier types if driver supports more than one. v1: - drop the type validation in nfp and netdevsim. v2: - reorder checks in patch 1; - split other changes from patch 1; - add the i40e patch in; - add one more test case - for chain 0 extack. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25selftests/bpf: check for chain-non-0 extack messageJakub Kicinski
Make sure netdevsim doesn't allow offload of chains other than 0, and that it reports the expected extack message. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25selftests/bpf: check for spurious extacks from the driverJakub Kicinski
Drivers should not report errors when offload is not forced. Check stdout and stderr for familiar messages when with no skip flags and with skip_hw. Check for add, replace, and destroy. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25mlxsw: use tc_cls_can_offload_and_chain0()Jakub Kicinski
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case ethtool tc offload flag is not set or chain unsupported. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25i40e: use tc_cls_can_offload_and_chain0()Jakub Kicinski
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case ethtool tc offload flag is not set or chain unsupported. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25ixgbe: use tc_cls_can_offload_and_chain0()Jakub Kicinski
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case ethtool tc offload flag is not set or chain unsupported. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25bnxt: use tc_cls_can_offload_and_chain0()Jakub Kicinski
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case ethtool tc offload flag is not set or chain unsupported. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25mlx5: use tc_cls_can_offload_and_chain0()Jakub Kicinski
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case ethtool tc offload flag is not set or chain unsupported. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25cxgb4: use tc_cls_can_offload_and_chain0()Jakub Kicinski
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case ethtool tc offload flag is not set or chain unsupported. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25nfp: use tc_cls_can_offload_and_chain0()Jakub Kicinski
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case ethtool tc offload flag is not set or chain unsupported. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25netdevsim: use tc_cls_can_offload_and_chain0()Jakub Kicinski
Make use of tc_cls_can_offload_and_chain0() to set extack msg in case ethtool tc offload flag is not set or chain unsupported. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25pkt_cls: add new tc cls helper to check offload flag and chain indexJakub Kicinski
Very few (mlxsw) upstream drivers seem to allow offload of chains other than 0. Save driver developers typing and add a helper for checking both if ethtool's TC offload flag is on and if chain is 0. This helper will set the extack appropriately in both error cases. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25bpf: Use the IS_FD_ARRAY() macro in map_update_elem()Mickaël Salaün
Make the code more readable. Signed-off-by: Mickaël Salaün <mic@digikod.net> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: "The main item is that we try to better handle the newer trackpoints on Lenovo devices that are now being produced by Elan/ALPS/NXP and only implement a small subset of the original IBM trackpoint controls" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Revert "Input: synaptics_rmi4 - use devm_device_add_group() for attributes in F01" Input: trackpoint - only expose supported controls for Elan, ALPS and NXP Input: trackpoint - force 3 buttons if 0 button is reported Input: xpad - add support for PDP Xbox One controllers Input: stmfts,s6sy671 - add SPDX identifier
2018-01-25orangefs: fix deadlock; do not write i_size in read_iterMartin Brandenburg
After do_readv_writev, the inode cache is invalidated anyway, so i_size will never be read. It will be fetched from the server which will also know about updates from other machines. Fixes deadlock on 32-bit SMP. See https://marc.info/?l=linux-fsdevel&m=151268557427760&w=2 Signed-off-by: Martin Brandenburg <martin@omnibond.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Mike Marshall <hubcap@omnibond.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-25Merge branch 'bpf-more-sock_ops-callbacks'Alexei Starovoitov
Lawrence Brakmo says: ==================== This patchset adds support for: - direct R or R/W access to many tcp_sock fields - passing up to 4 arguments to sock_ops BPF functions - tcp_sock field bpf_sock_ops_cb_flags for controlling callbacks - optionally calling sock_ops BPF program when RTO fires - optionally calling sock_ops BPF program when packet is retransmitted - optionally calling sock_ops BPF program when TCP state changes - access to tclass and sk_txhash - new selftest v2: Fixed commit message 0/11. The commit is to "bpf-next" but the patch below used "bpf" and Patchwork didn't work correctly. v3: Cleaned RTO callback as per Yuchung's comment Added BPF enum for TCP states as per Alexei's comment v4: Fixed compile warnings related to detecting changes between TCP internal states and the BPF defined states. v5: Fixed comment issues in some selftest files Fixed accesss issue with u64 fields in bpf_sock_ops struct v6: Made fixes based on comments form Eric Dumazet: The field bpf_sock_ops_cb_flags was addded in a hole on 64bit kernels Field bpf_sock_ops_cb_flags is now set through a helper function which returns an error when a BPF program tries to set bits for callbacks that are not supported in the current kernel. Added a comment indicating that when adding fields to bpf_sock_ops_kern they should be added before the field named "temp" if they need to be cleared before calling the BPF function. v7: Enfornced fields "op" and "replylong[1] .. replylong[3]" not be writable based on comments form Eric Dumazet and Alexei Starovoitov. Filled 32 bit hole in bpf_sock_ops struct with sk_txhash based on comments from Daniel Borkmann. Removed unused functions (tcp_call_bpf_1arg, tcp_call_bpf_4arg) based on comments from Daniel Borkmann. v8: Add commit message 00/12 Add Acked-by as appropriate v9: Moved the bug fix to the front of the patchset Changed RETRANS_CB so it is always called (before it was only called if the retransmit succeeded). It is now called with an extra argument, the return value of tcp_transmit_skb (0 => success). Based on comments from Yuchung Cheng. Added support for reading 2 new fields, sacked_out and lost_out, based on comments from Yuchung Cheng. v10: Moved the callback flags from include/uapi/linux/tcp.h to include/uapi/linux/bpf.h Cleaned up the test in selftest. Added a timeout so it always completes, even if the client is not communicating with the server. Made it faster by removing the sleeps. Made sure it works even when called back-to-back 20 times. Consists of the following patches: [PATCH bpf-next v10 01/12] bpf: Only reply field should be writeable [PATCH bpf-next v10 02/12] bpf: Make SOCK_OPS_GET_TCP size [PATCH bpf-next v10 03/12] bpf: Make SOCK_OPS_GET_TCP struct [PATCH bpf-next v10 04/12] bpf: Add write access to tcp_sock and sock [PATCH bpf-next v10 05/12] bpf: Support passing args to sock_ops bpf [PATCH bpf-next v10 06/12] bpf: Adds field bpf_sock_ops_cb_flags to [PATCH bpf-next v10 07/12] bpf: Add sock_ops RTO callback [PATCH bpf-next v10 08/12] bpf: Add support for reading sk_state and [PATCH bpf-next v10 09/12] bpf: Add sock_ops R/W access to tclass [PATCH bpf-next v10 10/12] bpf: Add BPF_SOCK_OPS_RETRANS_CB [PATCH bpf-next v10 11/12] bpf: Add BPF_SOCK_OPS_STATE_CB [PATCH bpf-next v10 12/12] bpf: add selftest for tcpbpf ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25bpf: add selftest for tcpbpfLawrence Brakmo
Added a selftest for tcpbpf (sock_ops) that checks that the appropriate callbacks occured and that it can access tcp_sock fields and that their values are correct. Run with command: ./test_tcpbpf_user Adding the flag "-d" will show why it did not pass. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25bpf: Add BPF_SOCK_OPS_STATE_CBLawrence Brakmo
Adds support for calling sock_ops BPF program when there is a TCP state change. Two arguments are used; one for the old state and another for the new state. There is a new enum in include/uapi/linux/bpf.h that exports the TCP states that prepends BPF_ to the current TCP state names. If it is ever necessary to change the internal TCP state values (other than adding more to the end), then it will become necessary to convert from the internal TCP state value to the BPF value before calling the BPF sock_ops function. There are a set of compile checks added in tcp.c to detect if the internal and BPF values differ so we can make the necessary fixes. New op: BPF_SOCK_OPS_STATE_CB. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25bpf: Add BPF_SOCK_OPS_RETRANS_CBLawrence Brakmo
Adds support for calling sock_ops BPF program when there is a retransmission. Three arguments are used; one for the sequence number, another for the number of segments retransmitted, and the last one for the return value of tcp_transmit_skb (0 => success). Does not include syn-ack retransmissions. New op: BPF_SOCK_OPS_RETRANS_CB. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25bpf: Add sock_ops R/W access to tclassLawrence Brakmo
Adds direct write access to sk_txhash and access to tclass for ipv6 flows through getsockopt and setsockopt. Sample usage for tclass: bpf_getsockopt(skops, SOL_IPV6, IPV6_TCLASS, &v, sizeof(v)) where skops is a pointer to the ctx (struct bpf_sock_ops). Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25bpf: Add support for reading sk_state and moreLawrence Brakmo
Add support for reading many more tcp_sock fields state, same as sk->sk_state rtt_min same as sk->rtt_min.s[0].v (current rtt_min) snd_ssthresh rcv_nxt snd_nxt snd_una mss_cache ecn_flags rate_delivered rate_interval_us packets_out retrans_out total_retrans segs_in data_segs_in segs_out data_segs_out lost_out sacked_out sk_txhash bytes_received (__u64) bytes_acked (__u64) Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25bpf: Add sock_ops RTO callbackLawrence Brakmo
Adds an optional call to sock_ops BPF program based on whether the BPF_SOCK_OPS_RTO_CB_FLAG is set in bpf_sock_ops_flags. The BPF program is passed 2 arguments: icsk_retransmits and whether the RTO has expired. Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-25bpf: Adds field bpf_sock_ops_cb_flags to tcp_sockLawrence Brakmo
Adds field bpf_sock_ops_cb_flags to tcp_sock and bpf_sock_ops. Its primary use is to determine if there should be calls to sock_ops bpf program at various points in the TCP code. The field is initialized to zero, disabling the calls. A sock_ops BPF program can set it, per connection and as necessary, when the connection is established. It also adds support for reading and writting the field within a sock_ops BPF program. Reading is done by accessing the field directly. However, writing is done through the helper function bpf_sock_ops_cb_flags_set, in order to return an error if a BPF program is trying to set a callback that is not supported in the current kernel (i.e. running an older kernel). The helper function returns 0 if it was able to set all of the bits set in the argument, a positive number containing the bits that could not be set, or -EINVAL if the socket is not a full TCP socket. Examples of where one could call the bpf program: 1) When RTO fires 2) When a packet is retransmitted 3) When the connection terminates 4) When a packet is sent 5) When a packet is received Signed-off-by: Lawrence Brakmo <brakmo@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>