linux.git - Linus' kernel tree

Age	Commit message	Author
2017-12-15	Merge branch 'sctp-stream-interleave'	David S. Miller
	Xin Long says: ==================== sctp: Implement Stream Interleave: Interaction with Other SCTP Extensions Stream Interleave would be implemented in two Parts: 1. The I-DATA Chunk Supporting User Message Interleaving 2. Interaction with Other SCTP Extensions Overview in section 2.3 of RFC8260 for Part 2: The usage of the I-DATA chunk might interfere with other SCTP extensions. Future SCTP extensions MUST describe if and how they interfere with the usage of I-DATA chunks. For the SCTP extensions already defined when this document was published, the details are given in the following subsections. As the 2nd part of Stream Interleave Implementation, this patchset mostly adds the support for SCTP Partial Reliability Extension with I-FORWARD-TSN chunk. Then adjusts stream scheduler and stream reconfig to make them work properly with I-DATA chunks. In the last patch, all stream interleave codes will be enabled by adding sysctl to allow users to use this feature. v1 -> v2: - removed the intl_enable check from sctp_chunk_event_lookup, as Marcelo's suggestion. - fixed a typo in changelog. ==================== Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	sctp: support sysctl to allow users to use stream interleave	Xin Long
	This is the last patch for support of stream interleave. After this patch, users can enable stream interleave with sysctl -w net.sctp.intl_enable=1. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo R. Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	sctp: update mid instead of ssn when doing stream and asoc reset	Xin Long
	When using idata and doing stream and asoc reset, setting ssn to 0 would only clear the first 16 bits of mid. So to make this work for both data and idata, it sets mid to 0 instead of ssn, and mid_uo for unordered idata also needs to be cleared, as described in section 2.3.2 of RFC8260. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo R. Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
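The effect described in this commit can be reproduced with a small model. The sketch below assumes that the 16-bit ssn and the 32-bit mid share storage in a union; the struct name `stream_seq` and both helper functions are hypothetical and only illustrate why clearing ssn is not enough.

```c
#include <stdint.h>

/* Simplified, hypothetical model of per-stream sequencing state.
 * Assumption: ssn (DATA) and mid (I-DATA) overlap in a union. */
struct stream_seq {
	union {
		uint32_t mid;	/* message identifier, used by I-DATA */
		uint16_t ssn;	/* stream sequence number, used by DATA */
	};
	uint32_t mid_uo;	/* message identifier for unordered I-DATA */
};

/* Resetting through ssn writes only 16 of the 32 bits of mid. */
static void reset_via_ssn(struct stream_seq *s)
{
	s->ssn = 0;
}

/* Resetting through mid clears the counter for both chunk types;
 * the unordered counter is cleared alongside it. */
static void reset_via_mid(struct stream_seq *s)
{
	s->mid = 0;
	s->mid_uo = 0;
}
```

Which half of mid survives a reset through ssn depends on the byte order of the machine, but on either byte order the counter is left non-zero.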
2017-12-15	sctp: add stream interleave support in stream scheduler	Xin Long
	As Marcelo said in the stream scheduler patch: Support for I-DATA chunks, also described in RFC8260, with user message interleaving is straightforward as it just requires the schedulers to probe for the feature and ignore datamsg boundaries when dequeueing. All this patch needs to do is ignore datamsg boundaries when dequeueing. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo R. Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	sctp: implement handle_ftsn for sctp_stream_interleave	Xin Long
	handle_ftsn is added as a member of sctp_stream_interleave, used to skip ssn for data or mid for idata, called for SCTP_CMD_PROCESS_FWDTSN cmd. sctp_handle_iftsn works for ifwdtsn, and sctp_handle_fwdtsn works for fwdtsn. Note that different from sctp_handle_fwdtsn, sctp_handle_iftsn could do stream abort pd. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo R. Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	sctp: implement report_ftsn for sctp_stream_interleave	Xin Long
	report_ftsn is added as a member of sctp_stream_interleave, used to skip tsn from tsnmap, remove old events from reasm or lobby queue, and abort pd for data or idata, called for SCTP_CMD_REPORT_FWDTSN cmd and asoc reset. sctp_report_iftsn works for ifwdtsn, and sctp_report_fwdtsn works for fwdtsn. Note that sctp_report_iftsn doesn't do asoc abort_pd, as stream abort_pd will be done when handling ifwdtsn. But when ftsn is equal with ftsn, which means asoc reset, asoc abort_pd has to be done. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo R. Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	sctp: implement validate_ftsn for sctp_stream_interleave	Xin Long
	validate_ftsn is added as a member of sctp_stream_interleave, used to validate ssn/chunk type for fwdtsn or mid (message id)/chunk type for ifwdtsn, called in sctp_sf_eat_fwd_tsn, just as validate_data. If this check fails, an abort packet will be sent, as described in section 2.3.1 of RFC8260. As ifwdtsn and fwdtsn chunks have different lengths, it also defines ftsn_chunk_len for sctp_stream_interleave to describe the chunk size. Then it replaces all sizeof(struct sctp_fwdtsn_chunk) with sctp_ftsnchk_len. It also adds the processing for ifwdtsn in the rx path. As Marcelo pointed out, there's no need to add an event table for ifwdtsn, but just share prsctp_chunk_event_table with fwdtsn's. It would drop fwdtsn chunk for ifwdtsn and drop ifwdtsn chunk for fwdtsn by calling validate_ftsn in sctp_sf_eat_fwd_tsn. After this patch, the ifwdtsn can be accepted. Note that this patch also removes the sctp.intl_enable check for idata chunks in sctp_chunk_event_lookup, as it will do this check in validate_data later. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo R. Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
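The chunk-type half of this check can be sketched as a pair of operation tables, one for plain DATA associations and one for I-DATA associations. All names below (`ftsn_ops`, `ftsn_ops_for`, the enum values) are hypothetical, and the sketch omits the ssn/mid validation that the commit also performs.

```c
#include <stdbool.h>

/* Hypothetical chunk types for the two FORWARD-TSN variants. */
enum ftsn_chunk_type { FTSN_FWD_TSN, FTSN_I_FWD_TSN };

/* One callback table per mode, selected once per association. */
struct ftsn_ops {
	bool (*validate_ftsn)(enum ftsn_chunk_type type);
};

/* Without interleaving, only FORWARD-TSN is acceptable. */
static bool validate_fwdtsn(enum ftsn_chunk_type type)
{
	return type == FTSN_FWD_TSN;
}

/* With interleaving, only I-FORWARD-TSN is acceptable. */
static bool validate_ifwdtsn(enum ftsn_chunk_type type)
{
	return type == FTSN_I_FWD_TSN;
}

static const struct ftsn_ops ops_data = { .validate_ftsn = validate_fwdtsn };
static const struct ftsn_ops ops_idata = { .validate_ftsn = validate_ifwdtsn };

/* Pick the table according to whether stream interleave is in use. */
static const struct ftsn_ops *ftsn_ops_for(bool intl_enable)
{
	return intl_enable ? &ops_idata : &ops_data;
}
```

Selecting the table once keeps the receive path free of per-chunk mode checks: the caller only invokes `validate_ftsn` and drops the chunk when it returns false.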
2017-12-15	sctp: implement generate_ftsn for sctp_stream_interleave	Xin Long
	generate_ftsn is added as a member of sctp_stream_interleave, used to create fwdtsn or ifwdtsn chunk according to abandoned chunks, called in sctp_retransmit and sctp_outq_sack. sctp_generate_iftsn works for ifwdtsn, and sctp_generate_fwdtsn is still used for making fwdtsn. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo R. Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	sctp: add basic structures and make chunk function for ifwdtsn	Xin Long
	sctp_ifwdtsn_skip, sctp_ifwdtsn_hdr and sctp_ifwdtsn_chunk are used to define and parse I-FWD TSN chunk format, and sctp_make_ifwdtsn is a function to build the chunk. The I-FORWARD-TSN Chunk Format is defined in section 2.3.1 of RFC8260. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Marcelo R. Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
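For orientation, the per-stream skip entries of the two chunk formats can be modelled as follows. This is a simplified sketch based on the wire formats in RFC 3758 (FORWARD-TSN) and RFC 8260 section 2.3.1 (I-FORWARD-TSN); the struct and helper names are hypothetical, and the multi-byte fields are in network byte order on the wire.

```c
#include <stdint.h>

/* FORWARD-TSN skip entry: stream identifier plus 16-bit SSN. */
struct fwdtsn_skip {
	uint16_t stream;
	uint16_t ssn;
};

/* I-FORWARD-TSN skip entry: stream identifier, 15 reserved bits
 * followed by the U bit, and a 32-bit message identifier. */
struct ifwdtsn_skip {
	uint16_t stream;
	uint8_t  reserved;
	uint8_t  flags;		/* bit 0 is the U (unordered) bit */
	uint32_t mid;
};

#define IFWDTSN_U_BIT 0x01	/* hypothetical name for the U bit mask */

/* Return non-zero when the entry refers to unordered messages. */
static int ifwdtsn_skip_is_unordered(const struct ifwdtsn_skip *skip)
{
	return skip->flags & IFWDTSN_U_BIT;
}
```

The I-FORWARD-TSN entry is twice the size of the FORWARD-TSN entry, so code that walks the skip list has to know which variant it is parsing.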
2017-12-15	net: phy: phylink: Handle NULL fwnode_handle	Florian Fainelli
	Unlike the various of_* routines to fetch properties, fwnode_* routines can have an early check against a NULL fwnode_handle reference which makes them return -EINVAL (see fwnode_call_int_op), thus making it virtually impossible to differentiate what type of error is going on. Have an early check in phylink_register_sfp() so we can keep proceeding with the initialization, there is not much we can do without a valid fwnode_handle except return early and treat this similarly to -ENOENT. Fixes: 8fa7b9b6af25 ("phylink: convert to fwnode") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Acked-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	qmi_wwan: set FLAG_SEND_ZLP to avoid network initiated disconnect	Bjørn Mork
	It has been reported that the dummy byte we add to avoid ZLPs can be forwarded by the modem to the PGW/GGSN, and that some operators will drop the connection if this happens. In theory, QMI devices are based on CDC ECM and should as such both support ZLPs and silently ignore the dummy byte. The latter assumption failed. Let's test out the first. Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	net: usb: qmi_wwan: add Telit ME910 PID 0x1101 support	Daniele Palmas
	This patch adds support for Telit ME910 PID 0x1101. Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	Merge branch 'net-sched-Make-qdisc-offload-uapi-uniform'	David S. Miller
	Yuval Mintz says: ==================== net: sched: Make qdisc offload uapi uniform Several qdiscs can already be offloaded to hardware, but there's an inconsistency in regard to the uapi through which they indicate such an offload is taking place - indication is passed to the user via TCA_OPTIONS where each qdisc retains private logic for setting it. The recent addition of offloading to RED in 602f3baf2218 ("net_sch: red: Add offload ability to RED qdisc") caused the addition of yet another uapi field for this purpose - TC_RED_OFFLOADED. For clarity and prevention of bloat in the uapi we want to eliminate said added uapi, replacing it with a common mechanism that can be used to reflect offload status of the various qdiscs. The first patch introduces TCA_HW_OFFLOAD as the generic message meant for this purpose. The second changes the current RED implementation into setting the internal bits necessary for passing it, and the third removes TC_RED_OFFLOADED as it's no longer needed. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	pkt_sched: Remove TC_RED_OFFLOADED from uapi	Yuval Mintz
	Following the previous patch, RED is now using the new uniform uapi for indicating it's offloaded. As a result, TC_RED_OFFLOADED is no longer utilized by kernel and can be removed [as it's still not part of any stable release]. Fixes: 602f3baf2218 ("net_sch: red: Add offload ability to RED qdisc") Signed-off-by: Yuval Mintz <yuvalm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	net: sched: Move to new offload indication in RED	Yuval Mintz
	Let RED utilize the new internal flag, TCQ_F_OFFLOADED, to mark a given qdisc as offloaded instead of using a dedicated indication. Also, change internal logic into looking at said flag when possible. Fixes: 602f3baf2218 ("net_sch: red: Add offload ability to RED qdisc") Signed-off-by: Yuval Mintz <yuvalm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	net: sched: Add TCA_HW_OFFLOAD	Yuval Mintz
	Qdiscs can be offloaded to HW, but current implementation isn't uniform. Instead, qdiscs either pass information about offload status via their TCA_OPTIONS or omit it altogether. Introduce a new attribute - TCA_HW_OFFLOAD that would form a uniform uAPI for the offloading status of qdiscs. Signed-off-by: Yuval Mintz <yuvalm@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	net: alteon: acenic: clean up indentation issue	Colin Ian King
	There is a hunk of code that is incorrectly indented with spaces rather than a tab. Clean this up. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	Merge branch 'sfp-SFF-module-support'	David S. Miller
	Russell King says: ==================== Add SFF module support Add support for SFF modules. SFF modules are similar to SFP modules, but they have fewer control signals, and are soldered down rather than pluggable. They also have different IDs in the EEPROM to identify as soldered down SFF modules. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	sfp: add sff module support	Russell King
	Add support for SFF modules, which are soldered down SFP modules. These have a different phys_id value, and also have the present and rate select signals omitted compared with their socketed counter-parts. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	dt-bindings: add sff,sff binding for SFP support	Russell King
	Add "sff,sff" for SFF module support with SFP. These have a different phys_id value, and also have the present and rate select signals omitted compared with their socketed counter-parts. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	Merge branch 'nfp-fix-rtsym-and-XPB-register-handling-in-debug-dump'	David S. Miller
	Simon Horman says: ==================== nfp: fix rtsym and XPB register handling in debug dump This series resolves two problems in the recently added debug dump facility. * Correctly handle reading absolute rtsyms * Correctly handle special-case XPB register reads These fixes are for code only present in net-next. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	nfp: fix XPB register reads in debug dump	Carl Heymann
	For XPB registers reads, some island IDs require special handling (e.g. ARM island), which is already taken care of in nfp_xpb_readl(), so use that instead of a straight CPP read. Without this fix all "xpbm:ArmIsldXpbmMap.*" registers are reported as 0xffffffff. It has also been observed to cause a system reboot. With this fix correct values are reported, none of which are 0xffffffff. The values may be read using ethtool debug level 2. # ethtool -W <netdev> 2 # ethtool -w <netdev> data dump.dat Fixes: 0e6c4955e149 ("nfp: dump CPP, XPB and direct ME CSRs") Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	nfp: fix absolute rtsym handling in debug dump	Carl Heymann
	In TLV-based ethtool debug dumps, don't do a CPP read for absolute rtsyms, use the addr field in the symbol table directly as the value. Without this fix rtsym gro_release_ring_0 is 4 bytes of zeros. With this fix the correct value, 0x0000004a 0x00000000 is reported. The values may be read using ethtool debug level 2. # ethtool -W <netdev> 2 # ethtool -w <netdev> data dump.dat Fixes: e1e798e3fd93 ("nfp: dump rtsyms") Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	Merge branch 'aquantia-fixes'	David S. Miller
	Igor Russkikh says: ==================== net: aquantia: Atlantic driver 12/2017 updates The patchset contains an important hardware fix for machines with large MRRS and a couple of improvements in stats and capabilities reporting. patch v3: - Fixed patch #7 after Andrew's finding. NIC level stats actually have to be cleaned only on hw struct creation (and this is done in kzalloc). On each hwinit we only have to reset link state to make sure hw stats update will not increment nic stats during init. patch v2: - split into more detailed commits. Comment from David on wrong defines case will be submitted separately later. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	net: aquantia: Increment driver version	Igor Russkikh
	Add a suffix to distinguish kernel mainline version and aquantia releases Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	net: aquantia: Fix typo in ethtool statistics names	Igor Russkikh
	Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	net: aquantia: Update hw counters on hw init	Igor Russkikh
	On very first start we should read out current HW counter values to make diff based calculations later. This also should be done each time NIC gets down/up or wakes up after sleep state. We reset link state explicitly to prevent diffs from being summed this first time. Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	net: aquantia: Improve link state and statistics check interval callback	Igor Russkikh
	Reduce timeout from 2 secs to 1 sec. If link is down, reduce it to 500msec. This speeds up link detection. Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	net: aquantia: Fill in multicast counter in ndev stats from hardware	Igor Russkikh
	This metric comes from HW and is also diff-calculated, like other counters Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	net: aquantia: Fill ndev stat couters from hardware	Igor Russkikh
	Originally they were filled from ring sw counters. These sometimes incorrectly calculate byte and packet amounts when using LRO/LSO and jumboframes. Filling ndev counters from hardware makes them precise. Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	net: aquantia: Extend stat counters to 64bit values	Igor Russkikh
	Device hardware provides only 32bit counters. Using these directly causes byte counters to overflow soon. A separate nic level structure with 64 bit counters is now used to collect incrementally all the stats and report these counters to ethtool stats and ndev stats. Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
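The diff-based accumulation described in these statistics commits can be sketched as follows. The struct and function names are hypothetical and not taken from the driver; the sketch assumes that the hardware counter wraps at most once between two readings.

```c
#include <stdint.h>

/* Hypothetical software view of one 32-bit hardware counter. */
struct hw_counter {
	uint32_t last_raw;	/* last value read from the hardware */
	uint64_t total;		/* running 64-bit total reported to users */
};

/* Record a baseline without adding to the total, as is needed on
 * the very first read or after the NIC comes back up. */
static void hw_counter_rebase(struct hw_counter *c, uint32_t raw)
{
	c->last_raw = raw;
}

/* Fold a new reading into the total. The unsigned 32-bit
 * subtraction yields the right delta even across a wrap-around. */
static void hw_counter_update(struct hw_counter *c, uint32_t raw)
{
	c->total += (uint32_t)(raw - c->last_raw);
	c->last_raw = raw;
}
```

Calling `hw_counter_rebase` after a hardware init mirrors the explicit link-state reset in the series: it keeps the first difference after a restart from being summed into the totals.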
2017-12-15	net: aquantia: Fix hardware DMA stream overload on large MRRS	Igor Russkikh
	On systems with a large MRRS on the device (2K, 4K), with high data rates and/or large MTU, atlantic observes DMA packet buffer overflow. On some systems that causes PCIe transaction errors, hardware NMIs or datapath freeze. This patch 1) limits MRRS from the device side to 2K (that's the maximum our hardware supports) and 2) limits the maximum size of outstanding TX DMA data read requests. This keeps the hardware buffers running fine. Signed-off-by: Pavel Belous <pavel.belous@aquantia.com> Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
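As an illustration of the first measure, the sketch below clamps the Max_Read_Request_Size field of a PCIe Device Control register value to 2K. It relies only on the standard PCIe encoding (bits 14:12, size = 128 << value); the macro and function names are hypothetical, and this is not how the driver itself applies the limit.

```c
#include <stdint.h>

/* Max_Read_Request_Size occupies bits 14:12 of Device Control. */
#define MRRS_SHIFT	12
#define MRRS_MASK	(0x7u << MRRS_SHIFT)
#define MRRS_ENC_2K	4u	/* 128 << 4 == 2048 bytes */

/* Decode the MRRS field into a size in bytes. */
static unsigned int mrrs_bytes(uint16_t devctl)
{
	return 128u << ((devctl & MRRS_MASK) >> MRRS_SHIFT);
}

/* Return devctl with MRRS lowered to 2K when it is larger;
 * all other bits are left untouched. */
static uint16_t limit_mrrs_2k(uint16_t devctl)
{
	if (mrrs_bytes(devctl) > 2048)
		devctl = (uint16_t)((devctl & ~MRRS_MASK) |
				    (MRRS_ENC_2K << MRRS_SHIFT));
	return devctl;
}
```

A register value that already requests 2K or less passes through unchanged, so the clamp is safe to apply unconditionally.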
2017-12-15	net: aquantia: Fix actual speed capabilities reporting	Igor Russkikh
	Different hardware device Ids correspond to different maximum speed available. Extra checks were added for devices D108 and D109 to remove unsupported speeds from these device capabilities list. Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	Merge branch 'erspan-version-2'	David S. Miller
	William Tu says: ==================== ERSPAN version 2 (type III) support
	ERSPAN has two versions, v1 (type II) and v2 (type III). This patch series adds support for erspan v2 based on the existing erspan v1 implementation. The first patch refactors the existing erspan v1 header structure, making it extensible enough to hold the additional v2 header. The second and third patches introduce the erspan v2 implementation for ipv4 and ipv6 erspan, for both native mode and collect metadata mode. Finally, test cases are added under samples/bpf.
	Note: ERSPAN version 2 has many features and this patch series does not implement all of them. One major use case of version 2 over version 1 is its timestamp and direction, so the traffic collector is able to distinguish the mirrored traffic better. Other features such as SGT (security group tag), FT (frame type) for carrying non-ethernet packets, and the optional subheader are not implemented yet.
	Example commandline for ERSPAN version 2:
		ip link add dev ip6erspan11 type ip6erspan seq key 102 \
			local fc00:100::2 remote fc00:100::1 \
			erspan_ver 2 erspan_dir 1 erspan_hwid 17
	The corresponding iproute2 patch: https://marc.info/?l=linux-netdev&m=151321141525106&w=2
	William Tu (4):
		net: erspan: refactor existing erspan code
		net: erspan: introduce erspan v2 for ip_gre
		ip6_gre: add erspan v2 support
		samples/bpf: add erspan v2 sample code
		include/net/erspan.h | 152 ++++++++++++++++++++++++++++++++++++++---
		include/net/ip6_tunnel.h | 3 +
		include/net/ip_tunnels.h | 5 +-
		include/uapi/linux/if_ether.h | 1 +
		include/uapi/linux/if_tunnel.h | 3 +
		net/ipv4/ip_gre.c | 124 +++++++++++++++++++++++++++------
		net/ipv6/ip6_gre.c | 139 +++++++++++++++++++++++++++++++------
		net/openvswitch/flow_netlink.c | 8 +--
		samples/bpf/tcbpf2_kern.c | 77 ++++++++++++++++++---
		samples/bpf/test_tunnel_bpf.sh | 38 ++++++++---
		10 files changed, 472 insertions(+), 78 deletions(-)
	--
	A simple script to test it:
		set -ex
		function cleanup() {
			set +ex
			ip netns del ns0
			ip link del ip6erspan11
			ip link del veth1
		}
		function main() {
			trap cleanup 0 2 3 9
			ip netns add ns0
			ip link add veth0 type veth peer name veth1
			ip link set veth0 netns ns0
			# non-namespace
			ip addr add dev veth1 fc00:100::2/96
			if [ "$1" == "v1" ]; then
				echo "create IP6 ERSPAN v1 tunnel"
				ip link add dev ip6erspan11 type ip6erspan seq key 102 \
					local fc00:100::2 remote fc00:100::1 \
					erspan 123 erspan_ver 1
			else
				echo "create IP6 ERSPAN v2 tunnel"
				ip link add dev ip6erspan11 type ip6erspan seq key 102 \
					local fc00:100::2 remote fc00:100::1 \
					erspan_ver 2 erspan_dir 1 erspan_hwid 17
			fi
			ip addr add dev ip6erspan11 fc00:200::2/96
			ip addr add dev ip6erspan11 10.10.200.2/24
			# namespace: ns0
			ip netns exec ns0 ip addr add fc00:100::1/96 dev veth0
			if [ "$1" == "v1" ]; then
				ip netns exec ns0 \
				ip link add dev ip6erspan00 type ip6erspan seq key 102 \
					local fc00:100::1 remote fc00:100::2 \
					erspan 123 erspan_ver 1
			else
				ip netns exec ns0 \
				ip link add dev ip6erspan00 type ip6erspan seq key 102 \
					local fc00:100::1 remote fc00:100::2 \
					erspan_ver 2 erspan_dir 1 erspan_hwid 7
			fi
			ip netns exec ns0 ip addr add dev ip6erspan00 fc00:200::1/96
			ip netns exec ns0 ip addr add dev ip6erspan00 10.10.200.1/24
			ip link set dev veth1 up
			ip link set dev ip6erspan11 up
			ip netns exec ns0 ip link set dev ip6erspan00 up
			ip netns exec ns0 ip link set dev veth0 up
		}
		main $1
		ping6 -c 1 fc00:100::1 || true
		ping -c 3 10.10.200.1
		exit 0
	==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	samples/bpf: add erspan v2 sample code	William Tu
	Extend the existing tests for ipv4 ipv6 erspan version 2. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	ip6_gre: add erspan v2 support	William Tu
	Similar to support for ipv4 erspan, this patch adds erspan v2 to ip6erspan tunnel. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	net: erspan: introduce erspan v2 for ip_gre	William Tu
	The patch adds support for erspan version 2. Not all features are supported in this patch. The SGT (security group tag), GRA (timestamp granularity), FT (frame type) are set to fixed value. Only hardware ID and direction are configurable. Optional subheader is also not supported. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	net: erspan: refactor existing erspan code	William Tu
	The patch refactors the existing erspan implementation in order to support erspan version 2, which has additional metadata. So, instead of having one 'struct erspanhdr' holding erspan version 1, it breaks it into 'struct erspan_base_hdr' and 'struct erspan_metadata'. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	Merge branch 'nfp-ethtool-flash-updates'	David S. Miller
	Jakub Kicinski says: ==================== nfp: ethtool flash updates Dirk says: This series adds the ability to update the control FW with ethtool. It should be noted that the locking scheme here is to release the RTNL lock before the flashing operation and to take it again afterwards to ensure consistent state from the core code point of view. In this time, we take a reference to the device to prevent the device being freed while it's being flashed. This provides protection for the device being flashed while at the same time not holding up any networking related functions which would otherwise be locked out due to RTNL being held. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	nfp: implement firmware flashing	Dirk van der Merwe
	Firmware flashing takes around 60s (specified to not take more than 70s). Prevent hogging the RTNL lock in this time and make use of the longer timeout for the NSP command. The timeout is set to 2.5 * 70 seconds. We only allow flashing the firmware from reprs or PF netdevs. VFs do not have an app reference. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	nfp: extend NSP infrastructure for configurable timeouts	Dirk van der Merwe
	The firmware flashing NSP operation takes longer to execute than the current default timeout. We need a mechanism to set a longer timeout for some commands. This patch adds the infrastructure for this. The default timeout is still 30 seconds. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	Merge branch 'bpf-jit-fixes'	Alexei Starovoitov
	Daniel Borkmann says: ==================== Two fixes that deal with buggy usage of bpf_helper_changes_pkt_data() in the sense that they also reload cached skb data when there's no skb context but xdp one, for example. A fix where skb meta data is reloaded out of the wrong register on helper call, rest is test cases and making sure on verifier side that there's always the guarantee that ctx sits in r1. Thanks! ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-15	bpf: add test case for ld_abs and helper changing pkt data	Daniel Borkmann
	Add a test that i) uses LD_ABS, ii) zeroing R6 before call, iii) calls a helper that triggers reload of cached skb data, iv) uses LD_ABS again. It's added for test_bpf in order to do runtime testing after JITing as well as test_verifier to test that the sequence is allowed. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-15	bpf, sparc: fix usage of wrong reg for load_skb_regs after call	Daniel Borkmann
	When LD_ABS/IND is used in the program, and we have a BPF helper call that changes packet data (bpf_helper_changes_pkt_data() returns true), then in case of sparc JIT, we try to reload cached skb data from bpf2sparc[BPF_REG_6]. However, there is no such guarantee or assumption that skb sits in R6 at this point, all helpers changing skb data only have a guarantee that skb sits in R1. Therefore, store BPF R1 in L7 temporarily and after procedure call use L7 to reload cached skb data. skb sitting in R6 is only true at the time when LD_ABS/IND is executed. Fixes: 7a12b5031c6b ("sparc64: Add eBPF JIT.") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-15	bpf: guarantee r1 to be ctx in case of bpf_helper_changes_pkt_data	Daniel Borkmann
	Some JITs don't cache skb context on stack in prologue, so when LD_ABS/IND is used and helper calls yield bpf_helper_changes_pkt_data() as true, then they temporarily save/restore skb pointer. However, the assumption that skb always has to be in r1 is a bit of a gamble. Right now it turned out to be true for all helpers listed in bpf_helper_changes_pkt_data(), but let's enforce that from verifier side, so that we make this a guarantee and bail out if the func proto is misconfigured in future helpers. In case of BPF helper calls from cBPF, bpf_helper_changes_pkt_data() is completely irrelevant here (since cBPF is context read-only) and therefore always false. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-15	bpf, ppc64: do not reload skb pointers in non-skb context	Daniel Borkmann
	The assumption of unconditionally reloading skb pointers on BPF helper calls where bpf_helper_changes_pkt_data() holds true is wrong. There can be different contexts where the helper would enforce a reload such as in case of XDP. Here, we do have a struct xdp_buff instead of struct sk_buff as context, thus this will access garbage. JITs only ever need to deal with cached skb pointer reload when ld_abs/ind was seen, therefore guard the reload behind SEEN_SKB. Fixes: 156d0e290e96 ("powerpc/ebpf/jit: Implement JIT compiler for extended BPF") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Tested-by: Sandipan Das <sandipan@linux.vnet.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2017-12-15	bpf, s390x: do not reload skb pointers in non-skb context	Daniel Borkmann
	The assumption of unconditionally reloading skb pointers on BPF helper calls where bpf_helper_changes_pkt_data() holds true is wrong. There can be different contexts where the BPF helper would enforce a reload such as in case of XDP. Here, we do have a struct xdp_buff instead of struct sk_buff as context, thus this will access garbage. JITs only ever need to deal with cached skb pointer reload when ld_abs/ind was seen, therefore guard the reload behind SEEN_SKB only. Tested on s390x. Fixes: 9db7f2b81880 ("s390/bpf: recache skb->data/hlen for skb_vlan_push/pop") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Cc: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
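The guard introduced by the s390x and ppc64 fixes boils down to a two-part condition, sketched below. The flag value and the function name are hypothetical; in the real JITs the check sits inside the code that emits the helper call.

```c
#include <stdbool.h>

/* Hypothetical flag bit, set while translating a program whenever
 * an LD_ABS/LD_IND instruction has been seen. */
#define SEEN_SKB 0x1u

/* Decide whether the JIT must emit code that reloads the cached skb
 * pointers after a helper call. A helper that may change packet data
 * is not sufficient on its own: an XDP program has a struct xdp_buff
 * as context and has no cached skb pointers to reload. */
static bool must_reload_skb_cache(unsigned int seen_flags,
				  bool helper_changes_pkt_data)
{
	return helper_changes_pkt_data && (seen_flags & SEEN_SKB);
}
```

For an XDP program that calls a packet-modifying helper but never uses LD_ABS/LD_IND, the function returns false, which is the case the original unconditional reload got wrong.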
2017-12-15	Merge branch 'ipvlan-packet-scrub'	David S. Miller
	Mahesh Bandewar says: ==================== ipvlan: packet scrub While crossing namespace boundary IPvlan aggressively scrubs packets. This is creating problems. First thing is that scrubbing changes the packet type in skb meta-data to PACKET_HOST. This causes erroneous packet delivery when dev_forward_skb() has already marked the packet type as OTHER_HOST. On the egress side scrubbing just before calling dev_queue_xmit() creates another set of problems. Scrubbing removes skb->sk so the prio update gets missed and more seriously, socket back-pressure fails making TSQ not function correctly. The first patch in the series just reverts the earlier change which was adding a mac-check, but that is unnecessary if packet_type that dev_forward_skb() has set is honored. The second patch removes two of the scrubs which are causing problems described above. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	ipvlan: remove excessive packet scrubbing	Mahesh Bandewar
	IPvlan currently scrubs packets at every location where packets may be crossing namespace boundary. Though this is desirable, currently IPvlan does it more than necessary, e.g. packets that are going to take the dev_forward_skb() path will get scrubbed anyway, so there is no point in scrubbing them before forwarding. Another side-effect of scrubbing is that pkt-type gets set to PACKET_HOST, which overrides what has already been set by the earlier path, causing erroneous delivery of the packets. Also, scrubbing packets just before calling dev_queue_xmit() has detrimental effects since packets lose skb->sk and because of that miss prio updates, get incorrect socket back-pressure, and would even break TSQ. Fixes: b93dd49c1a35 ('ipvlan: Scrub skb before crossing the namespace boundary') Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-15	Revert "ipvlan: add L2 check for packets arriving via virtual devices"	Mahesh Bandewar
	This reverts commit 92ff42645028fa6f9b8aa767718457b9264316b4. Even though the check added is not that taxing, it's not really needed. First of all this will be per packet cost and second thing is that the eth_type_trans() already does this correctly. The excessive scrubbing in IPvlan was changing the pkt-type skb metadata of the packet which made it necessary to re-check the mac. The subsequent patch in this series removes the faulty packet-scrub. Signed-off-by: Mahesh Bandewar <maheshb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>