linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2020-03-30	mfd: sc27xx: Add USB charger type detection support	Baolin Wang
	The Spreadtrum SC27XX series PMICs supply the USB charger type detection function, and related registers are located on the PMIC global registers region, thus we implement and export this function in the MFD driver for users to get the USB charger type. Signed-off-by: Baolin Wang <baolin.wang7@gmail.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2020-03-30	dt-bindings: mfd: Document STM32 low power timer bindings	Benjamin Gaignard
	Add a subnode to STM low power timer bindings to support timer driver Signed-off-by: Benjamin Gaignard <benjamin.gaignard@st.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2020-03-30	mfd: rk808: Convert RK805 to shutdown/suspend hooks	Robin Murphy
	RK805 has the same kind of dual-role sleep/shutdown pin as RK809/RK817, so it makes little sense for the driver to have to have two completely different mechanisms to handle essentially the same thing. Move RK805 over to the shutdown/suspend flow to clean things up. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2020-03-30	mfd: rk808: Reduce shutdown duplication	Robin Murphy
	Rather than having 3 almost-identical functions plus the machinery to keep track of them, it's far simpler to just dynamically select the appropriate register field per variant. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2020-03-30	mfd: rk808: Stop using syscore ops	Robin Murphy
	Setting the SLEEP pin to its shutdown function for appropriate PMICs doesn't need to happen in single-CPU context, so there's really no point involving the syscore machinery. Hook it up to the standard driver model shutdown method instead. This also obviates the issue that the syscore ops weren't being unregistered on probe failure or module removal. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2020-03-30	mfd: rk808: Ensure suspend/resume hooks always work	Robin Murphy
	The RK809/RK817 suspend/resume hooks should not have to depend on whether this driver owns the pm_power_off hook, and thus the global rk808_i2c_client is set - indeed, the GPIO-based control is really only relevant when PSCI firmware is in charge of power rather than the kernel. As driver model callbacks, they have an appropriate device argument to hand, so can just always use that. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2020-03-30	mfd: rk808: Always use poweroff when requested	Soeren Moch
	With the device tree property "rockchip,system-power-controller" we explicitly request to use this PMIC to power off the system. So always register our poweroff function, even if some other handler (probably PSCI poweroff) was registered before. This does tend to reveal a warning on shutdown due to the Rockchip I2C driver not implementing an atomic transfer method, however since the write to DEV_OFF takes effect immediately the I2C completion interrupt is moot anyway, and as the very last thing written to the console it is only visible to users going out of their way to capture serial output. Signed-off-by: Soeren Moch <smoch@web.de> Reviewed-by: Heiko Stuebner <heiko@sntech.de> [ rm: note potential warning in commit message ] Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2020-03-30	mfd: omap: Remove useless cast for driver.name	Corentin Labbe
	device_driver name is const char pointer, so it not useful to cast xx_driver_name (which is already const char). Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2020-03-30	mfd: Kconfig: Fix some misspelling of the word functionality	Christophe JAILLET
	Fix several variations of typo around functionali{ty,es}. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2020-03-30	mfd: pm8xxx: Replace zero-length array with flexible-array member	Gustavo A. R. Silva
	The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2020-03-30	mfd: omap-usb-tll: Replace zero-length array with flexible-array member	Gustavo A. R. Silva
	The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2020-03-30	mfd: cpcap: Fix compile if MFD_CORE is not selected	Tony Lindgren
	If only cpcap mfd driver is selected we will get: ERROR: "devm_mfd_add_devices" [drivers/mfd/motorola-cpcap.ko] undefined! This is because Kconfig is missing select for MFD_CORE. Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2020-03-30	mfd: cros_ec: Check DT node for usbpd-notify add	Prashant Malani
	Add a check to ensure there is indeed an EC device tree entry before adding the cros-usbpd-notify device. This covers configs where both CONFIG_ACPI and CONFIG_OF are defined, but the EC device is defined using device tree and not in ACPI. Fixes: 4602dce0361e ("mfd: cros_ec: Add cros-usbpd-notify subdevice") Signed-off-by: Prashant Malani <pmalani@chromium.org> Tested-by: Enric Balletbo i Serra <enric.balletbo@collabora.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
2020-03-30	Merge branches 'ib-mfd-iio-input-5.7' and 'ib-mfd-iio-rtc-5.7' into ↵	Lee Jones
	ibs-for-mfd-merged
2020-03-30	iio: cros_ec: Use Hertz as unit for sampling frequency	Gwendal Grignou
	To be compliant with other sensors, set and get sensor sampling frequency in Hz, not mHz. Fixes: ae7b02ad2f32 ("iio: common: cros_ec_sensors: Expose cros_ec_sensors frequency range via iio sysfs") Signed-off-by: Gwendal Grignou <gwendal@chromium.org> Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
2020-03-30	Merge tag 'drm-intel-next-fixes-2020-03-27' of ↵	Dave Airlie
	git://anongit.freedesktop.org/drm/drm-intel into drm-next Fixes for instability on Baytrail and Haswell; Ice Lake RPS; Sandy Bridge RC6; and few others around GT hangchec/reset; livelock; and a null dereference. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200327081607.GA3082710@intel.com
2020-03-29	Merge branch 'ethtool-netlink-interface-part-4'	David S. Miller
	Michal Kubecek says: ==================== ethtool netlink interface, part 4 Implementation of more netlink request types: - coalescing (ethtool -c/-C, patches 2-4) - pause parameters (ethtool -a/-A, patches 5-7) - EEE settings (--show-eee / --set-eee, patches 8-10) - timestamping info (-T, patches 11-12) Patch 1 is a fix for netdev reference leak similar to commit 2f599ec422ad ("ethtool: fix reference leak in some _SET handlers") but fixing a code Changes in v3 - change "one-step-" Tx type names to "onestep-*", (patch 11, suggested by Richard Cochran - use "TSINFO" rather than "TIMESTAMP" for timestamping information constants and adjust symbol names (patch 12, suggested by Richard Cochran) Changes in v2: - fix compiler warning in net_hwtstamp_validate() (patch 11) - fix follow-up lines alignment (whitespace only, patches 3 and 8) which is only in net-next tree at the moment. ==================== Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	ethtool: provide timestamping information with TSINFO_GET request	Michal Kubecek
	Implement TSINFO_GET request to get timestamping information for a network device. This is traditionally available via ETHTOOL_GET_TS_INFO ioctl request. Move part of ethtool_get_ts_info() into common.c so that ioctl and netlink code use the same logic to get timestamping information from the device. v3: use "TSINFO" rather than "TIMESTAMP", suggested by Richard Cochran Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	ethtool: add timestamping related string sets	Michal Kubecek
	Add three string sets related to timestamping information: ETH_SS_SOF_TIMESTAMPING: SOF_TIMESTAMPING_* flags ETH_SS_TS_TX_TYPES: timestamping Tx types ETH_SS_TS_RX_FILTERS: timestamping Rx filters These will be used for TIMESTAMP_GET request. v2: avoid compiler warning ("enumeration value not handled in switch") in net_hwtstamp_validate() v3: omit dash in Tx type names ("one-step-" -> "onestep-"), suggested by Richard Cochran Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	ethtool: add EEE_NTF notification	Michal Kubecek
	Send ETHTOOL_MSG_EEE_NTF notification whenever EEE settings of a network device are modified using ETHTOOL_MSG_EEE_SET netlink message or ETHTOOL_SEEE ioctl request. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	ethtool: set EEE settings with EEE_SET request	Michal Kubecek
	Implement EEE_SET netlink request to set EEE settings of a network device. These are traditionally set with ETHTOOL_SEEE ioctl request. The netlink interface allows setting the EEE status for all link modes supported by kernel but only first 32 link modes can be set at the moment as only those are supported by the ethtool_ops callback. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	ethtool: provide EEE settings with EEE_GET request	Michal Kubecek
	Implement EEE_GET request to get EEE settings of a network device. These are traditionally available via ETHTOOL_GEEE ioctl request. The netlink interface allows reporting EEE status for all link modes supported by kernel but only first 32 link modes are provided at the moment as only those are reported by the ethtool_ops callback and drivers. v2: fix alignment (whitespace only) Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	ethtool: add PAUSE_NTF notification	Michal Kubecek
	Send ETHTOOL_MSG_PAUSE_NTF notification whenever pause parameters of a network device are modified using ETHTOOL_MSG_PAUSE_SET netlink message or ETHTOOL_SPAUSEPARAM ioctl request. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	ethtool: set pause parameters with PAUSE_SET request	Michal Kubecek
	Implement PAUSE_SET netlink request to set pause parameters of a network device. Thease are traditionally set with ETHTOOL_SPAUSEPARAM ioctl request. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	ethtool: provide pause parameters with PAUSE_GET request	Michal Kubecek
	Implement PAUSE_GET request to get pause parameters of a network device. These are traditionally available via ETHTOOL_GPAUSEPARAM ioctl request. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	ethtool: add COALESCE_NTF notification	Michal Kubecek
	Send ETHTOOL_MSG_COALESCE_NTF notification whenever coalescing parameters of a network device are modified using ETHTOOL_MSG_COALESCE_SET netlink message or ETHTOOL_SCOALESCE ioctl request. Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	ethtool: set coalescing parameters with COALESCE_SET request	Michal Kubecek
	Implement COALESCE_SET netlink request to set coalescing parameters of a network device. These are traditionally set with ETHTOOL_SCOALESCE ioctl request. This commit adds only support for device coalescing parameters, not per queue coalescing parameters. Like the ioctl implementation, the generic ethtool code checks if only supported parameters are modified; if not, first offending attribute is reported using extack. v2: fix alignment (whitespace only) Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	ethtool: provide coalescing parameters with COALESCE_GET request	Michal Kubecek
	Implement COALESCE_GET request to get coalescing parameters of a network device. These are traditionally available via ETHTOOL_GCOALESCE ioctl request. This commit adds only support for device coalescing parameters, not per queue coalescing parameters. Omit attributes with zero values unless they are declared as supported (i.e. the corresponding bit in ethtool_ops::supported_coalesce_params is set). Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	ethtool: fix reference leak in ethnl_set_privflags()	Michal Kubecek
	Andrew noticed that some handlers for *_SET commands leak a netdev reference if required ethtool_ops callbacks do not exist. One of them is ethnl_set_privflags(), a simple reproducer would be e.g. ip link add veth1 type veth peer name veth2 ethtool --set-priv-flags veth1 foo on ip link del veth1 Make sure dev_put() is called when ethtool_ops check fails. Fixes: f265d799596a ("ethtool: set device private flags with PRIVFLAGS_SET request") Reported-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Michal Kubecek <mkubecek@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	Merge branch 'ipv6-add-rpl-source-routing'	David S. Miller
	Alexander Aring says: ==================== net: ipv6: add rpl source routing This patch series will add handling for RPL source routing handling and insertion (implement as lwtunnel)! I did an example prototype implementation in rpld for using this implementation in non-storing mode: https://github.com/linux-wpan/rpld/tree/nonstoring_mode I will also present a talk at netdev about it: https://netdevconf.info/0x14/session.html?talk-extend-segment-routing-for-RPL In receive handling I add handling for IPIP encapsulation as RFC6554 describes it as possible. For reasons I didn't implemented it yet for generating such packets because I am not really sure how/when this should happen. So far I understand there exists a draft yet which describes the cases (inclusive a Hop-by-Hop option which we also not support yet). https://tools.ietf.org/html/draft-ietf-roll-useofrplinfo-35 This is just the beginning to start implementation everything for yet, step by step. It works for my use cases yet to have it running on a 6LOWPAN _only_ network. I have some patches for iproute2 as well. A sidenote: I check on local addresses if they are part of segment routes, this is just to avoid stupid settings. A use can add addresses afterwards what I cannot control anymore but then it's users fault to make such thing. The receive handling checks for this as well which is required by RFC6554, so the next hops or when it comes back should drop it anyway. To make this possible I added functionality to pass the net structure to the build_state of lwtunnel (I hope I caught all lwtunnels). Another sidenote: I set the headroom value to 0 as I figured out it will break on interfaces with IPv6 min mtu if set to non zero for tunnels on L3. - Alex changes since v3: - use parse_nested which isn't deprecated - Thanks David Ahern - change to return -1 instead errno in exthdr handling to unify error code - change function name from ipv6_rpl_srh_decompress_size to ipv6_rpl_srh_size changes since v2: - add additional segdata length in lwtunnel build_state - fix build_state patch by not catching one inline noop function if LWTUNNEL is disabled Alexander Aring (5): include: uapi: linux: add rpl sr header definition addrconf: add functionality to check on rpl requirements net: ipv6: add support for rpl sr exthdr net: add net available in build_state net: ipv6: add rpl sr tunnel ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	net: ipv6: add rpl sr tunnel	Alexander Aring
	This patch adds functionality to configure routes for RPL source routing functionality. There is no IPIP functionality yet implemented which can be added later when the cases when to use IPv6 encapuslation comes more clear. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	net: add net available in build_state	Alexander Aring
	The build_state callback of lwtunnel doesn't contain the net namespace structure yet. This patch will add it so we can check on specific address configuration at creation time of rpl source routes. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	net: ipv6: add support for rpl sr exthdr	Alexander Aring
	This patch adds rpl source routing receive handling. Everything works only if sysconf "rpl_seg_enabled" and source routing is enabled. Mostly the same behaviour as IPv6 segmentation routing. To handle compression and uncompression a rpl.c file is created which contains the necessary functionality. The receive handling will also care about IPv6 encapsulated so far it's specified as possible nexthdr in RFC 6554. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	addrconf: add functionality to check on rpl requirements	Alexander Aring
	This patch adds a functionality to addrconf to check on a specific RPL address configuration. According to RFC 6554: To detect loops in the SRH, a router MUST determine if the SRH includes multiple addresses assigned to any interface on that router. If such addresses appear more than once and are separated by at least one address not assigned to that router. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	include: uapi: linux: add rpl sr header definition	Alexander Aring
	This patch adds a uapi header for rpl struct definition. The segments data can be accessed over rpl_segaddr or rpl_segdata macros. In case of compri and compre is zero the segment data is not compressed and can be accessed by rpl_segaddr. In the other case the compressed data can be accessed by rpl_segdata and interpreted as byte array. Signed-off-by: Alexander Aring <alex.aring@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-30	Merge tag 'amd-drm-next-5.7-2020-03-26' of ↵	Dave Airlie
	git://people.freedesktop.org/~agd5f/linux into drm-next amd-drm-next-5.7-2020-03-26: amdgpu: - Remove a dpm quirk that is not necessary - Fix handling of AC/DC mode in newer SMU firmwares on navi - SR-IOV fixes - RAS fixes scheduler: - Fix a race condition radeon: - Remove a dpm quirk that is not necessary Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200326155310.5486-1-alexander.deucher@amd.com
2020-03-29	net, ip_tunnel: fix interface lookup with no key	William Dauchy
	when creating a new ipip interface with no local/remote configuration, the lookup is done with TUNNEL_NO_KEY flag, making it impossible to match the new interface (only possible match being fallback or metada case interface); e.g: `ip link add tunl1 type ipip dev eth0` To fix this case, adding a flag check before the key comparison so we permit to match an interface with no local/remote config; it also avoids breaking possible userland tools relying on TUNNEL_NO_KEY flag and uninitialised key. context being on my side, I'm creating an extra ipip interface attached to the physical one, and moving it to a dedicated namespace. Fixes: c54419321455 ("GRE: Refactor GRE tunneling code.") Signed-off-by: William Dauchy <w.dauchy@criteo.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	Merge branch 'mptcp-multiple-subflows-path-management'	David S. Miller
	Mat Martineau says: ==================== Multipath TCP part 3: Multiple subflows and path management v2 -> v3: Remove 'inline' in .c files, fix uapi bit macros, and rebase. v1 -> v2: Rebase on current net-next, fix for netlink limit setting, and update .gitignore for selftest. This patch set allows more than one TCP subflow to be established and used for a multipath TCP connection. Subflows are added to an existing connection using the MP_JOIN option during the 3-way handshake. With multiple TCP subflows available, sent data is now stored in the MPTCP socket so it may be retransmitted on any TCP subflow if there is no DATA_ACK before a timeout. If an MPTCP-level timeout occurs, data is retransmitted using an available subflow. Storing this sent data requires the addition of memory accounting at the MPTCP level, which was previously delegated to the single subflow. Incoming DATA_ACKs now free data from the MPTCP-level retransmit buffer. IP addresses available for new subflow connections can now be advertised and received with the ADD_ADDR option, and the corresponding REMOVE_ADDR option likewise advertises that an address is no longer available. The MPTCP path manager netlink interface has commands to set in-kernel limits for the number of concurrent subflows and control the advertisement of IP addresses between peers. To track and debug MPTCP connections there are new MPTCP MIB counters, and subflow context can be requested using inet_diag. The MPTCP self-tests now validate multiple-subflow operation and the netlink path manager interface. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	selftests: add test-cases for MPTCP MP_JOIN	Paolo Abeni
	Use the pm netlink to configure the creation of several subflows, and verify that via MIB counters. Update the mptcp_connect program to allow reliable MP_JOIN handshake even on small data file Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	selftests: add PM netlink functional tests	Paolo Abeni
	This introduces basic self-tests for the PM netlink, checking the basic APIs and possible exceptional values. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	mptcp: add netlink-based PM	Paolo Abeni
	Expose a new netlink family to userspace to control the PM, setting: - list of local addresses to be signalled. - list of local addresses used to created subflows. - maximum number of add_addr option to react When the msk is fully established, the PM netlink attempts to announce the 'signal' list via the ADD_ADDR option. Since we currently lack the ADD_ADDR echo (and related event) only the first addr is sent. After exhausting the 'announce' list, the PM tries to create subflow for each addr in 'local' list, waiting for each connection to be completed before attempting the next one. Idea is to add an additional PM hook for ADD_ADDR echo, to allow the PM netlink announcing multiple addresses, in sequence. Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	mptcp: add and use MIB counter infrastructure	Florian Westphal
	Exported via same /proc file as the Linux TCP MIB counters, so "netstat -s" or "nstat" will show them automatically. The MPTCP MIB counters are allocated in a distinct pcpu area in order to avoid bloating/wasting TCP pcpu memory. Counters are allocated once the first MPTCP socket is created in a network namespace and free'd on exit. If no sockets have been allocated, all-zero mptcp counters are shown. The MIB counter list is taken from the multipath-tcp.org kernel, but only a few counters have been picked up so far. The counter list can be increased at any time later on. v2 -> v3: - remove 'inline' in foo.c files (David S. Miller) Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	mptcp: allow dumping subflow context to userspace	Davide Caratti
	add ulp-specific diagnostic functions, so that subflow information can be dumped to userspace programs like 'ss'. v2 -> v3: - uapi: use bit macros appropriate for userspace Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Co-developed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	mptcp: implement and use MPTCP-level retransmission	Paolo Abeni
	On timeout event, schedule a work queue to do the retransmission. Retransmission code closely resembles the sendmsg() implementation and re-uses mptcp_sendmsg_frag, providing a dummy msghdr - for flags' sake - and peeking the relevant dfrag from the rtx head. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	mptcp: rework mptcp_sendmsg_frag to accept optional dfrag	Paolo Abeni
	This will simplify mptcp-level retransmission implementation in the next patch. If dfrag is provided by the caller, skip kernel space memory allocation and use data and metadata provided by the dfrag itself. Because a peer could ack data at TCP level but refrain from sending mptcp-level ACKs, we could grow the mptcp socket backlog indefinitely. We should thus block mptcp_sendmsg until the peer has acked some of the sent data. In order to be able to do so, increment the mptcp socket wmem_queued counter on memory allocation and decrement it when releasing the memory on mptcp-level ack reception. Because TCP performns sndbuf auto-tuning up to tcp_wmem_max[2], make this the mptcp sk_sndbuf limit. In the future we could add experiment with autotuning as TCP does in tcp_sndbuf_expand(). v2 -> v3: - remove 'inline' in foo.c files (David S. Miller) Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	mptcp: allow partial cleaning of rtx head dfrag	Florian Westphal
	After adding wmem accounting for the mptcp socket we could get into a situation where the mptcp socket can't transmit more data, and mptcp_clean_una doesn't reduce wmem even if snd_una has advanced because it currently will only remove entire dfrags. Allow advancing the dfrag head sequence and reduce wmem, even though this isn't correct (as we can't release the page). Because we will soon block on mptcp sk in case wmem is too large, call sk_stream_write_space() in case we reduced the backlog so userspace task blocked in sendmsg or poll will be woken up. This isn't an issue if the send buffer is large, but it is when SO_SNDBUF is used to reduce it to a lower value. Note we can still get a deadlock for low SO_SNDBUF values in case both sides of the connection write to the socket: both could be blocked due to wmem being too small -- and current mptcp stack will only increment mptcp ack_seq on recv. This doesn't happen with the selftest as it uses poll() and will always call recv if there is data to read. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	mptcp: implement memory accounting for mptcp rtx queue	Paolo Abeni
	Charge the data on the rtx queue to the master MPTCP socket, too. Such memory in uncharged when the data is acked/dequeued. Also account mptcp sockets inuse via a protocol specific pcpu counter. Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	mptcp: introduce MPTCP retransmission timer	Paolo Abeni
	The timer will be used to schedule retransmission. It's frequency is based on the current subflow RTO estimation and is reset on every una_seq update The timer is clearer for good by __mptcp_clear_xmit() Also clean MPTCP rtx queue before each transmission. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	mptcp: queue data for mptcp level retransmission	Paolo Abeni
	Keep the send page fragment on an MPTCP level retransmission queue. The queue entries are allocated inside the page frag allocator, acquiring an additional reference to the page for each list entry. Also switch to a custom page frag refill function, to ensure that the current page fragment can always host an MPTCP rtx queue entry. The MPTCP rtx queue is flushed at disconnect() and close() time Note that now we need to call __mptcp_init_sock() regardless of mptcp enable status, as the destructor will try to walk the rtx_queue. v2 -> v3: - remove 'inline' in foo.c files (David S. Miller) Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-03-29	mptcp: update per unacked sequence on pkt reception	Paolo Abeni
	So that we keep per unacked sequence number consistent; since we update per msk data, use an atomic64 cmpxchg() to protect against concurrent updates from multiple subflows. Initialize the snd_una at connect()/accept() time. Co-developed-by: Florian Westphal <fw@strlen.de> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>