linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2021-01-23	ALSA: usb-audio: workaround for iface reset issue	Takashi Iwai
	The recently introduced sample rate validation code seems causing a problem on some devices; namely, after performing this, the bus gets screwed and it influences even on other USB devices. As a quick workaround, perform it only for the necessary devices; currently MOTU devices are known to need the valid altset checks, so filter out other devices. Fixes: 93db51d06b32 ("ALSA: usb-audio: Check valid altsetting at parsing rates for UAC2/3") Reported-by: Jamie Heilman <jamie@audible.transient.net> BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1178203 Link: https://lore.kernel.org/r/20210123155842.22652-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-01-23	ALSA: pcm: One more dependency for hw constraints	Takashi Iwai
	The fix for a long-standing USB-audio bug required one more dependency variable to be added to the hw constraints. Unfortunately I didn't realize at debugging that the new addition may result in the overflow of the dependency array of each snd_pcm_hw_rule (up to three plus a sentinel), because USB-audio driver adds one more dependency only for a certain device and bus, hence it works as is for many devices. But in a bad case, a simple open always results in -EINVAL (with kernel WARNING if CONFIG_SND_DEBUG is set) no matter what is passed. Since the dependencies are real and unavoidable (USB-audio restricts the hw_params per looping over the format/rate/channels combos), the only good solution seems to raise the bar for one more dependency for snd_pcm_hw_rule -- so does this patch: now the hw constraint dependencies can be up to four. Fixes: 506c203cc3de ("ALSA: usb-audio: Fix hw constraints dependencies") Reported-by: Jamie Heilman <jamie@audible.transient.net> Link: https://lore.kernel.org/r/20210123155730.22576-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2021-01-23	kbuild: simplify GCC_PLUGINS enablement in dummy-tools/gcc	Masahiro Yamada
	With commit 1e860048c53e ("gcc-plugins: simplify GCC plugin-dev capability test") applied, this hunk can be way simplified because now scripts/gcc-plugins/Kconfig only checks plugin-version.h Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2021-01-23	Documentation/Kbuild: Remove references to gcc-plugin.sh	Robert Karszniewicz
	gcc-plugin.sh has been removed in commit 1e860048c53e ("gcc-plugins: simplify GCC plugin-dev capability test"). Signed-off-by: Robert Karszniewicz <r.karszniewicz@phytec.de> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2021-01-23	cifs: do not fail __smb_send_rqst if non-fatal signals are pending	Ronnie Sahlberg
	RHBZ 1848178 The original intent of returning an error in this function in the patch: "CIFS: Mask off signals when sending SMB packets" was to avoid interrupting packet send in the middle of sending the data (and thus breaking an SMB connection), but we also don't want to fail the request for non-fatal signals even before we have had a chance to try to send it (the reported problem could be reproduced e.g. by exiting a child process when the parent process was in the midst of calling futimens to update a file's timestamps). In addition, since the signal may remain pending when we enter the sending loop, we may end up not sending the whole packet before TCP buffers become full. In this case the code returns -EINTR but what we need here is to return -ERESTARTSYS instead to allow system calls to be restarted. Fixes: b30c74c73c78 ("CIFS: Mask off signals when sending SMB packets") Cc: stable@vger.kernel.org # v5.1+ Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2021-01-22	Merge branch 'mlxsw-expose-number-of-physical-ports'	Jakub Kicinski
	Ido Schimmel says: ==================== mlxsw: Expose number of physical ports The switch ASIC has a limited capacity of physical ports that it can support. While each system is brought up with a different number of ports, this number can be increased via splitting up to the ASIC's limit. Expose physical ports as a devlink resource so that user space will have visibility into the maximum number of ports that can be supported and the current occupancy. With this resource it is possible, for example, to write generic (i.e., not platform dependent) tests for port splitting. Patch #1 adds the new resource and patch #2 adds a selftest. v2: * Add the physical ports resource as a generic devlink resource so that it could be re-used by other device drivers ==================== Link: https://lore.kernel.org/r/20210121131024.2656154-1-idosch@idosch.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	selftests: mlxsw: Add a scale test for physical ports	Danielle Ratson
	Query the maximum number of supported physical ports using devlink-resource and test that this number can be reached by splitting each of the splittable ports to its width. Test that an error is returned in case the maximum number is exceeded. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	mlxsw: Register physical ports as a devlink resource	Danielle Ratson
	The switch ASIC has a limited capacity of physical ('flavour physical' in devlink terminology) ports that it can support. While each system is brought up with a different number of ports, this number can be increased via splitting up to the ASIC's limit. Expose physical ports as a devlink resource so that user space will have visibility to the maximum number of ports that can be supported and the current occupancy. In addition, add a "Generic Resources" section in devlink-resource documentation so the different drivers will be aligned by the same resource name when exposing to user space. Signed-off-by: Danielle Ratson <danieller@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	Merge branch 'htb-offload'	Jakub Kicinski
	Maxim Mikityanskiy says: ==================== HTB offload This series adds support for HTB offload to the HTB qdisc, and adds usage to mlx5 driver. The previous RFCs are available at [1], [2]. The feature is intended to solve the performance bottleneck caused by the single lock of the HTB qdisc, which prevents it from scaling well. The HTB algorithm itself is offloaded to the device, eliminating the need to take the root lock of HTB on every packet. Classification part is done in clsact (still in software) to avoid acquiring the lock, which imposes a limitation that filters can target only leaf classes. The speedup on Mellanox ConnectX-6 Dx was 14.2 times in the UDP multi-stream test, compared to software HTB implementation (more details in the mlx5 patch). [1]: https://www.spinics.net/lists/netdev/msg628422.html [2]: https://www.spinics.net/lists/netdev/msg663548.html v2 changes: Fixed sparse and smatch warnings. Formatted HTB patches to 80 chars per line. v3 changes: Fixed the CI failure on parisc with 16-bit xchg by replacing it with WRITE_ONCE. Fixed the capability bits in mlx5_ifc.h and the value of MLX5E_QOS_MAX_LEAF_NODES. v4 changes: Check if HTB is root when offloading. Add extack for hardware errors. Rephrase explanations of how it works in the commit message. Remove %hu from format strings. Add resiliency when leaf_del_last fails to create a new leaf node. ==================== Link: https://lore.kernel.org/r/20210119120815.463334-1-maximmi@mellanox.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net/mlx5e: Support HTB offload	Maxim Mikityanskiy
	This commit adds support for HTB offload in the mlx5e driver. Performance: NIC: Mellanox ConnectX-6 Dx CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (24 cores with HT) 100 Gbit/s line rate, 500 UDP streams @ ~200 Mbit/s each 48 traffic classes, flower used for steering No shaping (rate limits set to 4 Gbit/s per TC) - checking for max throughput. Baseline: 98.7 Gbps, 8.25 Mpps HTB: 6.7 Gbps, 0.56 Mpps HTB offload: 95.6 Gbps, 8.00 Mpps Limitations: 1. 256 leaf nodes, 3 levels of depth. 2. Granularity for ceil is 1 Mbit/s. Rates are converted to weights, and the bandwidth is split among the siblings according to these weights. Other parameters for classes are not supported. Ethtool statistics support for QoS SQs are also added. The counters are called qos_txN_, where N is the QoS queue number (starting from 0, the numeration is separate from the normal SQs), and is the counter name (the counters are the same as for the normal SQs). Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	sch_htb: Stats for offloaded HTB	Maxim Mikityanskiy
	This commit adds support for statistics of offloaded HTB. Bytes and packets counters for leaf and inner nodes are supported, the values are taken from per-queue qdiscs, and the numbers that the user sees should have the same behavior as the software (non-offloaded) HTB. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	sch_htb: Hierarchical QoS hardware offload	Maxim Mikityanskiy
	HTB doesn't scale well because of contention on a single lock, and it also consumes CPU. This patch adds support for offloading HTB to hardware that supports hierarchical rate limiting. In the offload mode, HTB passes control commands to the driver using ndo_setup_tc. The driver has to replicate the whole hierarchy of classes and their settings (rate, ceil) in the NIC. Every modification of the HTB tree caused by the admin results in ndo_setup_tc being called. After this setup, the HTB algorithm is done completely in the NIC. An SQ (send queue) is created for every leaf class and attached to the hierarchy, so that the NIC can calculate and obey aggregated rate limits, too. In the future, it can be changed, so that multiple SQs will back a single leaf class. ndo_select_queue is responsible for selecting the right queue that serves the traffic class of each packet. The data path works as follows: a packet is classified by clsact, the driver selects a hardware queue according to its class, and the packet is enqueued into this queue's qdisc. This solution addresses two main problems of scaling HTB: 1. Contention by flow classification. Currently the filters are attached to the HTB instance as follows: # tc filter add dev eth0 parent 1:0 protocol ip flower dst_port 80 classid 1:10 It's possible to move classification to clsact egress hook, which is thread-safe and lock-free: # tc filter add dev eth0 egress protocol ip flower dst_port 80 action skbedit priority 1:10 This way classification still happens in software, but the lock contention is eliminated, and it happens before selecting the TX queue, allowing the driver to translate the class to the corresponding hardware queue in ndo_select_queue. Note that this is already compatible with non-offloaded HTB and doesn't require changes to the kernel nor iproute2. 2. Contention by handling packets. HTB is not multi-queue, it attaches to a whole net device, and handling of all packets takes the same lock. When HTB is offloaded, it registers itself as a multi-queue qdisc, similarly to mq: HTB is attached to the netdev, and each queue has its own qdisc. Some features of HTB may be not supported by some particular hardware, for example, the maximum number of classes may be limited, the granularity of rate and ceil parameters may be different, etc. - so, the offload is not enabled by default, a new parameter is used to enable it: # tc qdisc replace dev eth0 root handle 1: htb offload Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: sched: Add extack to Qdisc_class_ops.delete	Maxim Mikityanskiy
	In a following commit, sch_htb will start using extack in the delete class operation to pass hardware errors in offload mode. This commit prepares for that by adding the extack parameter to this callback and converting usage of the existing qdiscs. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: sched: Add multi-queue support to sch_tree_lock	Maxim Mikityanskiy
	The existing qdiscs that set TCQ_F_MQROOT don't use sch_tree_lock. However, hardware-offloaded HTB will start setting this flag while also using sch_tree_lock. The current implementation of sch_tree_lock basically locks on qdisc->dev_queue->qdisc, and it works fine when the tree is attached to some queue. However, it's not the case for MQROOT qdiscs: such a qdisc is the root itself, and its dev_queue just points to queue 0, while not actually being used, because there are real per-queue qdiscs. This patch changes the logic of sch_tree_lock and sch_tree_unlock to lock the qdisc itself if it's the MQROOT. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	riscv: Fixup pfn_valid error with wrong max_mapnr	Guo Ren
	The max_mapnr is the number of PFNs, not absolute PFN offset. Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Fixes: d0d8aae64566 ("RISC-V: Set maximum number of mapped pages correctly") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmerdabbelt@google.com>
2021-01-22	Merge branch 'tcp-add-cmsg-rx-timestamps-to-rx-zerocopy'	Jakub Kicinski
	Arjun Roy says: ==================== tcp: add CMSG+rx timestamps to rx. zerocopy Provide CMSG and receive timestamp support to TCP receive zerocopy. Patch 1 refactors CMSG pending state for tcp_recvmsg() to avoid the use of magic numbers; patch 2 implements receive timestamp via CMSG support for receive zerocopy, and uses the constants added in patch 1. ==================== Link: https://lore.kernel.org/r/20210121004148.2340206-1-arjunroy.kdev@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	tcp: Add receive timestamp support for receive zerocopy.	Arjun Roy
	tcp_recvmsg() uses the CMSG mechanism to receive control information like packet receive timestamps. This patch adds CMSG fields to struct tcp_zerocopy_receive, and provides receive timestamps if available to the user. Signed-off-by: Arjun Roy <arjunroy@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	tcp: Remove CMSG magic numbers for tcp_recvmsg().	Arjun Roy
	At present, tcp_recvmsg() uses flags to track if any CMSGs are pending and what those CMSGs are. These flags are currently magic numbers, used only within tcp_recvmsg(). To prepare for receive timestamp support in tcp receive zerocopy, gently refactor these magic numbers into enums. Signed-off-by: Arjun Roy <arjunroy@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	Merge branch 'net-bridge-multicast-add-initial-eht-support'	Jakub Kicinski
	Nikolay Aleksandrov says: ==================== net: bridge: multicast: add initial EHT support This set adds explicit host tracking support for IGMPv3/MLDv2. The already present per-port fast leave flag is used to enable it since that is the primary goal of EHT, to track a group and its S,Gs usage per-host and when left without any interested hosts delete them before the standard timers. The EHT code is pretty self-contained and not enabled by default. There is no new uAPI added, all of the functionality is currently hidden behind the fast leave flag. In the future that will change (more below). The host tracking uses two new sets per port group: one having an entry for each host which contains that host's view of the group (source list and filter mode), and one set which contains an entry for each source having an internal set which contains an entry for each host that has reported an interest for that source. RB trees are used for all sets so they're compact when not used and fast when we need to do lookups. To illustrate it: [ bridge port group ] ` [ host set (rb) ] ` [ host entry with a list of sources and filter mode ] ` [ source set (rb) ] ` [ source entry ] ` [ source host set (rb) ] ` [ source host entry with a timer ] The number of tracked sources per host is limited to the maximum total number of S,G entries per port group - PG_SRC_ENT_LIMIT (currently 32). The number of hosts is unlimited, I think the argument that a local attacker can exhaust the memory/cause high CPU usage can be applied to fdb entries as well which are unlimited. In the future if needed we can add an option to limit these, but I don't think it's necessary for a start. All of the new sets are protected by the bridge's multicast lock. I'm pretty sure we'll be changing the cases and improving the convergence time in the future, but this seems like a good start. Patch breakdown: patch 1 - 4: minor cleanups and preparations for EHT patch 5: adds the new structures which will be used in the following patches patch 6: adds support to create, destroy and lookup host entries patch 7: adds support to create, delete and lokup source set entries patch 8: adds a host "delete" function which is just a host's source list flush since that would automatically delete the host patch 9 - 10: add support for handling all IGMPv3/MLDv2 report types more information can be found in the individual patches patch 11: optmizes a specific TO_INCLUDE use-case with host timeouts patch 12: handles per-host filter mode changing (include <-> exclude) patch 13: pulls out block group deletion since now it can be deleted in both filter modes patch 14: marks deletions done due to fast leave Future plans: - export host information - add an option to reduce queries - add an option to limit the number of host entries - tune more fast leave cases for quicker convergence By the way I think this is the first open-source EHT implementation, I couldn't find any while researching it. :) ==================== Link: https://lore.kernel.org/r/20210120145203.1109140-1-razor@blackwall.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: bridge: multicast: mark IGMPv3/MLDv2 fast-leave deletes	Nikolay Aleksandrov
	Mark groups which were deleted due to fast leave/EHT. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: bridge: multicast: handle block pg delete for all cases	Nikolay Aleksandrov
	A block report can result in empty source and host sets for both include and exclude groups so if there are no hosts left we can safely remove the group. Pull the block group handling so it can cover both cases and add a check if EHT requires the delete. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: bridge: multicast: add EHT host filter_mode handling	Nikolay Aleksandrov
	We should be able to handle host filter mode changing. For exclude mode we must create a zero-src entry so the group will be kept even without any S,G entries (non-zero source sets). That entry doesn't count to the entry limit and can always be created, its timer is refreshed on new exclude reports and if we change the host filter mode to include then it gets removed and we rely only on the non-zero source sets. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: bridge: multicast: optimize TO_INCLUDE EHT timeouts	Nikolay Aleksandrov
	This is an optimization specifically for TO_INCLUDE which sends queries for the older entries and thus lowers the S,G timers to LMQT. If we have the following situation for a group in either include or exclude mode: - host A was interested in srcs X and Y, but is timing out - host B sends TO_INCLUDE src Z, the bridge lowers X and Y's timeouts to LMQT - host B sends BLOCK src Z after LMQT time has passed => since host B is the last host we can delete the group, but if we still have host A's EHT entries for X and Y (i.e. if they weren't lowered to LMQT previously) then we'll have to wait another LMQT time before deleting the group, with this optimization we can directly remove it regardless of the group mode as there are no more interested hosts Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: bridge: multicast: add EHT include and exclude handling	Nikolay Aleksandrov
	Add support for IGMPv3/MLDv2 include and exclude EHT handling. Similar to how the reports are processed we have 2 cases when the group is in include or exclude mode, these are processed as follows: - group include - is_include: create missing entries - to_include: flush existing entries and create a new set from the report, obviously if the src set is empty then we delete the group - group exclude - is_exclude: create missing entries - to_exclude: flush existing entries and create a new set from the report, any empty source set entries are removed If the group is in a different mode then we just flush all entries reported by the host and we create a new set with the new mode entries created from the report. If the report is include type, the source list is empty and the group has empty sources' set then we remove it. Any source set entries which are empty are removed as well. If the group is in exclude mode it can exist without any S,G entries (allowing for all traffic to pass). Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: bridge: multicast: add EHT allow/block handling	Nikolay Aleksandrov
	Add support for IGMPv3/MLDv2 allow/block EHT handling. Similar to how the reports are processed we have 2 cases when the group is in include or exclude mode, these are processed as follows: - group include - allow: create missing entries - block: remove existing matching entries and remove the corresponding S,G entries if there are no more set host entries, then possibly delete the whole group if there are no more S,G entries - group exclude - allow - host include: create missing entries - host exclude: remove existing matching entries and remove the corresponding S,G entries if there are no more set host entries - block - host include: remove existing matching entries and remove the corresponding S,G entries if there are no more set host entries, then possibly delete the whole group if there are no more S,G entries - host exclude: create missing entries Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: bridge: multicast: add EHT host delete function	Nikolay Aleksandrov
	Now that we can delete set entries, we can use that to remove EHT hosts. Since the group's host set entries exist only when there are related source set entries we just have to flush all source set entries joined by the host set entry and it will be automatically removed. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: bridge: multicast: add EHT source set handling functions	Nikolay Aleksandrov
	Add EHT source set and set-entry create, delete and lookup functions. These allow to manipulate source sets which contain their own host sets with entries which joined that S,G. We're limiting the maximum number of tracked S,G entries per host to PG_SRC_ENT_LIMIT (currently 32) which is the current maximum of S,G entries for a group. There's a per-set timer which will be used to destroy the whole set later. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: bridge: multicast: add EHT host handling functions	Nikolay Aleksandrov
	Add functions to create, destroy and lookup an EHT host. These are per-host entries contained in the eht_host_tree in net_bridge_port_group which are used to store a list of all sources (S,G) entries joined for that group by each host, the host's current filter mode and total number of joined entries. No functional changes yet, these would be used in later patches. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: bridge: multicast: add EHT structures and definitions	Nikolay Aleksandrov
	Add EHT structures for tracking hosts and sources per group. We keep one set for each host which has all of the host's S,G entries, and one set for each multicast source which has all hosts that have joined that S,G. For each host, source entry we record the filter_mode and we keep an expiry timer. There is also one global expiry timer per source set, it is updated with each set entry update, it will be later used to lower the set's timer instead of lowering each entry's timer separately. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: bridge: multicast: calculate idx position without changing ptr	Nikolay Aleksandrov
	We need to preserve the srcs pointer since we'll be passing it for EHT handling later. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: bridge: multicast: __grp_src_block_incl can modify pg	Nikolay Aleksandrov
	Prepare __grp_src_block_incl() for being able to cause a notification due to changes. Currently it cannot happen, but EHT would change that since we'll be deleting sources immediately. Make sure that if the pg is deleted we don't return true as that would cause the caller to access freed pg. This patch shouldn't cause any functional change. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: bridge: multicast: pass host src address to IGMPv3/MLDv2 functions	Nikolay Aleksandrov
	We need to pass the host address so later it can be used for explicit host tracking. No functional change. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: bridge: multicast: rename src_size to addr_size	Nikolay Aleksandrov
	Rename src_size argument to addr_size in preparation for passing host address as an argument to IGMPv3/MLDv2 functions. No functional change. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: hns3: replace skb->csum_not_inet with skb_csum_is_sctp	Xin Long
	Commit fa8211701043 ("net: add inline function skb_csum_is_sctp") missed replacing skb->csum_not_inet check in hns3. This patch is to replace it with skb_csum_is_sctp(). Reported-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Xin Long <lucien.xin@gmail.com> Link: https://lore.kernel.org/r/3ad3c22c08beb0947f5978e790bd98d2aa063df9.1611307861.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	Merge branch 'mptcp-re-enable-sndbuf-autotune'	Jakub Kicinski
	Paolo Abeni says: ==================== mptcp: re-enable sndbuf autotune The sendbuffer autotuning was unintentionally disabled as a side effect of the recent workqueue removal refactor. These patches re-enable id, with some extra care: with autotuning enable/large send buffer we need a more accurate packet scheduler to be able to use efficiently the available subflow bandwidth, especially when the subflows have different capacities. The first patch cleans-up subflow socket handling, making the actual re-enable (patch 2) simpler. Patches 3 and 4 improve the packet scheduler, to better cope with non trivial scenarios and large send buffer. Finally patch 5 adds and uses some infrastructure to avoid the workqueue usage for the packet scheduler operations introduced by the previous patches. ==================== Link: https://lore.kernel.org/r/cover.1611153172.git.pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	mptcp: implement delegated actions	Paolo Abeni
	On MPTCP-level ack reception, the packet scheduler may select a subflow other then the current one. Prior to this commit we rely on the workqueue to trigger action on such subflow. This changeset introduces an infrastructure that allows any MPTCP subflow to schedule actions (MPTCP xmit) on others subflows without resorting to (multiple) process reschedule. A dummy NAPI instance is used instead. When MPTCP needs to trigger action an a different subflow, it enqueues the target subflow on the NAPI backlog and schedule such instance as needed. The dummy NAPI poll method walks the sockets backlog and tries to acquire the (BH) socket lock on each of them. If the socket is owned by the user space, the action will be completed by the sock release cb, otherwise push is started. This change leverages the delegated action infrastructure to avoid invoking the MPTCP worker to spool the pending data, when the packet scheduler picks a subflow other then the one currently processing the incoming MPTCP-level ack. Additionally we further refine the subflow selection invoking the packet scheduler for each chunk of data even inside __mptcp_subflow_push_pending(). v1 -> v2: - fix possible UaF at shutdown time, resetting sock ops after removing the ulp context Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	mptcp: schedule work for better snd subflow selection	Paolo Abeni
	Otherwise the packet scheduler policy will not be enforced when pushing pending data at MPTCP-level ack reception time. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	mptcp: do not queue excessive data on subflows	Paolo Abeni
	The current packet scheduler can enqueue up to sndbuf data on each subflow. If the send buffer is large and the subflows are not symmetric, this could lead to suboptimal aggregate bandwidth utilization. Limit the amount of queued data to the maximum send window. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	mptcp: re-enable sndbuf autotune	Paolo Abeni
	After commit 6e628cd3a8f7 ("mptcp: use mptcp release_cb for delayed tasks"), MPTCP never sets the flag bit SOCK_NOSPACE on its subflow. As a side effect, autotune never takes place, as it happens inside tcp_new_space(), which in turn is called only when the mentioned bit is set. Let's sendmsg() set the subflows NOSPACE bit when looking for more memory and use the subflow write_space callback to propagate the snd buf update and wake-up the user-space. Additionally, this allows dropping a bunch of duplicate code and makes the SNDBUF_LIMITED chrono relevant again for MPTCP subflows. Fixes: 6e628cd3a8f7 ("mptcp: use mptcp release_cb for delayed tasks") Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	mptcp: always graft subflow socket to parent	Paolo Abeni
	Currently, incoming subflows link to the parent socket, while outgoing ones link to a per subflow socket. The latter is not really needed, except at the initial connect() time and for the first subflow. Always graft the outgoing subflow to the parent socket and free the unneeded ones early. This allows some code cleanup, reduces the amount of memory used and will simplify the next patch Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	sfc: reduce the number of requested xdp ev queues	Ivan Babrou
	Without this change the driver tries to allocate too many queues, breaching the number of available msi-x interrupts on machines with many logical cpus and default adapter settings: Insufficient resources for 12 XDP event queues (24 other channels, max 32) Which in turn triggers EINVAL on XDP processing: sfc 0000:86:00.0 ext0: XDP TX failed (-22) Signed-off-by: Ivan Babrou <ivan@cloudflare.com> Acked-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://lore.kernel.org/r/20210120212759.81548-1-ivan@cloudflare.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: fec: put child node on error path	Pan Bian
	Also decrement the reference count of child device on error path. Fixes: 3e782985cb3c ("net: ethernet: fec: Allow configuration of MDIO bus speed") Signed-off-by: Pan Bian <bianpan2016@163.com> Link: https://lore.kernel.org/r/20210120122037.83897-1-bianpan2016@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: stmmac: dwmac-intel-plat: remove config data on error	Pan Bian
	Remove the config data when rate setting fails. Fixes: 9efc9b2b04c7 ("net: stmmac: Add dwmac-intel-plat for GBE driver") Signed-off-by: Pan Bian <bianpan2016@163.com> Link: https://lore.kernel.org/r/20210120110745.36412-1-bianpan2016@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	tcp: add TTL to SCM_TIMESTAMPING_OPT_STATS	Yousuk Seung
	This patch adds TCP_NLA_TTL to SCM_TIMESTAMPING_OPT_STATS that exports the time-to-live or hop limit of the latest incoming packet with SCM_TSTAMP_ACK. The value exported may not be from the packet that acks the sequence when incoming packets are aggregated. Exporting the time-to-live or hop limit value of incoming packets helps to estimate the hop count of the path of the flow that may change over time. Signed-off-by: Yousuk Seung <ysseung@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Link: https://lore.kernel.org/r/20210120204155.552275-1-ysseung@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	tcp: remove unused ICSK_TIME_EARLY_RETRANS	Pengcheng Yang
	Since the early retransmit has been removed by commit bec41a11dd3d ("tcp: remove early retransmit"), we also remove the unused ICSK_TIME_EARLY_RETRANS macro. Signed-off-by: Pengcheng Yang <yangpc@wangsu.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/1611239473-27304-1-git-send-email-yangpc@wangsu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: phy: realtek: Add support for RTL9000AA/AN	Yuusuke Ashizuka
	RTL9000AA/AN as 100BASE-T1 is following: - 100 Mbps - Full duplex - Link Status Change Interrupt - Master/Slave configuration Signed-off-by: Yuusuke Ashizuka <ashiduka@fujitsu.com> Signed-off-by: Torii Kenichi <torii.ken1@fujitsu.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20210121080254.21286-1-ashiduka@fujitsu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	dt-bindings: net: renesas,etheravb: Add r8a779a0 support	Wolfram Sang
	Document the compatible value for the RAVB block in the Renesas R-Car V3U (R8A779A0) SoC. This variant has no stream buffer, so we only need to add the new compatible and add it to the TX delay block. Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Link: https://lore.kernel.org/r/20210121100619.5653-2-wsa+renesas@sang-engineering.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: octeontx2: Make sure the buffer is 128 byte aligned	Kevin Hao
	The octeontx2 hardware needs the buffer to be 128 byte aligned. But in the current implementation of napi_alloc_frag(), it can't guarantee the return address is 128 byte aligned even the request size is a multiple of 128 bytes, so we have to request an extra 128 bytes and use the PTR_ALIGN() to make sure that the buffer is aligned correctly. Fixes: 7a36e4918e30 ("octeontx2-pf: Use the napi_alloc_frag() to alloc the pool buffers") Reported-by: Subbaraya Sundeep <sbhatta@marvell.com> Signed-off-by: Kevin Hao <haokexin@gmail.com> Tested-by: Subbaraya Sundeep <sbhatta@marvell.com> Link: https://lore.kernel.org/r/20210121070906.25380-1-haokexin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	net: usb: qmi_wwan: added support for Thales Cinterion PLSx3 modem family	Giacinto Cifelli
	Bus 003 Device 009: ID 1e2d:006f Device Descriptor: bLength 18 bDescriptorType 1 bcdUSB 2.00 bDeviceClass 239 Miscellaneous Device bDeviceSubClass 2 ? bDeviceProtocol 1 Interface Association bMaxPacketSize0 64 idVendor 0x1e2d idProduct 0x006f bcdDevice 0.00 iManufacturer 3 Cinterion Wireless Modules iProduct 2 PLSx3 iSerial 4 fa3c1419 bNumConfigurations 1 Configuration Descriptor: bLength 9 bDescriptorType 2 wTotalLength 303 bNumInterfaces 9 bConfigurationValue 1 iConfiguration 1 Cinterion Configuration bmAttributes 0xe0 Self Powered Remote Wakeup MaxPower 500mA Interface Association: bLength 8 bDescriptorType 11 bFirstInterface 0 bInterfaceCount 2 bFunctionClass 2 Communications bFunctionSubClass 2 Abstract (modem) bFunctionProtocol 1 AT-commands (v.25ter) iFunction 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 0 bAlternateSetting 0 bNumEndpoints 1 bInterfaceClass 2 Communications bInterfaceSubClass 2 Abstract (modem) bInterfaceProtocol 1 AT-commands (v.25ter) iInterface 0 CDC Header: bcdCDC 1.10 CDC ACM: bmCapabilities 0x02 line coding and serial state CDC Call Management: bmCapabilities 0x03 call management use DataInterface bDataInterface 1 CDC Union: bMasterInterface 0 bSlaveInterface 1 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x81 EP 1 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 5 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 1 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 10 CDC Data bInterfaceSubClass 0 Unused bInterfaceProtocol 0 iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x82 EP 2 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0200 1x 512 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x01 EP 1 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0200 1x 512 bytes bInterval 0 Interface Association: bLength 8 bDescriptorType 11 bFirstInterface 2 bInterfaceCount 2 bFunctionClass 2 Communications bFunctionSubClass 2 Abstract (modem) bFunctionProtocol 1 AT-commands (v.25ter) iFunction 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 2 bAlternateSetting 0 bNumEndpoints 1 bInterfaceClass 2 Communications bInterfaceSubClass 2 Abstract (modem) bInterfaceProtocol 1 AT-commands (v.25ter) iInterface 0 CDC Header: bcdCDC 1.10 CDC ACM: bmCapabilities 0x02 line coding and serial state CDC Call Management: bmCapabilities 0x03 call management use DataInterface bDataInterface 3 CDC Union: bMasterInterface 2 bSlaveInterface 3 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x83 EP 3 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 5 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 3 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 10 CDC Data bInterfaceSubClass 0 Unused bInterfaceProtocol 0 iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x84 EP 4 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0200 1x 512 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x02 EP 2 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0200 1x 512 bytes bInterval 0 Interface Association: bLength 8 bDescriptorType 11 bFirstInterface 4 bInterfaceCount 2 bFunctionClass 2 Communications bFunctionSubClass 2 Abstract (modem) bFunctionProtocol 1 AT-commands (v.25ter) iFunction 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 4 bAlternateSetting 0 bNumEndpoints 1 bInterfaceClass 2 Communications bInterfaceSubClass 2 Abstract (modem) bInterfaceProtocol 1 AT-commands (v.25ter) iInterface 0 CDC Header: bcdCDC 1.10 CDC ACM: bmCapabilities 0x02 line coding and serial state CDC Call Management: bmCapabilities 0x03 call management use DataInterface bDataInterface 5 CDC Union: bMasterInterface 4 bSlaveInterface 5 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x85 EP 5 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 5 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 5 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 10 CDC Data bInterfaceSubClass 0 Unused bInterfaceProtocol 0 iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x86 EP 6 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0200 1x 512 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x03 EP 3 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0200 1x 512 bytes bInterval 0 Interface Association: bLength 8 bDescriptorType 11 bFirstInterface 6 bInterfaceCount 2 bFunctionClass 2 Communications bFunctionSubClass 2 Abstract (modem) bFunctionProtocol 1 AT-commands (v.25ter) iFunction 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 6 bAlternateSetting 0 bNumEndpoints 1 bInterfaceClass 2 Communications bInterfaceSubClass 2 Abstract (modem) bInterfaceProtocol 1 AT-commands (v.25ter) iInterface 0 CDC Header: bcdCDC 1.10 CDC ACM: bmCapabilities 0x02 line coding and serial state CDC Call Management: bmCapabilities 0x03 call management use DataInterface bDataInterface 7 CDC Union: bMasterInterface 6 bSlaveInterface 7 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x87 EP 7 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 5 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 7 bAlternateSetting 0 bNumEndpoints 2 bInterfaceClass 10 CDC Data bInterfaceSubClass 0 Unused bInterfaceProtocol 0 iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x88 EP 8 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0200 1x 512 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x04 EP 4 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0200 1x 512 bytes bInterval 0 Interface Descriptor: bLength 9 bDescriptorType 4 bInterfaceNumber 8 bAlternateSetting 0 bNumEndpoints 3 bInterfaceClass 255 Vendor Specific Class bInterfaceSubClass 255 Vendor Specific Subclass bInterfaceProtocol 255 Vendor Specific Protocol iInterface 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x89 EP 9 IN bmAttributes 3 Transfer Type Interrupt Synch Type None Usage Type Data wMaxPacketSize 0x0040 1x 64 bytes bInterval 5 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x8a EP 10 IN bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0200 1x 512 bytes bInterval 0 Endpoint Descriptor: bLength 7 bDescriptorType 5 bEndpointAddress 0x05 EP 5 OUT bmAttributes 2 Transfer Type Bulk Synch Type None Usage Type Data wMaxPacketSize 0x0200 1x 512 bytes bInterval 0 Device Qualifier (for other device speed): bLength 10 bDescriptorType 6 bcdUSB 2.00 bDeviceClass 239 Miscellaneous Device bDeviceSubClass 2 ? bDeviceProtocol 1 Interface Association bMaxPacketSize0 64 bNumConfigurations 1 Device Status: 0x0000 (Bus Powered) Cc: stable@vger.kernel.org Signed-off-by: Giacinto Cifelli <gciofono@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Link: https://lore.kernel.org/r/20210120045650.10855-1-gciofono@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-01-22	Merge tag 'imx-fixes-5.11-2' of ↵	Arnd Bergmann
	git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux into arm/fixes i.MX fixes for 5.11, round 2: - Fix pcf2127 reset for imx7d-flex-concentrator board. - Fix i.MX6 suspend with Thumb-2 kernel. - Fix ethernet-phy address issue on imx6qdl-sr-som board. - Fix GPIO3 `gpio-ranges` on i.MX8MP. - Select SOC_BUS for IMX_SCU driver to fix build issue. * tag 'imx-fixes-5.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/shawnguo/linux: firmware: imx: select SOC_BUS to fix firmware build arm64: dts: imx8mp: Correct the gpio ranges of gpio3 ARM: dts: imx6qdl-sr-som: fix some cubox-i platforms ARM: imx: build suspend-imx6.S with arm instruction set ARM: dts: imx7d-flex-concentrator: fix pcf2127 reset Link: https://lore.kernel.org/r/20210119091949.GD4356@dragon Signed-off-by: Arnd Bergmann <arnd@arndb.de>