path: root/net
Age  Commit message  Author
2021-10-24  net: dsa: introduce locking for the address lists on CPU and DSA ports  (Vladimir Oltean)
Now that the rtnl_mutex is going away for dsa_port_{host_,}fdb_{add,del}, no one is serializing access to the address lists that DSA keeps for the purpose of reference counting on shared ports (CPU and cascade ports). It can happen for one dsa_switch_do_fdb_del to do list_del on a dp->fdbs element while another dsa_switch_do_fdb_{add,del} is traversing dp->fdbs. We need to avoid that. Currently dp->mdbs is not at risk, because dsa_switch_do_mdb_{add,del} still runs under the rtnl_mutex. But it would be nice if it would not depend on that being the case. So let's introduce a mutex per port (the address lists are per port too) and share it between dp->mdbs and dp->fdbs. The place where we put the locking is interesting. It could be tempting to put a DSA-level lock which still serializes calls to .port_fdb_{add,del}, but it would still not avoid concurrency with other driver code paths that are currently under rtnl_mutex (.port_fdb_dump, .port_fast_age). So it would add a very false sense of security (and adding a global switch-wide lock in DSA to resynchronize with the rtnl_lock is also counterproductive and hard). So the locking is intentionally done only where the dp->fdbs and dp->mdbs lists are traversed. That means, from a driver perspective, that .port_fdb_add will be called with the dp->addr_lists_lock mutex held on the CPU port, but not held on user ports. This is done so that driver writers are not encouraged to rely on any guarantee offered by dp->addr_lists_lock. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
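For illustration, a hedged sketch of the add path on a shared (CPU/DSA) port under the new per-port mutex; the list element type and helper name are assumptions modelled on the commit text, not the exact kernel code:

    /* Sketch: reference-counted FDB add on a shared port, serialized by
     * dp->addr_lists_lock as described above.
     */
    static int do_fdb_add_sketch(struct dsa_port *dp,
                                 const unsigned char *addr, u16 vid)
    {
        struct dsa_switch *ds = dp->ds;
        struct dsa_mac_addr *a;
        int err = 0;

        mutex_lock(&dp->addr_lists_lock);

        /* Already known? Just bump the refcount. */
        list_for_each_entry(a, &dp->fdbs, list) {
            if (ether_addr_equal(a->addr, addr) && a->vid == vid) {
                refcount_inc(&a->refcount);
                goto out;
            }
        }

        a = kzalloc(sizeof(*a), GFP_KERNEL);
        if (!a) {
            err = -ENOMEM;
            goto out;
        }

        /* First user: program the hardware, then track the address. */
        err = ds->ops->port_fdb_add(ds, dp->index, addr, vid);
        if (err) {
            kfree(a);
            goto out;
        }

        ether_addr_copy(a->addr, addr);
        a->vid = vid;
        refcount_set(&a->refcount, 1);
        list_add_tail(&a->list, &dp->fdbs);

    out:
        mutex_unlock(&dp->addr_lists_lock);

        return err;
    }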
2021-10-22  devlink: Clean not-executed param notifications  (Leon Romanovsky)
The parameters are registered before devlink_register(), and all notifications are delayed until then. This patch removes the parameter notifications that can never be executed and adds code annotation logic. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22  devlink: Remove not-executed trap group notifications  (Leon Romanovsky)
The trap logic is registered before devlink_register(), and all notifications are delayed until then. This patch removes the trap group notifications that can never be executed and adds code annotation logic. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22  devlink: Remove not-executed trap policer notifications  (Leon Romanovsky)
The trap policer logic is registered before devlink_register(), and all notifications are delayed until then. This patch removes the notifications that can never be executed and adds code annotation logic. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22  devlink: Delete obsolete parameters publish API  (Leon Romanovsky)
The change of devlink_register() to be the last devlink command, together with the delayed notification logic, made the publish API obsolete. Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22  gre/sit: Don't generate link-local addr if addr_gen_mode is IN6_ADDR_GEN_MODE_NONE  (Stephen Suryaputra)
When addr_gen_mode is set to IN6_ADDR_GEN_MODE_NONE, the link-local addr should not be generated. But it isn't the case for GRE (as well as GRE6) and SIT tunnels. Make it so that tunnels consider the addr_gen_mode, especially for IN6_ADDR_GEN_MODE_NONE. Do this in add_v4_addrs() to cover both GRE and SIT only if the addr scope is link. Signed-off-by: Stephen Suryaputra <ssuryaextr@gmail.com> Acked-by: Antonio Quartulli <a@unstable.cc> Link: https://lore.kernel.org/r/20211020200618.467342-1-ssuryaextr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
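A hedged sketch of the check described above; the helper name and its exact placement inside add_v4_addrs() are assumptions:

    /* Illustrative helper: honour IN6_ADDR_GEN_MODE_NONE before
     * generating a link-local address for a GRE/SIT tunnel.
     */
    static bool ipv6_ll_addr_gen_disabled(const struct inet6_dev *idev)
    {
        return idev->cnf.addr_gen_mode == IN6_ADDR_GEN_MODE_NONE;
    }

    /* In add_v4_addrs(), for addresses with link scope (sketch):
     *     if (ipv6_ll_addr_gen_disabled(idev))
     *             return;
     */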
2021-10-22  Merge tag 'mac80211-next-for-net-next-2021-10-21' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next  (Jakub Kicinski)
Johannes Berg says:
====================
Quite a few changes:
 * the applicable eth_hw_addr_set() and const hw_addr changes
 * various code cleanups/refactorings
 * stack usage reductions across the wireless stack
 * some unstructured find_ie() -> structured find_element() changes
 * a few more pieces of multi-BSSID support
 * some 6 GHz regulatory support
 * 6 GHz support in hwsim, for testing userspace code
 * Light Communications (LC, 802.11bb) early band definitions to be able to add a first driver soon

* tag 'mac80211-next-for-net-next-2021-10-21' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next: (35 commits)
  cfg80211: fix kernel-doc for MBSSID EMA
  mac80211: Prevent AP probing during suspend
  nl80211: Add LC placeholder band definition to nl80211_band
  ...
====================
Link: https://lore.kernel.org/r/20211021154953.134849-1-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-10-22  Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net  (David S. Miller)
Lots of simple overlapping additions. With a build fix from Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-21  mac80211: Prevent AP probing during suspend  (Loic Poulain)
Submitting AP probe/null frames during suspend can cause an unexpected disconnect on resume because of a timeout waiting for ack status:

wlan0: Failed to send nullfunc to AP 11:22:33:44:55:66 after 500ms, disconnecting

This is especially the case when we enter suspend while a scan is ongoing: the scan is cancelled from __ieee80211_suspend, leading to a corresponding (aborted) scan complete event, which in turn causes the submission of an immediate monitor null frame (restart_sta_timer). The corresponding packet or ack will not be processed before resuming, causing a timeout & disconnect on resume. Delay the AP probing when suspending/suspended. Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Link: https://lore.kernel.org/r/1634805927-1113-1-git-send-email-loic.poulain@linaro.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-10-21  nl80211: Add LC placeholder band definition to nl80211_band  (Srinivasan Raju)
Define the LC band, which is a draft under IEEE 802.11bb. The current NL80211_BAND_LC is a placeholder band and will be further defined as IEEE 802.11bb progresses. Signed-off-by: Srinivasan Raju <srini.raju@purelifi.com> Link: https://lore.kernel.org/r/20211018100143.7565-2-srini.raju@purelifi.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-10-21  mac80211: split beacon retrieval functions  (Aloka Dixit)
Split __ieee80211_beacon_get() into a separate function for AP mode ieee80211_beacon_get_ap(). Also, move the code common to all modes (AP, adhoc and mesh) to a separate function ieee80211_beacon_get_finish(). Signed-off-by: Aloka Dixit <alokad@codeaurora.org> Link: https://lore.kernel.org/r/20211006040938.9531-2-alokad@codeaurora.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-10-21  cfg80211: separate get channel number from ies  (Wen Gong)
Getting the channel number from IEs is common logic, so separate it into a new function, which could also be used by lower-level drivers. Signed-off-by: Wen Gong <wgong@codeaurora.org> Link: https://lore.kernel.org/r/20210930081533.4898-1-wgong@codeaurora.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-10-21  mac80211: use ieee80211_bss_get_elem() in most places  (Johannes Berg)
There are a number of uses of ieee80211_bss_get_ie(), replace most of them with ieee80211_bss_get_elem(). Link: https://lore.kernel.org/r/20210930131130.9a413f12a151.I0699ba7e48c9d88dbbfa3107cf4d34a8345d02a0@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-10-21  cfg80211: scan: use element finding functions in easy cases  (Johannes Berg)
There are a few easy cases where we only check for NULL or have just simple use of the result, this can be done with the element finding functions instead. Link: https://lore.kernel.org/r/20210930131130.f27c8a7ec264.Iadb03c4307e9216e080ce513e8ad4048cd020b25@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-10-21  nl80211: use element finding functions  (Johannes Berg)
The element finding functions are safer, so use them instead of the "find_ie" functions. Link: https://lore.kernel.org/r/20210930131130.b838f139cc8e.I2b641262d3fc6e0d498719bf343fdc1c0833b845@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-10-21  mac80211: fils: use cfg80211_find_ext_elem()  (Johannes Berg)
Replace the use of cfg80211_find_ext_ie() with the more structured cfg80211_find_ext_elem(). Link: https://lore.kernel.org/r/20210930131130.17ecf37f0605.I853c2f9c2117a713deca9b8deb3552796d98ffac@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-10-21  mac80211: fix memory leaks with element parsing  (Johannes Berg)
My previous commit 5d24828d05f3 ("mac80211: always allocate struct ieee802_11_elems") had a few bugs and leaked the newly allocated struct in a few error cases; fix that. Fixes: 5d24828d05f3 ("mac80211: always allocate struct ieee802_11_elems") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20211001211108.9839928e42e0.Ib81ca187d3d3af7ed1bfeac2e00d08a4637c8025@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-10-21  mac80211: use eth_hw_addr_set()  (Jakub Kicinski)
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Convert mac80211 from memcpy(..., ETH_ALEN) to eth_hw_addr_set():

@@
expression dev, np;
@@
- memcpy(dev->dev_addr, np, ETH_ALEN)
+ eth_hw_addr_set(dev, np)

Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20211019162816.1384077-1-kuba@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-10-21  mac80211: debugfs: calculate free buffer size correctly  (Mordechay Goodstein)
In the breaking patch, the buf memory moved from the stack to the heap, so sizeof(buf) changed from the size of the actual buffer to the size of the pointer to the heap. Fix this by keeping a separate variable holding the allocated size. Fixes: 01f84f0ed3b4 ("mac80211: reduce stack usage in debugfs") Signed-off-by: Mordechay Goodstein <mordechay.goodstein@intel.com> Link: https://lore.kernel.org/r/20211021163035.b9ae48c06e27.I6a6ed197110eae28cf4f6e38ce36828a7c136337@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2021-10-21  net/core: Remove unused assignment operations and variable  (luo penghao)
Although if_info_size is assigned, it is never used, so the assignment and the variable should be deleted. The clang_analyzer complains as follows: net/core/rtnetlink.c:3806: warning: Although the value stored to 'if_info_size' is used in the enclosing expression, the value is never actually read from 'if_info_size'. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: luo penghao <luo.penghao@zte.com.cn> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-21  net: stats: Read the statistics in ___gnet_stats_copy_basic() instead of adding.  (Sebastian Andrzej Siewior)
Since the rework, the statistics code always adds up the byte and packet value(s). On 32bit architectures a seqcount_t is used in gnet_stats_basic_sync to ensure that the 64bit values are not modified during the read since two 32bit loads are required. The usage of a seqcount_t requires a lock to ensure that only one writer is active at a time. This lock leads to disabled preemption during the update. The lack of disabling preemption is now creating a warning as reported by Naresh since the query done by gnet_stats_copy_basic() is in preemptible context. For ___gnet_stats_copy_basic() there is no need to disable preemption since the update is performed on stack and can't be modified by another writer. Instead of disabling preemption, to avoid the warning, simply create a read function to just read the values and return as u64. Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Fixes: 67c9e6270f301 ("net: sched: Protect Qdisc::bstats with u64_stats") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
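A hedged sketch of the kind of read helper described above: fetch a consistent snapshot of one gnet_stats_basic_sync instance and return plain u64 values, without adding into an accumulator. The helper name is illustrative:

    /* Read one bstats instance; the fetch/retry loop only matters on
     * 32-bit SMP, where the seqcount inside the u64_stats sync point
     * guards the two 32-bit halves of each counter.
     */
    static void gnet_stats_read_basic_one(const struct gnet_stats_basic_sync *b,
                                          u64 *bytes, u64 *packets)
    {
        unsigned int start;

        do {
            start = u64_stats_fetch_begin(&b->syncp);
            *bytes = u64_stats_read(&b->bytes);
            *packets = u64_stats_read(&b->packets);
        } while (u64_stats_fetch_retry(&b->syncp, start));
    }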
2021-10-21  net: dsa: tag_8021q: make dsa_8021q_{rx,tx}_vid take dp as argument  (Vladimir Oltean)
Pass a single argument to dsa_8021q_rx_vid and dsa_8021q_tx_vid that contains the necessary information from the two arguments that are currently provided: the switch and the port number. Also rename those functions so that they have a dsa_port_* prefix, since they operate on a struct dsa_port *. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-21  net: dsa: tag_sja1105: do not open-code dsa_switch_for_each_port  (Vladimir Oltean)
Find the remaining iterators over dst->ports that only filter for the ports belonging to a certain switch, and replace those with the dsa_switch_for_each_port helper that we have now. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-21  net: dsa: convert cross-chip notifiers to iterate using dp  (Vladimir Oltean)
The majority of cross-chip switch notifiers need to filter in some way over the type of ports: some install VLANs etc on all cascade ports. The difference is that the matching function, which filters by port type, is separate from the function where the iteration happens. So this patch needs to refactor the matching functions' prototypes as well, to take the dp as argument. In a future patch/series, I might convert dsa_towards_port to return a struct dsa_port *dp too, but at the moment it is a bit entangled with dsa_routing_port which is also used by mv88e6xxx and they both return an int port. So keep dsa_towards_port the way it is and convert it into a dp using dsa_to_port. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-21  net: dsa: remove gratuitous use of dsa_is_{user,dsa,cpu}_port  (Vladimir Oltean)
Find the occurrences of dsa_is_{user,dsa,cpu}_port where a struct dsa_port *dp was already available in the function scope, and replace them with the dsa_port_is_{user,dsa,cpu} equivalent function which uses that dp directly and does not perform another hidden dsa_to_port(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-21  net: dsa: do not open-code dsa_switch_for_each_port  (Vladimir Oltean)
Find the remaining iterators over dst->ports that only filter for the ports belonging to a certain switch, and replace those with the dsa_switch_for_each_port helper that we have now. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-21  net: dsa: remove the "dsa_to_port in a loop" antipattern from the core  (Vladimir Oltean)
Ever since Vivien's conversion of the ds->ports array into a dst->ports list, and the introduction of dsa_to_port, iterations through the ports of a switch became quadratic whenever dsa_to_port was needed. dsa_to_port can either be called directly, or indirectly through the dsa_is_{user,cpu,dsa,unused}_port helpers. Use the newly introduced dsa_switch_for_each_port() iteration macro that works with the iterator variable being a struct dsa_port *dp directly, and not an int i. It is expensive to go from i to dp, but cheap to go from dp to i. This macro iterates through the entire ds->dst->ports list and filters by the ports belonging just to the switch provided as argument. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
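For illustration, the before/after shape of the iteration (do_something() is a hypothetical stand-in for whatever the loop body does):

    struct dsa_port *dp;
    int i;

    /* Before: index-based loop, with a hidden list walk in every
     * dsa_is_user_port()/dsa_to_port() call.
     */
    for (i = 0; i < ds->num_ports; i++) {
        if (!dsa_is_user_port(ds, i))
            continue;
        do_something(dsa_to_port(ds, i));
    }

    /* After: iterate with the struct dsa_port *dp directly in hand. */
    dsa_switch_for_each_port(dp, ds) {
        if (!dsa_port_is_user(dp))
            continue;
        do_something(dp);
    }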
2021-10-21  Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf  (David S. Miller)
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter fixes for net:

1) Crash due to missing initialization of timer data in xt_IDLETIMER, from Juhee Kang.
2) NF_CONNTRACK_SECMARK should be bool in Kconfig, from Vegard Nossum.
3) Skip netdev events on netns removal, from Florian Westphal.
4) Add testcase to show port shadowing via UDP, also from Florian.
5) Remove pr_debug() code in ip6t_rt, this fixes a crash due to unsafe access to non-linear skbuff, from Xin Long.
6) Make net/ipv4/vs/debug_level read-only from non-init netns, from Antoine Tenart.
7) Remove bogus invocation to bash in selftests/netfilter/nft_flowtable.sh also from Florian.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20  fq_codel: generalise ce_threshold marking for subset of traffic  (Toke Høiland-Jørgensen)
Commit e72aeb9ee0e3 ("fq_codel: implement L4S style ce_threshold_ect1 marking") expanded the ce_threshold feature of FQ-CoDel so it can be applied to a subset of the traffic, using the ECT(1) bit of the ECN field as the classifier. However, hard-coding ECT(1) as the only classifier for this feature seems limiting, so let's expand it to be more general.

To this end, change the parameter from a ce_threshold_ect1 boolean, to a one-byte selector/mask pair (ce_threshold_{selector,mask}) which is applied to the whole diffserv/ECN field in the IP header. This makes it possible to classify packets by any value in either the ECN field or the diffserv field. In particular, setting a selector of INET_ECN_ECT_1 and a mask of INET_ECN_MASK corresponds to the functionality before this patch, and a mask of ~INET_ECN_MASK allows using the selector as a straight-forward match against a diffserv code point:

# apply ce_threshold to ECT(1) traffic
tc qdisc replace dev eth0 root fq_codel ce_threshold 1ms ce_threshold_selector 0x1/0x3
# apply ce_threshold to ECN-capable traffic marked as diffserv AF22
tc qdisc replace dev eth0 root fq_codel ce_threshold 1ms ce_threshold_selector 0x50/0xfc

Regardless of the selector chosen, the normal rules for ECN-marking of packets still apply, i.e., the flow must still declare itself ECN-capable by setting one of the bits in the ECN field to get marked at all.

v2:
- Add tc usage examples to patch description

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20211019174709.69081-1-toke@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
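A hedged sketch of the classification described above, applied to the whole dsfield byte; the function and the way the selector/mask are passed in are illustrative, not the actual fq_codel code:

    /* Match the full diffserv/ECN byte against selector under mask. */
    static bool ce_threshold_match(const struct sk_buff *skb,
                                   u8 selector, u8 mask)
    {
        u8 dsfield;

        switch (skb_protocol(skb, true)) {
        case htons(ETH_P_IP):
            dsfield = ipv4_get_dsfield(ip_hdr(skb));
            break;
        case htons(ETH_P_IPV6):
            dsfield = ipv6_get_dsfield(ipv6_hdr(skb));
            break;
        default:
            return false;
        }

        return (dsfield & mask) == selector;
    }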
2021-10-20  net-core: use netdev_* calls for kernel messages  (Jesse Brandeburg)
While loading a driver and changing the number of queues, I noticed this message in the kernel log: "[253489.070080] Number of in use tx queues changed invalidating tc mappings. Priority traffic classification disabled!" But I had no idea what interface was being talked about because this message used pr_warn(). After investigating, it appears we can use the already-defined netdev_* helpers, which create predictably formatted messages and already handle <unknown netdev> cases, in more of the messages in dev.c. After this change, this message (and others) will look like this: "[ 170.181093] ice 0000:3b:00.0 ens785f0: Number of in use tx queues changed invalidating tc mappings. Priority traffic classification disabled!" One goal here was not to change the messages significantly from the original format so as not to break users' expectations, so I just changed messages that used pr_* and generally started with %s == dev->name. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
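Roughly, the conversion looks like this (message text abbreviated; pr_warn() and netdev_warn() are existing kernel logging APIs):

    /* Before: the device is identified only by a hand-rolled "%s". */
    pr_warn("%s: Number of in use tx queues changed invalidating tc mappings\n",
            dev->name);

    /* After: netdev_warn() prefixes driver, bus address and ifname,
     * and copes with an unregistered/unnamed netdev.
     */
    netdev_warn(dev,
                "Number of in use tx queues changed invalidating tc mappings\n");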
2021-10-20  batman-adv: use eth_hw_addr_set() instead of ether_addr_copy()  (Jakub Kicinski)
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Convert batman from ether_addr_copy() to eth_hw_addr_set():

@@
expression dev, np;
@@
- ether_addr_copy(dev->dev_addr, np)
+ eth_hw_addr_set(dev, np)

Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20  mac802154: use dev_addr_set() - manual  (Jakub Kicinski)
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20  mac802154: use dev_addr_set()  (Jakub Kicinski)
Commit 406f42fa0d3c ("net-next: When a bond have a massive amount of VLANs...") introduced a rbtree for faster Ethernet address look up. To maintain netdev->dev_addr in this tree we need to make all the writes to it go through appropriate helpers. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-20  batman-adv: prepare for const netdev->dev_addr  (Jakub Kicinski)
netdev->dev_addr will be constant soon, make sure the qualifier is propagated thru batman-adv. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-19  net: dsa: Fix an error handling path in 'dsa_switch_parse_ports_of()'  (Christophe JAILLET)
If we return before the end of the 'for_each_child_of_node()' iterator, the reference taken on 'port' must be released. Add the missing 'of_node_put()' calls. Fixes: 83c0afaec7b7 ("net: dsa: Add new binding implementation") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Link: https://lore.kernel.org/r/15d5310d1d55ad51c1af80775865306d92432e03.1634587046.git.christophe.jaillet@wanadoo.fr Signed-off-by: Jakub Kicinski <kuba@kernel.org>
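A hedged sketch of the corrected pattern (the function name and the parse step inside the loop are illustrative; the point is the of_node_put() on every early exit from the iterator):

    static int parse_ports_of_sketch(struct dsa_switch *ds,
                                     struct device_node *ports)
    {
        struct device_node *port;
        u32 reg;
        int err = 0;

        for_each_child_of_node(ports, port) {
            err = of_property_read_u32(port, "reg", &reg);
            if (err) {
                of_node_put(port);      /* drop the iterator's reference */
                break;
            }

            if (reg >= ds->num_ports) {
                of_node_put(port);      /* same on this early exit */
                err = -EINVAL;
                break;
            }
        }

        return err;
    }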
2021-10-19  devlink: Remove extra device_lock assert checks  (Leon Romanovsky)
PCI core code in pci_call_probe() has a path that doesn't hold the device_lock. It happens because ->probe() is called through the workqueue mechanism.

349 static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev,
350                           const struct pci_device_id *id)
351 {
352 ....
377         if (cpu < nr_cpu_ids)
378                 error = work_on_cpu(cpu, local_pci_probe, &ddi);

Luckily enough, the core still ensures that only a single flow is executed, so it is safe to remove the assert checks, which were anyway added for annotation purposes.

Fixes: b88f7b1203bf ("devlink: Annotate devlink API calls") Reported-by: Amit Cohen <amcohen@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-19  net: sched: Allow statistics reads from softirq.  (Sebastian Andrzej Siewior)
Eric reported that the rate estimator reads statistics from softirq context, which in turn triggers a warning introduced in the statistics rework. The warning is too cautious. The updates happen in softirq context, so reads from softirq are fine since the writes can not be preempted. The updates/writes happen during qdisc_run(), which ensures one writer and the softirq context. The remaining bad context for reading statistics is hard-IRQ, because it may preempt a writer. Fixes: 29cbcd8582837 ("net: sched: Remove Qdisc::running sequence counter") Reported-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-19  net: sch_tbf: Add a graft command  (Petr Machata)
When another qdisc is linked to the TBF, the latter should issue an event to give drivers a chance to react to the grafting. In other qdiscs, this event is called GRAFT, so follow suit with TBF as well. Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-19  can: isotp: isotp_sendmsg(): fix return error on FC timeout on TX path  (Marc Kleine-Budde)
When a large chunk of data is sent and the receiver does not send a Flow Control frame back in time, sendmsg() does not return an error code but the number of bytes sent, corresponding to the size of the packet. If a timeout occurs, isotp_tx_timer_handler() is fired, sets sk->sk_err and calls the sk->sk_error_report() function. It was wrongly expected that the error would be propagated to user space in every case. For isotp_sendmsg() blocking on wait_event_interruptible() this is not the case. This patch fixes the problem by checking if sk->sk_err is set and returning the error to user space. Fixes: e057dd3fc20f ("can: add ISO 15765-2:2016 transport protocol") Link: https://github.com/hartkopp/can-isotp/issues/42 Link: https://github.com/hartkopp/can-isotp/pull/43 Link: https://lore.kernel.org/all/20210507091839.1366379-1-mkl@pengutronix.de Cc: stable@vger.kernel.org Reported-by: Sottas Guillaume (LMB) <Guillaume.Sottas@liebherr.com> Tested-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
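A hedged sketch of the fix: after the blocking wait in the TX path, a pending socket error (set by the flow-control timeout handler) is propagated instead of the byte count. The wait condition and field names are abbreviated assumptions about the isotp socket, not the exact patch:

    /* Inside isotp_sendmsg(), once the previous transfer has finished
     * waiting (sketch):
     */
    wait_event_interruptible(so->wait, so->tx.state == ISOTP_IDLE);

    /* If the FC timeout handler flagged an error, return it instead of
     * the number of bytes queued.
     */
    if (sk->sk_err)
        return -sk->sk_err;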
2021-10-18  Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next  (David S. Miller)
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS for net-next:

1) Add new run_estimation toggle to IPVS to stop the estimation_timer logic, from Dust Li.
2) Relax superfluous dynset check on NFT_SET_TIMEOUT.
3) Add egress hook, from Lukas Wunner.
4) Nowadays, almost all hook functions in x_table land just call the hook evaluation loop. Remove remaining hook wrappers from iptables and IPVS. From Florian Westphal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18  net: dsa: tag_rtl8_4: add realtek 8 byte protocol 4 tag  (Alvin Šipraga)
This commit implements a basic version of the 8 byte tag protocol used in the Realtek RTL8365MB-VC unmanaged switch, which carries with it a protocol version of 0x04. The implementation itself only handles the parsing of the EtherType value and Realtek protocol version, together with the source or destination port fields. The rest is left unimplemented for now. The tag format is described in a confidential document provided to my company by Realtek Semiconductor Corp. Permission has been granted by the vendor to publish this driver based on that material, together with an extract from the document describing the tag format and its fields. It is hoped that this will help future implementors who do not have access to the material but who wish to extend the functionality of drivers for chips which use this protocol. In addition, two possible values of the REASON field are specified, based on experiments on my end. Realtek does not specify what value this field can take. Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18  net: dsa: move NET_DSA_TAG_RTL4_A to right place in Kconfig/Makefile  (Alvin Šipraga)
Move things around a little so that this tag driver is alphabetically ordered. The Kconfig file is sorted based on the tristate text. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18  net: dsa: allow reporting of standard ethtool stats for slave devices  (Alvin Šipraga)
Jakub pointed out that we have a new ethtool API for reporting device statistics in a standardized way, via .get_eth_{phy,mac,ctrl}_stats. Add a small amount of plumbing to allow DSA drivers to take advantage of this when exposing statistics. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk> Signed-off-by: David S. Miller <davem@davemloft.net>
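A hedged sketch of the plumbing: the DSA slave's ethtool callback forwards to an optional new dsa_switch_ops hook (shown for the MAC group; the phy/ctrl groups are analogous). The ethtool structure names are the standard API; the exact DSA hook signature is an assumption here:

    static void dsa_slave_get_eth_mac_stats(struct net_device *dev,
                                            struct ethtool_eth_mac_stats *mac_stats)
    {
        struct dsa_port *dp = dsa_slave_to_port(dev);
        struct dsa_switch *ds = dp->ds;

        /* Hook is optional: drivers that don't implement it keep the
         * default "not supported" ethtool behaviour.
         */
        if (ds->ops->get_eth_mac_stats)
            ds->ops->get_eth_mac_stats(ds, dp->index, mac_stats);
    }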
2021-10-18  net/sched: act_ct: Fix byte count on fragmented packets  (Paul Blakey)
The byte length of first fragment packets (frag offset = 0) is zeroed when they are stolen by ip_defrag(). And since act_ct updates the stats only afterwards (at the end of execute), bytes aren't correctly accounted for such packets. To fix this, move the stats update to the start of the action execute. Fixes: b57dc7c13ea9 ("net/sched: Introduce action ct") Signed-off-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18  net: sched: Remove Qdisc::running sequence counter  (Ahmed S. Darwish)
The Qdisc::running sequence counter has two uses:

1. Reliably reading qdisc's tc statistics while the qdisc is running (a seqcount read/retry loop at gnet_stats_add_basic()).
2. As a flag, indicating whether the qdisc in question is running (without any retry loops).

For the first usage, the Qdisc::running sequence counter write section, qdisc_run_begin() => qdisc_run_end(), covers a much wider area than what is actually needed: the raw qdisc's bstats update. A u64_stats sync point was thus introduced (in previous commits) inside the bstats structure itself. A local u64_stats write section is then started and stopped for the bstats updates. Use that u64_stats sync point mechanism for the bstats read/retry loop at gnet_stats_add_basic().

For the second qdisc->running usage, a __QDISC_STATE_RUNNING bit flag, accessed with atomic bitops, is sufficient. Using a bit flag instead of a sequence counter at qdisc_run_begin/end() and qdisc_is_running() leads to the SMP barriers implicitly added through raw_read_seqcount() and write_seqcount_begin/end() getting removed. All call sites have been surveyed though, and no required ordering was identified. Now that the qdisc->running sequence counter is no longer used, remove it.

Note, using u64_stats implies no sequence counter protection for 64-bit architectures. This can lead to the qdisc tc statistics "packets" vs. "bytes" values getting out of sync on rare occasions. The individual values will still be valid.

Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
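A simplified sketch of the flag-based scheme replacing the second (run-state) use of the sequence counter; the flag name follows the text above, while the field it lives in and the real qdisc_run_begin()/end() NOLOCK handling are deliberately glossed over:

    /* Simplified: the real helpers also handle TCQ_F_NOLOCK qdiscs. */
    static inline bool qdisc_run_begin(struct Qdisc *qdisc)
    {
        /* Atomic test-and-set replaces the seqcount write section. */
        return !test_and_set_bit(__QDISC_STATE_RUNNING, &qdisc->state);
    }

    static inline void qdisc_run_end(struct Qdisc *qdisc)
    {
        clear_bit(__QDISC_STATE_RUNNING, &qdisc->state);
    }

    static inline bool qdisc_is_running(struct Qdisc *qdisc)
    {
        return test_bit(__QDISC_STATE_RUNNING, &qdisc->state);
    }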
2021-10-18  net: sched: Merge Qdisc::bstats and Qdisc::cpu_bstats data types  (Ahmed S. Darwish)
The only factor differentiating per-CPU bstats data type (struct gnet_stats_basic_cpu) from the packed non-per-CPU one (struct gnet_stats_basic_packed) was a u64_stats sync point inside the former. The two data types are now equivalent: earlier commits added a u64_stats sync point to the latter. Combine both data types into "struct gnet_stats_basic_sync". This eliminates redundancy and simplifies the bstats read/write APIs. Use u64_stats_t for bstats "packets" and "bytes" data types. On 64-bit architectures, u64_stats sync points do not use sequence counter protection. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
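The combined type, roughly as described above (a sketch; the kernel-doc and alignment annotations of the real definition are omitted):

    /* One bstats instance, usable both per-CPU and globally. */
    struct gnet_stats_basic_sync {
        u64_stats_t bytes;
        u64_stats_t packets;
        struct u64_stats_sync syncp;    /* no-op on 64-bit, seqcount on 32-bit SMP */
    };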
2021-10-18  net: sched: Use _bstats_update/set() instead of raw writes  (Ahmed S. Darwish)
The Qdisc::running sequence counter, used to protect Qdisc::bstats reads from parallel writes, is in the process of being removed. Qdisc::bstats read/writes will synchronize using an internal u64_stats sync point instead. Modify all bstats writes to use _bstats_update(). This ensures that the internal u64_stats sync point is always acquired and released as appropriate. Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18  net: sched: Protect Qdisc::bstats with u64_stats  (Ahmed S. Darwish)
The not-per-CPU variant of qdisc tc (traffic control) statistics, Qdisc::gnet_stats_basic_packed bstats, is protected with Qdisc::running sequence counter. This sequence counter is used for reliably protecting bstats reads from parallel writes. Meanwhile, the seqcount's write section covers a much wider area than bstats update: qdisc_run_begin() => qdisc_run_end(). That read/write section asymmetry can lead to needless retries of the read section. To prepare for removing the Qdisc::running sequence counter altogether, introduce a u64_stats sync point inside bstats instead. Modify _bstats_update() to start/end the bstats u64_stats write section. For bisectability, and finer commit granularity, the bstats read section is still protected with a Qdisc::running read/retry loop and qdisc_run_begin/end() still starts/ends that seqcount write section. Once all call sites are modified to use _bstats_update(), the Qdisc::running seqcount will be removed and bstats read/retry loop will be modified to utilize the internal u64_stats sync point. Note, using u64_stats implies no sequence counter protection for 64-bit architectures. This can lead to the statistics "packets" vs. "bytes" values getting out of sync on rare occasions. The individual values will still be valid. [bigeasy: Minor commit message edits, init all gnet_stats_basic_packed.] Signed-off-by: Ahmed S. Darwish <a.darwish@linutronix.de> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
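For illustration, a minimal sketch of what _bstats_update() looks like with the internal sync point; at this point in the series the structure is still gnet_stats_basic_packed, with a syncp member newly added:

    static inline void _bstats_update(struct gnet_stats_basic_packed *bstats,
                                      __u64 bytes, __u32 packets)
    {
        /* Write section is now local to the bstats update itself,
         * instead of spanning the whole qdisc_run_begin/end() window.
         */
        u64_stats_update_begin(&bstats->syncp);
        bstats->bytes += bytes;
        bstats->packets += packets;
        u64_stats_update_end(&bstats->syncp);
    }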
2021-10-18  gen_stats: Move remaining users to gnet_stats_add_queue().  (Sebastian Andrzej Siewior)
The gnet_stats_queue::qlen member is only used in the SMP case. qdisc_qstats_qlen_backlog() needs to add qdisc_qlen() to qstats.qlen to have the same value as that provided by qdisc_qlen_sum(). gnet_stats_copy_queue() needs to overwrite the resulting qstats.qlen field with the caller-submitted qlen value, since it might differ from the summed-up value. Let both functions use gnet_stats_add_queue() and remove the unused __gnet_stats_copy_queue(). Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-10-18  mq, mqprio: Use gnet_stats_add_queue().  (Sebastian Andrzej Siewior)
gnet_stats_add_basic() and gnet_stats_add_queue() add up the statistics so they can be used directly for both the per-CPU and the global case. gnet_stats_add_queue() copies either the Qdisc's per-CPU gnet_stats_queue::qlen or the global member. The global gnet_stats_queue::qlen isn't touched in the per-CPU case, so there is no need to consider it in the global case. In the per-CPU case, the sum of the global gnet_stats_queue::qlen and the per-CPU gnet_stats_queue::qlen was assigned to sch->q.qlen and sch->qstats.qlen. Now both fields are copied individually. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>