summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)Author
2021-12-16net: phylink: add pcs_validate() methodRussell King (Oracle)
Add a hook for PCS to validate the link parameters. This avoids MAC drivers having to have knowledge of their PCS in their validate() method, thereby allowing several MAC drivers to be simplfied. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16net: phylink: add mac_select_pcs() method to phylink_mac_opsRussell King (Oracle)
mac_select_pcs() allows us to have an explicit point to query which PCS the MAC wishes to use for a particular PHY interface mode, thereby allowing us to add support to validate the link settings with the PCS. Phylink will also use this to select the PCS to be used during a major configuration event without the MAC driver needing to call phylink_set_pcs(). Note that if mac_select_pcs() is present, the supported_interfaces bitmap must be filled in; this avoids mac_select_pcs() being called with PHY_INTERFACE_MODE_NA when we want to get support for all interface types. Phylink will return an error in phylink_create() unless this condition is satisfied. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16Merge branch 'mlx5-next' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Saeed Mahameed says: ==================== mlx5-next branch 2021-12-15 Hi Dave, Jakub, Jason This pulls mlx5-next branch into net-next and rdma branches. All patches already reviewed on both rdma and netdev mailing lists. Please pull and let me know if there's any problem. 1) Add multiple FDB steering priorities [1] 2) Introduce HW bits needed to configure MAC list size of VF/SF. Required for ("net/mlx5: Memory optimizations") upcoming series [2]. [1] https://lore.kernel.org/netdev/20211201193621.9129-1-saeed@kernel.org/ [2] https://lore.kernel.org/lkml/20211208141722.13646-1-shayd@nvidia.com/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-15net/mlx5: Introduce log_max_current_uc_list_wr_supported bitShay Drory
Downstream patch will use this bit in order to know whether the device supports changing of max_uc_list. Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-15net: add net device refcount tracker to struct packet_typeEric Dumazet
Most notable changes are in af_packet, tipc ones are trivial. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jon Maloy <jmaloy@redhat.com> Cc: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14net: dev_replace_track() cleanupEric Dumazet
Use existing helpers (netdev_tracker_free() and netdev_tracker_alloc()) to remove ifdefery. Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20211214151515.312535-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-14net: dsa: sja1105: fix broken connection with the sja1110 taggerVladimir Oltean
The driver was incorrectly converted assuming that "sja1105" is the only tagger supported by this driver. This results in SJA1110 switches failing to probe: sja1105 spi1.0: Unable to connect to tag protocol "sja1110": -EPROTONOSUPPORT sja1105: probe of spi1.2 failed with error -93 Add DSA_TAG_PROTO_SJA1110 to the list of supported taggers by the sja1105 driver. The sja1105_tagger_data structure format is common for the two tagging protocols. Fixes: c79e84866d2a ("net: dsa: tag_sja1105: convert to tagger-owned data") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-13net/mlx5: Separate FDB namespaceMaor Gottlieb
This patch doesn't add an additional namespaces, but just separates the naming to be used by each FDB user, bypass and kernel. Downstream patches will actually split this up and allow to have more than single priority for the bypass users. Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Reviewed-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Acked-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-12-13u64_stats: Disable preemption on 32bit UP+SMP PREEMPT_RT during updates.Sebastian Andrzej Siewior
On PREEMPT_RT the seqcount_t for synchronisation is required on 32bit architectures even on UP because the softirq (and the threaded IRQ handler) can be preempted. With the seqcount_t for synchronisation, a reader with higher priority can preempt the writer and then spin endlessly in read_seqcount_begin() while the writer can't make progress. To avoid such a lock up on PREEMPT_RT the writer must disable preemption during the update. There is no need to disable interrupts because no writer is using this API in hard-IRQ context on PREEMPT_RT. Disable preemption on 32bit-RT within the u64_stats write section. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12net: dsa: tag_sja1105: split sja1105_tagger_data into private and public ↵Vladimir Oltean
sections The sja1105 driver messes with the tagging protocol's state when PTP RX timestamping is enabled/disabled. This is fundamentally necessary because the tagger needs to know what to do when it receives a PTP packet. If RX timestamping is enabled, then a metadata follow-up frame is expected, and this holds the (partial) timestamp. So the tagger plays hide-and-seek with the network stack until it also gets the metadata frame, and then presents a single packet, the timestamped PTP packet. But when RX timestamping isn't enabled, there is no metadata frame expected, so the hide-and-seek game must be turned off and the packet must be delivered right away to the network stack. Considering this, we create a pseudo isolation by devising two tagger methods callable by the switch: one to get the RX timestamping state, and one to set it. Since we can't export symbols between the tagger and the switch driver, these methods are exposed through function pointers. After this change, the public portion of the sja1105_tagger_data contains only function pointers. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12Revert "net: dsa: move sja1110_process_meta_tstamp inside the tagging ↵Vladimir Oltean
protocol driver" This reverts commit 6d709cadfde68dbd12bef12fcced6222226dcb06. The above change was done to avoid calling symbols exported by the switch driver from the tagging protocol driver. With the tagger-owned storage model, we have a new option on our hands, and that is for the switch driver to provide a data consumer handler in the form of a function pointer inside the ->connect_tag_protocol() method. Having a function pointer avoids the problems of the exported symbols approach. By creating a handler for metadata frames holding TX timestamps on SJA1110, we are able to eliminate an skb queue from the tagger data, and replace it with a simple, and stateless, function pointer. This skb queue is now handled exclusively by sja1105_ptp.c, which makes the code easier to follow, as it used to be before the reverted patch. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12net: dsa: tag_sja1105: convert to tagger-owned dataVladimir Oltean
Currently, struct sja1105_tagger_data is a part of struct sja1105_private, and is used by the sja1105 driver to populate dp->priv. With the movement towards tagger-owned storage, the sja1105 driver should not be the owner of this memory. This change implements the connection between the sja1105 switch driver and its tagging protocol, which means that sja1105_tagger_data no longer stays in dp->priv but in ds->tagger_data, and that the sja1105 driver now only populates the sja1105_port_deferred_xmit callback pointer. The kthread worker is now the responsibility of the tagger. The sja1105 driver also alters the tagger's state some more, especially with regard to the PTP RX timestamping state. This will be fixed up a bit in further changes. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12net: dsa: sja1105: move ts_id from sja1105_tagger_dataVladimir Oltean
The TX timestamp ID is incremented by the SJA1110 PTP timestamping callback (->port_tx_timestamp) for every packet, when cloning it. It isn't used by the tagger at all, even though it sits inside the struct sja1105_tagger_data. Also, serialization to this structure is currently done through tagger_data->meta_lock, which is a cheap hack because the meta_lock isn't used for anything else on SJA1110 (sja1105_rcv_meta_state_machine isn't called). This change moves ts_id from sja1105_tagger_data to sja1105_private and introduces a dedicated spinlock for it, also in sja1105_private. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12net: dsa: sja1105: make dp->priv point directly to sja1105_tagger_dataVladimir Oltean
The design of the sja1105 tagger dp->priv is that each port has a separate struct sja1105_port, and the sp->data pointer points to a common struct sja1105_tagger_data. We have removed all per-port members accessible by the tagger, and now only struct sja1105_tagger_data remains. Make dp->priv point directly to this. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12net: dsa: sja1105: remove hwts_tx_en from tagger dataVladimir Oltean
This tagger property is in fact not used at all by the tagger, only by the switch driver. Therefore it makes sense to be moved to sja1105_private. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12net: dsa: sja1105: bring deferred xmit implementation in line with ocelot-8021qVladimir Oltean
When the ocelot-8021q driver was converted to deferred xmit as part of commit 8d5f7954b7c8 ("net: dsa: felix: break at first CPU port during init and teardown"), the deferred implementation was deliberately made subtly different from what sja1105 has. The implementation differences lied on the following observations: - There might be a race between these two lines in tag_sja1105.c: skb_queue_tail(&sp->xmit_queue, skb_get(skb)); kthread_queue_work(sp->xmit_worker, &sp->xmit_work); and the skb dequeue logic in sja1105_port_deferred_xmit(). For example, the xmit_work might be already queued, however the work item has just finished walking through the skb queue. Because we don't check the return code from kthread_queue_work, we don't do anything if the work item is already queued. However, nobody will take that skb and send it, at least until the next timestampable skb is sent. This creates additional (and avoidable) TX timestamping latency. To close that race, what the ocelot-8021q driver does is it doesn't keep a single work item per port, and a skb timestamping queue, but rather dynamically allocates a work item per packet. - It is also unnecessary to have more than one kthread that does the work. So delete the per-port kthread allocations and replace them with a single kthread which is global to the switch. This change brings the two implementations in line by applying those observations to the sja1105 driver as well. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-12net: dsa: tag_ocelot: convert to tagger-owned dataVladimir Oltean
The felix driver makes very light use of dp->priv, and the tagger is effectively stateless. dp->priv is practically only needed to set up a callback to perform deferred xmit of PTP and STP packets using the ocelot-8021q tagging protocol (the main ocelot tagging protocol makes no use of dp->priv, although this driver sets up dp->priv irrespective of actual tagging protocol in use). struct felix_port (what used to be pointed to by dp->priv) is removed and replaced with a two-sided structure. The public side of this structure, visible to the switch driver, is ocelot_8021q_tagger_data. The private side is ocelot_8021q_tagger_private, and the latter structure physically encapsulates the former. The public half of the tagger data structure can be accessed through a helper of the same name (ocelot_8021q_tagger_data) which also sanity-checks the protocol currently in use by the switch. The public/private split was requested by Andrew Lunn. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-10Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextJakub Kicinski
Andrii Nakryiko says: ==================== bpf-next 2021-12-10 v2 We've added 115 non-merge commits during the last 26 day(s) which contain a total of 182 files changed, 5747 insertions(+), 2564 deletions(-). The main changes are: 1) Various samples fixes, from Alexander Lobakin. 2) BPF CO-RE support in kernel and light skeleton, from Alexei Starovoitov. 3) A batch of new unified APIs for libbpf, logging improvements, version querying, etc. Also a batch of old deprecations for old APIs and various bug fixes, in preparation for libbpf 1.0, from Andrii Nakryiko. 4) BPF documentation reorganization and improvements, from Christoph Hellwig and Dave Tucker. 5) Support for declarative initialization of BPF_MAP_TYPE_PROG_ARRAY in libbpf, from Hengqi Chen. 6) Verifier log fixes, from Hou Tao. 7) Runtime-bounded loops support with bpf_loop() helper, from Joanne Koong. 8) Extend branch record capturing to all platforms that support it, from Kajol Jain. 9) Light skeleton codegen improvements, from Kumar Kartikeya Dwivedi. 10) bpftool doc-generating script improvements, from Quentin Monnet. 11) Two libbpf v0.6 bug fixes, from Shuyi Cheng and Vincent Minet. 12) Deprecation warning fix for perf/bpf_counter, from Song Liu. 13) MAX_TAIL_CALL_CNT unification and MIPS build fix for libbpf, from Tiezhu Yang. 14) BTF_KING_TYPE_TAG follow-up fixes, from Yonghong Song. 15) Selftests fixes and improvements, from Ilya Leoshkevich, Jean-Philippe Brucker, Jiri Olsa, Maxim Mikityanskiy, Tirthendu Sarkar, Yucong Sun, and others. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (115 commits) libbpf: Add "bool skipped" to struct bpf_map libbpf: Fix typo in btf__dedup@LIBBPF_0.0.2 definition bpftool: Switch bpf_object__load_xattr() to bpf_object__load() selftests/bpf: Remove the only use of deprecated bpf_object__load_xattr() selftests/bpf: Add test for libbpf's custom log_buf behavior selftests/bpf: Replace all uses of bpf_load_btf() with bpf_btf_load() libbpf: Deprecate bpf_object__load_xattr() libbpf: Add per-program log buffer setter and getter libbpf: Preserve kernel error code and remove kprobe prog type guessing libbpf: Improve logging around BPF program loading libbpf: Allow passing user log setting through bpf_object_open_opts libbpf: Allow passing preallocated log_buf when loading BTF into kernel libbpf: Add OPTS-based bpf_btf_load() API libbpf: Fix bpf_prog_load() log_buf logic for log_level 0 samples/bpf: Remove unneeded variable bpf: Remove redundant assignment to pointer t selftests/bpf: Fix a compilation warning perf/bpf_counter: Use bpf_map_create instead of bpf_create_map samples: bpf: Fix 'unknown warning group' build warning on Clang samples: bpf: Fix xdp_sample_user.o linking with Clang ... ==================== Link: https://lore.kernel.org/r/20211210234746.2100561-1-andrii@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10net: add netns refcount tracker to struct seq_net_privateEric Dumazet
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-10net: add networking namespace refcount trackerEric Dumazet
We have 100+ syzbot reports about netns being dismantled too soon, still unresolved as of today. We think a missing get_net() or an extra put_net() is the root cause. In order to find the bug(s), and be able to spot future ones, this patch adds CONFIG_NET_NS_REFCNT_TRACKER and new helpers to precisely pair all put_net() with corresponding get_net(). To use these helpers, each data structure owning a refcount should also use a "netns_tracker" to pair the get and put. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
No conflicts. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09skbuff: Extract list pointers to silence compiler warningsKees Cook
Under both -Warray-bounds and the object_size sanitizer, the compiler is upset about accessing prev/next of sk_buff when the object it thinks it is coming from is sk_buff_head. The warning is a false positive due to the compiler taking a conservative approach, opting to warn at casting time rather than access time. However, in support of enabling -Warray-bounds globally (which has found many real bugs), arrange things for sk_buff so that the compiler can unambiguously see that there is no intention to access anything except prev/next. Introduce and cast to a separate struct sk_buff_list, which contains _only_ the first two fields, silencing the warnings: In file included from ./include/net/net_namespace.h:39, from ./include/linux/netdevice.h:37, from net/core/netpoll.c:17: net/core/netpoll.c: In function 'refill_skbs': ./include/linux/skbuff.h:2086:9: warning: array subscript 'struct sk_buff[0]' is partly outside array bounds of 'struct sk_buff_head[1]' [-Warray-bounds] 2086 | __skb_insert(newsk, next->prev, next, list); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ net/core/netpoll.c:49:28: note: while referencing 'skb_pool' 49 | static struct sk_buff_head skb_pool; | ^~~~~~~~ This change results in no executable instruction differences. Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20211207062758.2324338-1-keescook@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09Merge tag 'net-5.16-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, can and netfilter. Current release - regressions: - bpf, sockmap: re-evaluate proto ops when psock is removed from sockmap Current release - new code bugs: - bpf: fix bpf_check_mod_kfunc_call for built-in modules - ice: fixes for TC classifier offloads - vrf: don't run conntrack on vrf with !dflt qdisc Previous releases - regressions: - bpf: fix the off-by-two error in range markings - seg6: fix the iif in the IPv6 socket control block - devlink: fix netns refcount leak in devlink_nl_cmd_reload() - dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's" - dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports Previous releases - always broken: - ethtool: do not perform operations on net devices being unregistered - udp: use datalen to cap max gso segments - ice: fix races in stats collection - fec: only clear interrupt of handling queue in fec_enet_rx_queue() - m_can: pci: fix incorrect reference clock rate - m_can: disable and ignore ELO interrupt - mvpp2: fix XDP rx queues registering Misc: - treewide: add missing includes masked by cgroup -> bpf.h dependency" * tag 'net-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits) net: dsa: mv88e6xxx: allow use of PHYs on CPU and DSA ports net: wwan: iosm: fixes unable to send AT command during mbim tx net: wwan: iosm: fixes net interface nonfunctional after fw flash net: wwan: iosm: fixes unnecessary doorbell send net: dsa: felix: Fix memory leak in felix_setup_mmio_filtering MAINTAINERS: s390/net: remove myself as maintainer net/sched: fq_pie: prevent dismantle issue net: mana: Fix memory leak in mana_hwc_create_wq seg6: fix the iif in the IPv6 socket control block nfp: Fix memory leak in nfp_cpp_area_cache_add() nfc: fix potential NULL pointer deref in nfc_genl_dump_ses_done nfc: fix segfault in nfc_genl_dump_devices_done udp: using datalen to cap max gso segments net: dsa: mv88e6xxx: error handling for serdes_power functions can: kvaser_usb: get CAN clock frequency from device can: kvaser_pciefd: kvaser_pciefd_rx_error_frame(): increase correct stats->{rx,tx}_errors counter net: mvpp2: fix XDP rx queues registering vmxnet3: fix minimum vectors alloc issue net, neigh: clear whole pneigh_entry at alloc time net: dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's" ...
2021-12-09net: phylink: use legacy_pre_march2020Russell King (Oracle)
Use the legacy flag to indicate whether we should operate in legacy mode. This allows us to stop using the presence of a PCS as an indicator to the age of the phylink user, and make PCS presence optional. Legacy mode involves: 1) calling mac_config() whenever the link comes up 2) calling mac_config() whenever the inband advertisement changes, possibly followed by a call to mac_an_restart() 3) making use of mac_an_restart() 4) making use of mac_pcs_get_state() All the above functionality was moved to a seperate "PCS" block of operations in March 2020. Update the documents to indicate that the differences that this flag makes. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09net: phylink: add legacy_pre_march2020 indicatorRussell King (Oracle)
Add a boolean to phylink_config to indicate whether a driver has not been updated for the changes in commit 7cceb599d15d ("net: phylink: avoid mac_config calls"), and thus are reliant on the old behaviour. We were currently keying the phylink behaviour on the presence of a PCS, but this is sub-optimal for modern drivers that may not have a PCS. This commit merely introduces the new flag, but does not add any use, since we need all legacy drivers to set this flag before it can be used. Once these legacy drivers have been updated, we can remove this flag. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-09Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID fixes from Jiri Kosina: - fixes for various drivers which assume that a HID device is on USB transport, but that might not necessarily be the case, as the device can be faked by uhid. (Greg, Benjamin Tissoires) - fix for spurious wakeups on certain Lenovo notebooks (Thomas Weißschuh) - a few other device-specific quirks * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: Ignore battery for Elan touchscreen on Asus UX550VE HID: intel-ish-hid: ipc: only enable IRQ wakeup when requested HID: google: add eel USB id HID: add USB_HID dependancy to hid-prodikeys HID: add USB_HID dependancy to hid-chicony HID: bigbenff: prevent null pointer dereference HID: sony: fix error path in probe HID: add USB_HID dependancy on some USB HID drivers HID: check for valid USB device for many HID drivers HID: wacom: fix problems when device is not a valid USB device HID: add hid_is_usb() function to make it simpler for USB detection HID: quirks: Add quirk for the Microsoft Surface 3 type-cover
2021-12-08net: wwan: make debugfs optionalSergey Ryazanov
Debugfs interface is optional for the regular modem use. Some distros and users will want to disable this feature for security or kernel size reasons. So add a configuration option that allows to completely disable the debugfs interface of the WWAN devices. A primary considered use case for this option was embedded firmwares. For example, in OpenWrt, you can not completely disable debugfs, as a lot of wireless stuff can only be configured and monitored with the debugfs knobs. At the same time, reducing the size of a kernel and modules is an essential task in the world of embedded software. Disabling the WWAN and IOSM debugfs interfaces allows us to save 50K (x86-64 build) of space for module storage. Not much, but already considerable when you only have 16MB of storage. So it is hard to just disable whole debugfs. Users need some fine grained set of options to control which debugfs interface is important and should be available and which is not. The new configuration symbol is enabled by default and is hidden under the EXPERT option. So a regular user would not be bothered by another one configuration question. While an embedded distro maintainer will be able to a little more reduce the final image size. Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Acked-by: M Chetan Kumar <m.chetan.kumar@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08Merge tag 'linux-can-next-for-5.17-20211208' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== can-next 2021-12-08 The first patch is by Vincent Mailhol and replaces the custom CAN units with generic one form linux/units.h. The next 3 patches are by Evgeny Boger and add Allwinner R40 support to the sun4i CAN driver. Andy Shevchenko contributes 4 patches to the hi311x CAN driver, consisting of cleanups and converting the driver to the device property API. * tag 'linux-can-next-for-5.17-20211208' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: can: hi311x: hi3110_can_probe(): convert to use dev_err_probe() can: hi311x: hi3110_can_probe(): make use of device property API can: hi311x: hi3110_can_probe(): try to get crystal clock rate from property can: hi311x: hi3110_can_probe(): use devm_clk_get_optional() to get the input clock ARM: dts: sun8i: r40: add node for CAN controller can: sun4i_can: add support for R40 CAN controller dt-bindings: net: can: add support for Allwinner R40 CAN controller can: bittiming: replace CAN units with the generic ones from linux/units.h ==================== Link: https://lore.kernel.org/r/20211208125055.223141-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski
Daniel Borkmann says: ==================== bpf 2021-12-08 We've added 12 non-merge commits during the last 22 day(s) which contain a total of 29 files changed, 659 insertions(+), 80 deletions(-). The main changes are: 1) Fix an off-by-two error in packet range markings and also add a batch of new tests for coverage of these corner cases, from Maxim Mikityanskiy. 2) Fix a compilation issue on MIPS JIT for R10000 CPUs, from Johan Almbladh. 3) Fix two functional regressions and a build warning related to BTF kfunc for modules, from Kumar Kartikeya Dwivedi. 4) Fix outdated code and docs regarding BPF's migrate_disable() use on non- PREEMPT_RT kernels, from Sebastian Andrzej Siewior. 5) Add missing includes in order to be able to detangle cgroup vs bpf header dependencies, from Jakub Kicinski. 6) Fix regression in BPF sockmap tests caused by missing detachment of progs from sockets when they are removed from the map, from John Fastabend. 7) Fix a missing "no previous prototype" warning in x86 JIT caused by BPF dispatcher, from Björn Töpel. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Add selftests to cover packet access corner cases bpf: Fix the off-by-two error in range markings treewide: Add missing includes masked by cgroup -> bpf dependency tools/resolve_btfids: Skip unresolved symbol warning for empty BTF sets bpf: Fix bpf_check_mod_kfunc_call for built-in modules bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL mips, bpf: Fix reference to non-existing Kconfig symbol bpf: Make sure bpf_disable_instrumentation() is safe vs preemption. Documentation/locking/locktypes: Update migrate_disable() bits. bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap bpf, sockmap: Attach map progs to psock early for feature probes bpf, x86: Fix "no previous prototype" warning ==================== Link: https://lore.kernel.org/r/20211208155125.11826-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: dsa: keep the bridge_dev and bridge_num as part of the same structureVladimir Oltean
The main desire behind this is to provide coherent bridge information to the fast path without locking. For example, right now we set dp->bridge_dev and dp->bridge_num from separate code paths, it is theoretically possible for a packet transmission to read these two port properties consecutively and find a bridge number which does not correspond with the bridge device. Another desire is to start passing more complex bridge information to dsa_switch_ops functions. For example, with FDB isolation, it is expected that drivers will need to be passed the bridge which requested an FDB/MDB entry to be offloaded, and along with that bridge_dev, the associated bridge_num should be passed too, in case the driver might want to implement an isolation scheme based on that number. We already pass the {bridge_dev, bridge_num} pair to the TX forwarding offload switch API, however we'd like to remove that and squash it into the basic bridge join/leave API. So that means we need to pass this pair to the bridge join/leave API. During dsa_port_bridge_leave, first we unset dp->bridge_dev, then we call the driver's .port_bridge_leave with what used to be our dp->bridge_dev, but provided as an argument. When bridge_dev and bridge_num get folded into a single structure, we need to preserve this behavior in dsa_port_bridge_leave: we need a copy of what used to be in dp->bridge. Switch drivers check bridge membership by comparing dp->bridge_dev with the provided bridge_dev, but now, if we provide the struct dsa_bridge as a pointer, they cannot keep comparing dp->bridge to the provided pointer, since this only points to an on-stack copy. To make this obvious and prevent driver writers from forgetting and doing stupid things, in this new API, the struct dsa_bridge is provided as a full structure (not very large, contains an int and a pointer) instead of a pointer. An explicit comparison function needs to be used to determine bridge membership: dsa_port_offloads_bridge(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: dsa: make dp->bridge_num one-basedVladimir Oltean
I have seen too many bugs already due to the fact that we must encode an invalid dp->bridge_num as a negative value, because the natural tendency is to check that invalid value using (!dp->bridge_num). Latest example can be seen in commit 1bec0f05062c ("net: dsa: fix bridge_num not getting cleared after ports leaving the bridge"). Convert the existing users to assume that dp->bridge_num == 0 is the encoding for invalid, and valid bridge numbers start from 1. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08can: bittiming: replace CAN units with the generic ones from linux/units.hVincent Mailhol
In [1], we introduced a set of units in linux/can/bittiming.h. Since then, generic SI prefixes were added to linux/units.h in [2]. Those new prefixes can perfectly replace CAN specific ones. This patch replaces all occurrences of the CAN units with their corresponding prefix (from linux/units) and the unit (as a comment) according to below table. CAN units SI metric prefix (from linux/units) + unit (as a comment) ------------------------------------------------------------------------ CAN_KBPS KILO /* BPS */ CAN_MBPS MEGA /* BPS */ CAM_MHZ MEGA /* Hz */ The definition are then removed from linux/can/bittiming.h [1] commit 1d7750760b70 ("can: bittiming: add CAN_KBPS, CAN_MBPS and CAN_MHZ macros") [2] commit 26471d4a6cf8 ("units: Add SI metric prefix definitions") Link: https://lore.kernel.org/all/20211124014536.782550-1-mailhol.vincent@wanadoo.fr Suggested-by: Jimmy Assarsson <extja@kvaser.com> Suggested-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07net: phy: Remove unnecessary indentation in the comments of phy_deviceYanteng Si
Fix warning as: linux-next/Documentation/networking/kapi:122: ./include/linux/phy.h:543: WARNING: Unexpected indentation. linux-next/Documentation/networking/kapi:122: ./include/linux/phy.h:544: WARNING: Block quote ends without a blank line; unexpected unindent. linux-next/Documentation/networking/kapi:122: ./include/linux/phy.h:546: WARNING: Unexpected indentation. Suggested-by: Akira Yokosawa <akiyks@gmail.com> Signed-off-by: Yanteng Si <siyanteng@loongson.cn> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07Merge tag 'wireless-drivers-next-2021-12-07' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== wireless-drivers-next patches for v5.17 First set of patches for v5.17. The biggest change is the iwlmei driver for Intel's AMT devices. Also now WCN6855 support in ath11k should be usable. Major changes: ath10k * fetch (pre-)calibration data via nvmem subsystem ath11k * enable 802.11 power save mode in station mode for qca6390 and wcn6855 * trace log support * proper board file detection for WCN6855 based on PCI ids * BSS color change support rtw88 * add debugfs file to force lowest basic rate * add quirk to disable PCI ASPM on HP 250 G7 Notebook PC mwifiex * add quirk to disable deep sleep with certain hardware revision in Surface Book 2 devices iwlwifi * add iwlmei driver for co-operating with Intel's Active Management Technology (AMT) devices * tag 'wireless-drivers-next-2021-12-07' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next: (87 commits) iwlwifi: mei: fix linking when tracing is not enabled rtlwifi: rtl8192de: Style clean-ups mwl8k: Use named struct for memcpy() region intersil: Use struct_group() for memcpy() region libertas_tf: Use struct_group() for memcpy() region libertas: Use struct_group() for memcpy() region wlcore: no need to initialise statics to false rsi: Fix out-of-bounds read in rsi_read_pkt() rsi: Fix use-after-free in rsi_rx_done_handler() brcmfmac: Configure keep-alive packet on suspend wilc1000: remove '-Wunused-but-set-variable' warning in chip_wakeup() iwlwifi: mvm: read the rfkill state and feed it to iwlmei iwlwifi: mvm: add vendor commands needed for iwlmei iwlwifi: integrate with iwlmei iwlwifi: mei: add debugfs hooks iwlwifi: mei: add the driver to allow cooperation with CSME mei: bus: add client dma interface mwifiex: Ignore BTCOEX events from the 88W8897 firmware mwifiex: Ensure the version string from the firmware is 0-terminated mwifiex: Add quirk to disable deep sleep with certain hardware revision ... ==================== Link: https://lore.kernel.org/r/20211207144211.A9949C341C1@smtp.kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07net: watchdog: add net device refcount trackerEric Dumazet
Add a netdevice_tracker inside struct net_device, to track the self reference when a device has an active watchdog timer. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07vlan: add net device refcount trackerEric Dumazet
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07net: eql: add net device refcount trackerEric Dumazet
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07tcp: expose __tcp_sock_set_cork and __tcp_sock_set_nodelayMaxim Galaganov
Expose __tcp_sock_set_cork() and __tcp_sock_set_nodelay() for use in MPTCP setsockopt code -- namely for syncing MPTCP socket options with subflows inside sync_socket_options() while already holding the subflow socket lock. Acked-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Maxim Galaganov <max@internet.ru> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06net: fix recent csum changesEric Dumazet
Vladimir reported csum issues after my recent change in skb_postpull_rcsum() Issue here is the following: initial skb->csum is the csum of [part to be pulled][rest of packet] Old code: skb->csum = csum_sub(skb->csum, csum_partial(pull, pull_length, 0)); New code: skb->csum = ~csum_partial(pull, pull_length, ~skb->csum); This is broken if the csum of [pulled part] happens to be equal to skb->csum, because end result of skb->csum is 0 in new code, instead of being 0xffffffff David Laight suggested to use skb->csum = -csum_partial(pull, pull_length, -skb->csum); I based my patches on existing code present in include/net/seg6.h, update_csum_diff4() and update_csum_diff16() which might need a similar fix. I guess that my tests, mostly pulling 40 bytes of IPv6 header were not providing enough entropy to hit this bug. v2: added wsum_negate() to make sparse happy. Fixes: 29c3002644bd ("net: optimize skb_postpull_rcsum()") Fixes: 0bd28476f636 ("gro: optimize skb_gro_postpull_rcsum()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Vladimir Oltean <vladimir.oltean@nxp.com> Suggested-by: David Laight <David.Laight@ACULAB.COM> Cc: David Lebrun <dlebrun@google.com> Tested-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20211204045356.3659278-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06netpoll: add net device refcount tracker to struct netpollEric Dumazet
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06ipmr, ip6mr: add net device refcount tracker to struct vif_deviceEric Dumazet
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06net: linkwatch: add net device refcount trackerEric Dumazet
Add a netdevice_tracker inside struct net_device, to track the self reference when a device is in lweventlist. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06ipv4: add net device refcount tracker to struct in_deviceEric Dumazet
Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06net: dst: add net device refcount tracking to dst_entryEric Dumazet
We want to track all dev_hold()/dev_put() to ease leak hunting. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06net: add net device refcount tracker to struct netdev_queueEric Dumazet
This will help debugging pesky netdev reference leaks. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06net: add net device refcount tracker to struct netdev_rx_queueEric Dumazet
This helps debugging net device refcount leaks. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06net: add net device refcount tracker infrastructureEric Dumazet
net device are refcounted. Over the years we had numerous bugs caused by imbalanced dev_hold() and dev_put() calls. The general idea is to be able to precisely pair each decrement with a corresponding prior increment. Both share a cookie, basically a pointer to private data storing stack traces. This patch adds dev_hold_track() and dev_put_track(). To use these helpers, each data structure owning a refcount should also use a "netdevice_tracker" to pair the hold and put. netdevice_tracker dev_tracker; ... dev_hold_track(dev, &dev_tracker, GFP_ATOMIC); ... dev_put_track(dev, &dev_tracker); Whenever a leak happens, we will get precise stack traces of the point dev_hold_track() happened, at device dismantle phase. We will also get a stack trace if too many dev_put_track() for the same netdevice_tracker are attempted. This is guarded by CONFIG_NET_DEV_REFCNT_TRACKER option. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06lib: add reference counting tracking infrastructureEric Dumazet
It can be hard to track where references are taken and released. In networking, we have annoying issues at device or netns dismantles, and we had various proposals to ease root causing them. This patch adds new infrastructure pairing refcount increases and decreases. This will self document code, because programmers will have to associate increments/decrements. This is controled by CONFIG_REF_TRACKER which can be selected by users of this feature. This adds both cpu and memory costs, and thus should probably be used with care. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-06Merge tag 'regulator-fix-v5.16-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fix from Mark Brown: "Documentation fix for v5.17. A fix for bitrot in the documentation for protection interrupts that crept in as the code was revised during review" * tag 'regulator-fix-v5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: Update protection IRQ helper docs
2021-12-05Merge tag 'sched_urgent_for_v5.16_rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Borislav Petkov: - Properly init uclamp_flags of a runqueue, on first enqueuing - Fix preempt= callback return values - Correct utime/stime resource usage reporting on nohz_full to return the proper times instead of shorter ones * tag 'sched_urgent_for_v5.16_rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/uclamp: Fix rq->uclamp_max not set on first enqueue preempt/dynamic: Fix setup_preempt_mode() return value sched/cputime: Fix getrusage(RUSAGE_THREAD) with nohz_full