summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-05-08net: stmmac: dwmac-ipq806x: account for rgmii-txid/rxid/id phy-modeChristian Marangi
Currently the ipq806x dwmac driver is almost always used attached to the CPU port of a switch and phy-mode was always set to "rgmii" or "sgmii". Some device came up with a special configuration where the PHY is directly attached to the GMAC port and in those case phy-mode needs to be set to "rgmii-id" to make the PHY correctly work and receive packets. Since the driver supports only "rgmii" and "sgmii" mode, when "rgmii-id" (or variants) mode is set, the mode is rejected and probe fails. Add support also for these phy-modes to correctly setup PHYs that requires delay applied to tx/rx. Signed-off-by: Christian Marangi <ansuelsmth@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08net: bridge: switchdev: Improve error message for port_obj_add/del functionsOleksij Rempel
Enhance the error reporting mechanism in the switchdev framework to provide more informative and user-friendly error messages. Following feedback from users struggling to understand the implications of error messages like "failed (err=-28) to add object (id=2)", this update aims to clarify what operation failed and how this might impact the system or network. With this change, error messages now include a description of the failed operation, the specific object involved, and a brief explanation of the potential impact on the system. This approach helps administrators and developers better understand the context and severity of errors, facilitating quicker and more effective troubleshooting. Example of the improved logging: [ 70.516446] ksz-switch spi0.0 uplink: Failed to add Port Multicast Database entry (object id=2) with error: -ENOSPC (-28). [ 70.516446] Failure in updating the port's Multicast Database could lead to multicast forwarding issues. [ 70.516446] Current HW/SW setup lacks sufficient resources. This comprehensive update includes handling for a range of switchdev object IDs, ensuring that most operations within the switchdev framework benefit from clearer error reporting. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08net: phy: marvell-88q2xxx: add support for Rev B1 and B2Gregor Herburger
Different revisions of the Marvell 88q2xxx phy needs different init sequences. Add init sequence for Rev B1 and Rev B2. Rev B2 init sequence skips one register write. Tested-by: Dimitri Fedrau <dima.fedrau@gmail.com> Signed-off-by: Gregor Herburger <gregor.herburger@ew.tq-group.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08appletalk: Improve handling of broadcast packetsVincent Duvert
When a broadcast AppleTalk packet is received, prefer queuing it on the socket whose address matches the address of the interface that received the packet (and is listening on the correct port). Userspace applications that handle such packets will usually send a response on the same socket that received the packet; this fix allows the response to be sent on the correct interface. If a socket matching the interface's address is not found, an arbitrary socket listening on the correct port will be used, if any. This matches the implementation's previous behavior. Fixes atalkd's responses to network information requests when multiple network interfaces are configured to use AppleTalk. Link: https://lore.kernel.org/netdev/20200722113752.1218-2-vincent.ldev@duvert.net/ Link: https://gist.github.com/VinDuv/4db433b6dce39d51a5b7847ee749b2a4 Signed-off-by: Vincent Duvert <vincent.ldev@duvert.net> Signed-off-by: Doug Brown <doug@schmorgal.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08net/ipv4: add tracepoint for icmp_sendPeilin He
Introduce a tracepoint for icmp_send, which can help users to get more detail information conveniently when icmp abnormal events happen. 1. Giving an usecase example: ============================= When an application experiences packet loss due to an unreachable UDP destination port, the kernel will send an exception message through the icmp_send function. By adding a trace point for icmp_send, developers or system administrators can obtain detailed information about the UDP packet loss, including the type, code, source address, destination address, source port, and destination port. This facilitates the trouble-shooting of UDP packet loss issues especially for those network-service applications. 2. Operation Instructions: ========================== Switch to the tracing directory. cd /sys/kernel/tracing Filter for destination port unreachable. echo "type==3 && code==3" > events/icmp/icmp_send/filter Enable trace event. echo 1 > events/icmp/icmp_send/enable 3. Result View: ================ udp_client_erro-11370 [002] ...s.12 124.728002: icmp_send: icmp_send: type=3, code=3. From 127.0.0.1:41895 to 127.0.0.1:6666 ulen=23 skbaddr=00000000589b167a Signed-off-by: Peilin He <he.peilin@zte.com.cn> Signed-off-by: xu xin <xu.xin16@zte.com.cn> Reviewed-by: Yunkai Zhang <zhang.yunkai@zte.com.cn> Cc: Yang Yang <yang.yang29@zte.com.cn> Cc: Liu Chun <liu.chun2@zte.com.cn> Cc: Xuexin Jiang <jiang.xuexin@zte.com.cn> Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08net: bridge: fix corrupted ethernet header on multicast-to-unicastFelix Fietkau
The change from skb_copy to pskb_copy unfortunately changed the data copying to omit the ethernet header, since it was pulled before reaching this point. Fix this by calling __skb_push/pull around pskb_copy. Fixes: 59c878cbcdd8 ("net: bridge: fix multicast-to-unicast with fraglist GSO") Signed-off-by: Felix Fietkau <nbd@nbd.name> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08Merge branch 'ksz-dcb-dscp'David S. Miller
Oleksij Rempel says: ==================== add DCB and DSCP support for KSZ switches This patch series is aimed at improving support for DCB (Data Center Bridging) and DSCP (Differentiated Services Code Point) on KSZ switches. The main goal is to introduce global DSCP and PCP (Priority Code Point) mapping support, addressing the limitation of KSZ switches not having per-port DSCP priority mapping. This involves extending the DSA framework with new callbacks for managing trust settings for global DSCP and PCP maps. Additionally, we introduce IEEE 802.1q helpers for default configurations, benefiting other drivers too. Change logs are in separate patches. Compared to v6 this series includes some new patches for DSCP global mapping support and QoS selftest script for KSZ9477 switches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08selftests: microchip: add test for QoS support on KSZ9477 switch familyOleksij Rempel
Add tests covering following functionality on KSZ9477 switch family: - default port priority - global DSCP to Internal Priority Mapping - apptrust configuration This script was tested on KSZ9893R Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08net: dsa: microchip: add support DSCP priority mappingOleksij Rempel
Microchip KSZ and LAN variants do not have per port DSCP priority configuration. Instead there is a global DSCP mapping table. This patch provides write access to this global DSCP map. In case entry is "deleted", we map corresponding DSCP entry to a best effort prio, which is expected to be the default priority for all untagged traffic. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08net: dsa: add support switches global DSCP priority mappingOleksij Rempel
Some switches like Microchip KSZ variants do not support per port DSCP priority configuration. Instead there is a global DSCP mapping table. To handle it, we will accept set/del request to any of user ports to make global configuration and update dcb app entries for all other ports. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08net: dsa: microchip: let DCB code do PCP and DSCP policy configurationOleksij Rempel
802.1P (PCP) and DiffServ (DSCP) are handled now by DCB code. Let it do all needed initial configuration. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08net: dsa: microchip: init predictable IPV to queue mapping for all non ↵Oleksij Rempel
KSZ8xxx variants Init priority to queue mapping in the way as it shown in IEEE 802.1Q mapping example. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08net: dsa: microchip: enable ETS support for KSZ989X variantsOleksij Rempel
I tested ETS support on KSZ9893, so it should work other KSZ989X variants too, which was till not listed as support. With this change we now officially not support only ksz8 family of chips. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08net: dsa: microchip: dcb: add special handling for KSZ88X3 familyOleksij Rempel
KSZ88X3 switches have different behavior on different ports: - It seems to be not possible to disable VLAN PCP classification on port 2. It means, as soon as mutliqueue support is enabled, frames with VLAN tag will get PCP prios. This behavior do not affect Port 1 - it is possible to disable PCP prios. - DSCP classification is not working on Port 2. Since there are still usable configuration combinations, I added some quirks to make sure user will get appropriate error message if not possible configuration is chosen. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08net: dsa: microchip: add support for different DCB app configurationsOleksij Rempel
Add DCB support to configure app trust sources and default port priority. Following commands can be used for testing: dcb apptrust set dev lan1 order pcp dscp dcb app replace dev lan1 default-prio 3 Since it is not possible to configure DSCP-Prio mapping per port, this patch provide only ability to read switch global dscp-prio mapping and way to enable/disable app trust for DSCP. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08net: dsa: microchip: add multi queue support for KSZ88X3 variantsOleksij Rempel
KSZ88X3 switches support up to 4 queues. Rework ksz8795_set_prio_queue() to support KSZ8795 and KSZ88X3 families of switches. Per default, configure KSZ88X3 to use one queue, since it need special handling due to priority related errata. Errata handling is implemented in a separate patch. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08net: add IEEE 802.1q specific helpersOleksij Rempel
IEEE 802.1q specification provides recommendation and examples which can be used as good default values for different drivers. This patch implements mapping examples documented in IEEE 802.1Q-2022 in Annex I "I.3 Traffic type to traffic class mapping" and IETF DSCP naming and mapping DSCP to Traffic Type inspired by RFC8325. This helpers will be used in followup patches for dsa/microchip DCB implementation. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08net: dsa: microchip: add IPV information supportOleksij Rempel
Most of Microchip KSZ switches use Internal Priority Value associated with every frame. For example, it is possible to map any VLAN PCP or DSCP value to IPV and at the end, map IPV to a queue. Since amount of IPVs is not equal to amount of queues, add this information and make use of it in some functions. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08net: dsa: add support for DCB get/set apptrust configurationOleksij Rempel
Add DCB support to get/set trust configuration for different packet priority information sources. Some switch allow to chose different source of packet priority classification. For example on KSZ switches it is possible to configure VLAN PCP and/or DSCP sources. Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-08timers/migration: Prevent out of bounds access on failureLevi Yun
When tmigr_setup_groups() fails the level 0 group allocation, then the cleanup derefences index -1 of the local stack array. Prevent this by checking the loop condition first. Fixes: 7ee988770326 ("timers: Implement the hierarchical pull model") Signed-off-by: Levi Yun <ppbuk5246@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Link: https://lore.kernel.org/r/20240506041059.86877-1-ppbuk5246@gmail.com
2024-05-08erofs: derive fsid from on-disk UUID for .statfs() if possibleHongzhen Luo
Use the superblock's UUID to generate the fsid when it's non-null. Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Reviewed-by: Jingbo Xu <jefflexu@linux.alibaba.com> Signed-off-by: Hongzhen Luo <hongzhen@linux.alibaba.com> Link: https://lore.kernel.org/r/20240409113022.74720-1-hongzhen@linux.alibaba.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2024-05-08erofs: add a reserved buffer pool for lz4 decompressionChunhai Guo
This adds a special global buffer pool (in the end) for reserved pages. Using a reserved pool for LZ4 decompression significantly reduces the time spent on extra temporary page allocation for the extreme cases in low memory scenarios. The table below shows the reduction in time spent on page allocation for LZ4 decompression when using a reserved pool. The results were obtained from multi-app launch benchmarks on ARM64 Android devices running the 5.15 kernel with an 8-core CPU and 8GB of memory. In the benchmark, we launched 16 frequently-used apps, and the camera app was the last one in each round. The data in the table is the average time of camera app for each round. After using the reserved pool, there was an average improvement of 150ms in the overall launch time of our camera app, which was obtained from the systrace log. +--------------+---------------+--------------+---------+ | | w/o page pool | w/ page pool | diff | +--------------+---------------+--------------+---------+ | Average (ms) | 3434 | 21 | -99.38% | +--------------+---------------+--------------+---------+ Based on the benchmark logs, 64 pages are sufficient for 95% of scenarios. This value can be adjusted with a module parameter `reserved_pages`. The default value is 0. This pool is currently only used for the LZ4 decompressor, but it can be applied to more decompressors if needed. Signed-off-by: Chunhai Guo <guochunhai@vivo.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20240402131523.2703948-1-guochunhai@vivo.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2024-05-08erofs: do not use pagepool in z_erofs_gbuf_growsize()Chunhai Guo
Let's use alloc_pages_bulk_array() for simplicity and get rid of unnecessary pagepool. Signed-off-by: Chunhai Guo <guochunhai@vivo.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20240402092757.2635257-1-guochunhai@vivo.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2024-05-08erofs: rename per-CPU buffers to global buffer pool and make it configurableChunhai Guo
It will cost more time if compressed buffers are allocated on demand for low-latency algorithms (like lz4) so EROFS uses per-CPU buffers to keep compressed data if in-place decompression is unfulfilled. While it is kind of wasteful of memory for a device with hundreds of CPUs, and only a small number of CPUs concurrently decompress most of the time. This patch renames it as 'global buffer pool' and makes it configurable. This allows two or more CPUs to share a common buffer to reduce memory occupation. Suggested-by: Gao Xiang <xiang@kernel.org> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Chunhai Guo <guochunhai@vivo.com> Link: https://lore.kernel.org/r/20240402100036.2673604-1-guochunhai@vivo.com Signed-off-by: Sandeep Dhavale <dhavale@google.com> Link: https://lore.kernel.org/r/20240408215231.3376659-1-dhavale@google.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2024-05-08erofs: rename utils.c to zutil.cChunhai Guo
Currently, utils.c is only useful if CONFIG_EROFS_FS_ZIP is on. So let's rename it to zutil.c as well as avoid its inclusion if CONFIG_EROFS_FS_ZIP is explicitly disabled. Signed-off-by: Chunhai Guo <guochunhai@vivo.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Link: https://lore.kernel.org/r/20240401135550.2550043-1-guochunhai@vivo.com Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2024-05-08uapi: stddef.h: Provide UAPI macros for __counted_by_{le, be}Erick Archer
This commit can be considered an addition to commit ca7e324e8ad3 ("compiler_types: add Endianness-dependent __counted_by_{le,be}") [1]. In the commit referenced above the __counted_by_{le,be}() attributes were defined based on platform's endianness with the goal to that the structures contain flexible arrays at the end, and the counter for, can be annotated with these attributes. So, this commit only provide UAPI macros for UAPI structs that will gain annotations for __counted_by_{le, be} attributes. And it is the previous step to be able to use these attributes in UAPI. Link: https://lore.kernel.org/r/20240327142241.1745989-2-aleksander.lobakin@intel.com Suggested-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Erick Archer <erick.archer@outlook.com> Link: https://lore.kernel.org/r/AS8PR02MB72372E45071E8821C07236F78BE42@AS8PR02MB7237.eurprd02.prod.outlook.com Fixes: ca7e324e8ad3 ("compiler_types: add Endianness-dependent __counted_by_{le,be}") Signed-off-by: Kees Cook <keescook@chromium.org>
2024-05-08virtiofs: include a newline in sysfs tagBrian Foster
The internal tag string doesn't contain a newline. Append one when emitting the tag via sysfs. [Stefan] Orthogonal to the newline issue, sysfs_emit(buf, "%s", fs->tag) is needed to prevent format string injection. Signed-off-by: Brian Foster <bfoster@redhat.com> Fixes: a8f62f50b4e4 ("virtiofs: export filesystem tags through sysfs") Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2024-05-08x86/pci/ce4100: Remove unused 'struct sim_reg_op'Dr. David Alan Gilbert
'struct sim_reg_op' wasn't ever used since it was introduced 14 years ago via: 91d8037f563e ("ce4100: Add PCI register emulation for CE4100") Remove it. [ mingo: Improved the changelog. ] Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20240507232348.46677-1-linux@treblig.org
2024-05-08Merge tag 'qcom-arm64-for-6.10-2' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into soc/dt A few more Qualcomm Arm64 DeviceTree updates for v6.10 This corrects the obviously broken compatible of the USB VBUS regulator in PM6150. It clears the odd-looking default address on QCS404 EVB, with the expectation that a proper address is provides by other means. The newly added SM8650 GPU node is corrected with a missing memory region. The third DWC3 instance on SC8280XP is added, and enabled on Lenovo Thinkpad X13s to give working fingerprint sensor. * tag 'qcom-arm64-for-6.10-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: arm64: dts: qcom: pm6150: correct USB VBUS regulator compatible arm64: dts: qcom: qcs404: fix bluetooth device address arm64: dts: qcom: sc8280xp-x13s: enable USB MP and fingerprint reader arm64: dts: qcom: sc8280xp: Add USB DWC3 Multiport controller arm64: dts: qcom: sm8650: Fix GPU cx_mem size Link: https://lore.kernel.org/r/20240508021820.206441-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-08Merge tag 'qcom-arm64-defconfig-for-6.10-2' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into soc/defconfig One more Qualcomm Arm64 defconfig update for v6.10 This enables the SM6115 interconnect provider, to make it possible to boot boards on this SoC. * tag 'qcom-arm64-defconfig-for-6.10-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: arm64: defconfig: select INTERCONNECT_QCOM_SM6115 as built-in Link: https://lore.kernel.org/r/20240508021312.206121-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-08Merge tag 'qcom-drivers-for-6.10-2' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into soc/drivers A few more Qualcomm driver updates for v6.10 This fixes a sleep-while-atomic issue in pmic_glink, stemming from the fact that the GLINK callback comes from interrupt context. It fixes the Bluetooth address in the example of qcom,wcnss, and it enables UEFI variables on SC8180X devices (Primus and Flex 5G). * tag 'qcom-drivers-for-6.10-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: firmware: qcom: uefisecapp: Allow on sc8180x Primus and Flex 5G soc: qcom: pmic_glink: Make client-lock non-sleeping dt-bindings: soc: qcom,wcnss: fix bluetooth address example Link: https://lore.kernel.org/r/20240508020900.204413-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-05-08spi: Remove unneded check for orig_nentsAndy Shevchenko
Both dma_unmap_sgtable() and sg_free_table() in spi_unmap_buf_attrs() have checks for orig_nents against 0. No need to duplicate this. All the same applies to other DMA mapping API calls. Also note, there is no other user in the kernel that does this kind of checks. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20240507201028.564630-1-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-05-07Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-05-06 (ice) This series contains updates to ice driver only. Paul adds support for additional E830 devices and adjusts naming for existing E830 devices. Marcin commonizes a couple of TC setup calls to reduce duplicated code. Mateusz adds ice_vsi_cfg_params into ice_vsi to consolidate info. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: refactor struct ice_vsi_cfg_params to be inside of struct ice_vsi ice: Deduplicate tc action setup ice: update E830 device ids and comments ice: add additional E830 device ids ==================== Link: https://lore.kernel.org/r/20240506170827.948682-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-07net: usb: sr9700: stop lying about skb->truesizeEric Dumazet
Some usb drivers set small skb->truesize and break core networking stacks. In this patch, I removed one of the skb->truesize override. I also replaced one skb_clone() by an allocation of a fresh and small skb, to get minimally sized skbs, like we did in commit 1e2c61172342 ("net: cdc_ncm: reduce skb truesize in rx path") and 4ce62d5b2f7a ("net: usb: ax88179_178a: stop lying about skb->truesize") Fixes: c9b37458e956 ("USB2NET : SR9700 : One chip USB 1.1 USB2NET SR9700Device Driver Support") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240506143939.3673865-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-07net: usb: smsc75xx: stop lying about skb->truesizeEric Dumazet
Some usb drivers try to set small skb->truesize and break core networking stacks. In this patch, I removed one of the skb->truesize override. I also replaced one skb_clone() by an allocation of a fresh and small skb, to get minimally sized skbs, like we did in commit 1e2c61172342 ("net: cdc_ncm: reduce skb truesize in rx path") and 4ce62d5b2f7a ("net: usb: ax88179_178a: stop lying about skb->truesize") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Steve Glendinning <steve.glendinning@shawell.net> Link: https://lore.kernel.org/r/20240506142358.3657918-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-07usb: aqc111: stop lying about skb->truesizeEric Dumazet
Some usb drivers try to set small skb->truesize and break core networking stacks. I replace one skb_clone() by an allocation of a fresh and small skb, to get minimally sized skbs, like we did in commit 1e2c61172342 ("net: cdc_ncm: reduce skb truesize in rx path") and 4ce62d5b2f7a ("net: usb: ax88179_178a: stop lying about skb->truesize") Fixes: 361459cd9642 ("net: usb: aqc111: Implement RX data path") Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240506135546.3641185-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-07mptcp: only allow set existing scheduler for net.mptcp.schedulerGregory Detal
The current behavior is to accept any strings as inputs, this results in an inconsistent result where an unexisting scheduler can be set: # sysctl -w net.mptcp.scheduler=notdefault net.mptcp.scheduler = notdefault This patch changes this behavior by checking for existing scheduler before accepting the input. Fixes: e3b2870b6d22 ("mptcp: add a new sysctl scheduler") Cc: stable@vger.kernel.org Signed-off-by: Gregory Detal <gregory.detal@gmail.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Tested-by: Geliang Tang <geliang@kernel.org> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240506-upstream-net-20240506-mptcp-sched-exist-v1-1-2ed1529e521e@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-07selftests/net: fix uninitialized variablesJohn Hubbard
When building with clang, via: make LLVM=1 -C tools/testing/selftest ...clang warns about three variables that are not initialized in all cases: 1) The opt_ipproto_off variable is used uninitialized if "testname" is not "ip". Willem de Bruijn pointed out that this is an actual bug, and suggested the fix that I'm using here (thanks!). 2) The addr_len is used uninitialized, but only in the assert case, which bails out, so this is harmless. 3) The family variable in add_listener() is only used uninitialized in the error case (neither IPv4 nor IPv6 is specified), so it's also harmless. Fix by initializing each variable. Signed-off-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20240506190204.28497-1-jhubbard@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-07lib: Allow for the DIM library to be modularFlorian Fainelli
Allow the Dynamic Interrupt Moderation (DIM) library to be built as a module. This is particularly useful in an Android GKI (Google Kernel Image) configuration where everything is built as a module, including Ethernet controller drivers. Having to build DIMLIB into the kernel image with potentially no user is wasteful. Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/20240506175040.410446-1-florian.fainelli@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-07nfc: nci: Fix kcov check in nci_rx_work()Tetsuo Handa
Commit 7e8cdc97148c ("nfc: Add KCOV annotations") added kcov_remote_start_common()/kcov_remote_stop() pair into nci_rx_work(), with an assumption that kcov_remote_stop() is called upon continue of the for loop. But commit d24b03535e5e ("nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet") forgot to call kcov_remote_stop() before break of the for loop. Reported-by: syzbot <syzbot+0438378d6f157baae1a2@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=0438378d6f157baae1a2 Fixes: d24b03535e5e ("nfc: nci: Fix uninit-value in nci_dev_up and nci_ntf_packet") Suggested-by: Andrey Konovalov <andreyknvl@gmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://lore.kernel.org/r/6d10f829-5a0c-405a-b39a-d7266f3a1a0b@I-love.SAKURA.ne.jp Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-07mptcp: fix possible NULL dereferencesEric Dumazet
subflow_add_reset_reason(skb, ...) can fail. We can not assume mptcp_get_ext(skb) always return a non NULL pointer. syzbot reported: general protection fault, probably for non-canonical address 0xdffffc0000000003: 0000 [#1] PREEMPT SMP KASAN PTI KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f] CPU: 0 PID: 5098 Comm: syz-executor132 Not tainted 6.9.0-rc6-syzkaller-01478-gcdc74c9d06e7 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 RIP: 0010:subflow_v6_route_req+0x2c7/0x490 net/mptcp/subflow.c:388 Code: 8d 7b 07 48 89 f8 48 c1 e8 03 42 0f b6 04 20 84 c0 0f 85 c0 01 00 00 0f b6 43 07 48 8d 1c c3 48 83 c3 18 48 89 d8 48 c1 e8 03 <42> 0f b6 04 20 84 c0 0f 85 84 01 00 00 0f b6 5b 01 83 e3 0f 48 89 RSP: 0018:ffffc9000362eb68 EFLAGS: 00010206 RAX: 0000000000000003 RBX: 0000000000000018 RCX: ffff888022039e00 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffff88807d961140 R08: ffffffff8b6cb76b R09: 1ffff1100fb2c230 R10: dffffc0000000000 R11: ffffed100fb2c231 R12: dffffc0000000000 R13: ffff888022bfe273 R14: ffff88802cf9cc80 R15: ffff88802ad5a700 FS: 0000555587ad2380(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f420c3f9720 CR3: 0000000022bfc000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> tcp_conn_request+0xf07/0x32c0 net/ipv4/tcp_input.c:7180 tcp_rcv_state_process+0x183c/0x4500 net/ipv4/tcp_input.c:6663 tcp_v6_do_rcv+0x8b2/0x1310 net/ipv6/tcp_ipv6.c:1673 tcp_v6_rcv+0x22b4/0x30b0 net/ipv6/tcp_ipv6.c:1910 ip6_protocol_deliver_rcu+0xc76/0x1570 net/ipv6/ip6_input.c:438 ip6_input_finish+0x186/0x2d0 net/ipv6/ip6_input.c:483 NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314 NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314 __netif_receive_skb_one_core net/core/dev.c:5625 [inline] __netif_receive_skb+0x1ea/0x650 net/core/dev.c:5739 netif_receive_skb_internal net/core/dev.c:5825 [inline] netif_receive_skb+0x1e8/0x890 net/core/dev.c:5885 tun_rx_batched+0x1b7/0x8f0 drivers/net/tun.c:1549 tun_get_user+0x2f35/0x4560 drivers/net/tun.c:2002 tun_chr_write_iter+0x113/0x1f0 drivers/net/tun.c:2048 call_write_iter include/linux/fs.h:2110 [inline] new_sync_write fs/read_write.c:497 [inline] vfs_write+0xa84/0xcb0 fs/read_write.c:590 ksys_write+0x1a0/0x2c0 fs/read_write.c:643 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: 3e140491dd80 ("mptcp: support rstreason for passive reset") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://lore.kernel.org/r/20240506123032.3351895-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-07selftests: netfilter: conntrack_tcp_unreplied.sh: wait for initial ↵Florian Westphal
connection attempt Netdev CI reports occasional failures with this test ("ERROR: ns2-dX6bUE did not pick up tcp connection from peer"). Add explicit busywait call until the initial connection attempt shows up in conntrack rather than a one-shot 'must exist' check. Signed-off-by: Florian Westphal <fw@strlen.de> Link: https://lore.kernel.org/r/20240506114320.12178-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-07Merge branch 'libbpf: further struct_ops fixes and improvements'Martin KaFai Lau
Andrii Nakryiko says: ==================== Fix yet another case of mishandling SEC("struct_ops") programs that were nulled out programmatically through BPF skeleton by the user. While at it, add some improvements around detecting and reporting errors, specifically a common case of declaring SEC("struct_ops") program, but forgetting to actually make use of it by setting it as a callback implementation in SEC(".struct_ops") variable (i.e., map) declaration. A bunch of new selftests are added as well. ==================== Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-07selftests/bpf: shorten subtest names for struct_ops_module testAndrii Nakryiko
Drive-by clean up, we shouldn't use meaningless "test_" prefix for subtest names. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240507001335.1445325-8-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-07selftests/bpf: validate struct_ops early failure detection logicAndrii Nakryiko
Add a simple test that validates that libbpf will reject isolated struct_ops program early with helpful warning message. Also validate that explicit use of such BPF program through BPF skeleton after BPF object is open won't trigger any warnings. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240507001335.1445325-7-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-07libbpf: improve early detection of doomed-to-fail BPF program loadingAndrii Nakryiko
Extend libbpf's pre-load checks for BPF programs, detecting more typical conditions that are destinated to cause BPF program failure. This is an opportunity to provide more helpful and actionable error message to users, instead of potentially very confusing BPF verifier log and/or error. In this case, we detect struct_ops BPF program that was not referenced anywhere, but still attempted to be loaded (according to libbpf logic). Suggest that the program might need to be used in some struct_ops variable. User will get a message of the following kind: libbpf: prog 'test_1_forgotten': SEC("struct_ops") program isn't referenced anywhere, did you forget to use it? Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240507001335.1445325-6-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-07libbpf: fix libbpf_strerror_r() handling unknown errorsAndrii Nakryiko
strerror_r(), used from libbpf-specific libbpf_strerror_r() wrapper is documented to return error in two different ways, depending on glibc version. Take that into account when handling strerror_r()'s own errors, which happens when we pass some non-standard (internal) kernel error to it. Before this patch we'd have "ERROR: strerror_r(524)=22", which is quite confusing. Now for the same situation we'll see a bit less visually scary "unknown error (-524)". At least we won't confuse user with irrelevant EINVAL (22). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240507001335.1445325-5-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-07selftests/bpf: add another struct_ops callback use case testAndrii Nakryiko
Add a test which tests the case that was just fixed. Kernel has full type information about callback, but user explicitly nulls out the reference to declaratively set BPF program reference. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240507001335.1445325-4-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-07libbpf: handle yet another corner case of nulling out struct_ops programAndrii Nakryiko
There is yet another corner case where user can set STRUCT_OPS program reference in STRUCT_OPS map to NULL, but libbpf will fail to disable autoload for such BPF program. This time it's the case of "new" kernel which has type information about callback field, but user explicitly nulled-out program reference from user-space after opening BPF object. Fix, hopefully, the last remaining unhandled case. Fixes: 0737df6de946 ("libbpf: better fix for handling nulled-out struct_ops program") Fixes: f973fccd43d3 ("libbpf: handle nulled-out program in struct_ops correctly") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240507001335.1445325-3-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2024-05-07libbpf: remove unnecessary struct_ops prog validity checkAndrii Nakryiko
libbpf ensures that BPF program references set in map->st_ops->progs[i] during open phase are always valid STRUCT_OPS programs. This is done in bpf_object__collect_st_ops_relos(). So there is no need to double-check that in bpf_map__init_kern_struct_ops(). Simplify the code by removing unnecessary check. Also, we avoid using local prog variable to keep code similar to the upcoming fix, which adds similar logic in another part of bpf_map__init_kern_struct_ops(). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240507001335.1445325-2-andrii@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>