Age Commit message Author
2024-03-11 selftests: forwarding: Add a test for NH group stats Petr Machata
Add to lib.sh support for fetching NH stats, and a new library, router_mpath_nh_lib.sh, with the common code for testing NH stats. Use the latter from router_mpath_nh.sh and router_mpath_nh_res.sh. The test works by sending traffic through a NH group, and checking that the reported values correspond to what the link that ultimately receives the traffic reports having seen. Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/2a424c54062a5f1efd13b9ec5b2b0e29c6af2574.1709901020.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-11 mlxsw: spectrum_router: Share nexthop counters in resilient groups Petr Machata
For resilient groups, we can reuse the same counter for all the buckets that share the same nexthop. Keep a reference count per counter, and keep all these counters in a per-next hop group xarray, which serves as a NHID->counter cache. If a counter is already present for a given NHID, just take a reference and use the same counter. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/cdd00084533fc83ac5917562f54642f008205bf3.1709901020.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
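The sharing scheme described above can be pictured with a small Python model. This is only a sketch of the idea, not the mlxsw code: the class name `CounterCache`, its methods, and the integer counter indices are all hypothetical, and a plain dict stands in for the per-group xarray.

```python
class CounterCache:
    """Hypothetical per-group cache: nexthop ID -> shared, refcounted counter."""

    def __init__(self):
        self._entries = {}    # nhid -> [counter_index, refcount]
        self._next_index = 0  # stand-in for a hardware counter allocator

    def get(self, nhid):
        """Return the counter for nhid, allocating it on first use."""
        entry = self._entries.get(nhid)
        if entry is None:
            entry = [self._next_index, 0]
            self._next_index += 1
            self._entries[nhid] = entry
        entry[1] += 1  # one more bucket references this counter
        return entry[0]

    def put(self, nhid):
        """Drop one reference; release the counter when none remain."""
        entry = self._entries[nhid]
        entry[1] -= 1
        if entry[1] == 0:
            del self._entries[nhid]

    def refcount(self, nhid):
        entry = self._entries.get(nhid)
        return entry[1] if entry else 0


cache = CounterCache()
# Four buckets, three of them backed by nexthop ID 10.
bucket_counters = [cache.get(nhid) for nhid in (10, 20, 10, 10)]
print(bucket_counters, cache.refcount(10))
```

In this model the three buckets backed by nexthop 10 end up on the same counter, and that counter is released only when the last of them drops its reference.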
2024-03-11 mlxsw: spectrum_router: Support nexthop group hardware statistics Petr Machata
When hw_stats is set on a group, install nexthop counters on members of a group. Counter allocation request is moved from nexthop object initialization to the update code. The previous placement made sense: when the counters are enabled by dpipe, the counters are installed to all existing nexthops and all nexthops created from then on get them. For the finer-grained nexthop group statistics, this is unsuitable. The existing placement was kept for the IPv4 and IPv6 nexthops. Resilient group replacement emits a pre_replace notification, and then any bucket_replace notifications if there were any replacements at all. If the group is balanced and the nexthop composition of the replaced group didn't change, there will be no such notifiers. Therefore hook to the pre_replace notifier and mark all buckets for update, to un/install the counters. When reporting deltas for resilient groups, use the nexthop ID that we stored in a previous patch to look up to which nexthop a bucket contributes. Co-developed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/87495a72f187df2e5d491d02729c550d235fcc85.1709901020.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-11 mlxsw: spectrum_router: Track NH ID's of group members Petr Machata
The core interfaces for collecting per-NH statistics are built around nexthops even for resilient groups. Because mlxsw models each bucket as a nexthop, the core next hop that a given bucket contributes to needs to be looked up. In order to be able to match the two up, we need to track nexthop ID for members of group nexthop objects. For simplicity, do it for all nexthop objects, not just group members. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/184ceb6b154e08f5bcf116a705b0fcb01c31895c.1709901020.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-11 mlxsw: spectrum_router: Add helpers for nexthop counters Petr Machata
The next patch will add the ability to share nexthop counters among mlxsw nexthops backed by the same core nexthop. To have a place to store reference count, the counter should be kept in a dedicated structure. In this patch, introduce the structure together with the related helpers, sans the refcount, which comes in the next patch. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/61f23fa4f8c5d7879f68dacd793d8ab7425f33c0.1709901020.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-11 mlxsw: spectrum_router: Avoid allocating NH counters twice Petr Machata
mlxsw_sp_nexthop_counter_disable() decays to a nop when called on a disabled counter, but mlxsw_sp_nexthop_counter_enable() can't similarly be called on an enabled counter. This would be useful in the following patches. Add the missing condition. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/0cc9050e196366c1387ab5ee47f1cee8ecde9c86.1709901020.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-11 mlxsw: spectrum: Allow fetch-and-clear of flow counters Petr Machata
For a report_delta-like interface, such as the one a previous patch added for collecting NH group statistics, it's easiest to read the counter and have the HW clear it right away. Thus, change mlxsw_sp_flow_counter_get() to take a bool indicating whether this should be done. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/6a096ede8ee92d5041e3832242c3bbc137198aba.1709901020.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
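A toy model may help show why fetch-and-clear suits a delta-style interface. The sketch below is an illustration only: `FlowCounter` and its methods are hypothetical names, and a Python integer stands in for the hardware counter.

```python
class FlowCounter:
    """Hypothetical stand-in for a hardware flow counter."""

    def __init__(self):
        self.packets = 0

    def count(self, n=1):
        """Simulate the hardware counting n packets."""
        self.packets += n

    def get(self, clear=False):
        """Read the counter; optionally reset it as part of the same read."""
        value = self.packets
        if clear:
            self.packets = 0
        return value


counter = FlowCounter()
counter.count(5)
first_delta = counter.get(clear=True)   # everything seen so far
counter.count(3)
second_delta = counter.get(clear=True)  # only what arrived since the last read
print(first_delta, second_delta)
```

Because each read with `clear=True` resets the counter, every value returned is already a delta, so the caller does not need to remember the previous reading.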
2024-03-11 mlxsw: spectrum_router: Have mlxsw_sp_nexthop_counter_enable() return int Petr Machata
In order to be able to diagnose failures in counter allocation, have the function mlxsw_sp_nexthop_counter_enable() return an error code. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/e0bb5c0cc6234ade2ade1e92abac991359c3f446.1709901020.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-11 mlxsw: spectrum_router: Rename two functions Petr Machata
The function mlxsw_sp_nexthop_counter_alloc() doesn't directly allocate anything, and mlxsw_sp_nexthop_counter_free() doesn't directly free. For the following patches, we will need names for functions that actually do those things. Therefore rename to mlxsw_sp_nexthop_counter_enable() and mlxsw_sp_nexthop_counter_disable() to free up the namespace. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/f59272958697a718f090f59f892d32beabcd8972.1709901020.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-11 net: nexthop: Have all NH notifiers carry NH ID Petr Machata
When sending the notifications to collect NH statistics for resilient groups, the driver will need to know the nexthop IDs in individual buckets to look up the right counter. To that end, move the nexthop ID from struct nh_notifier_grp_entry_info to nh_notifier_single_info. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/8f964cd50b1a56d3606ce7ab4c50354ae019c43b.1709901020.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-11 net: nexthop: Initialize NH group ID in resilient NH group notifiers Petr Machata
The NEXTHOP_EVENT_RES_TABLE_PRE_REPLACE notifier currently keeps the group ID unset. That makes it impossible to look up the group for which the notifier is intended. This is not an issue at the moment, because the only client is netdevsim, and that only so that it can veto replacements, which is a static property not tied to a particular group. But for any practical use, the ID is necessary. Set it. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://lore.kernel.org/r/025fef095dcfb408042568bb5439da014d47239e.1709901020.git.petrm@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-11 net: gro: move two declarations to include/net/gro.h Eric Dumazet
Move gro_find_receive_by_type() and gro_find_complete_by_type() to include/net/gro.h where they belong. Also use _NET_GRO_H instead of _NET_IPV6_GRO_H to protect include/net/gro.h from multiple inclusions. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240308102230.296224-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-11 net: netconsole: Add continuation line prefix to userdata messages Matthew Wood
Add a space (' ') prefix to every userdata line to match docs for dev-kmsg. To account for this extra character in each userdata entry, reduce userdata entry names (directory name) from 54 characters to 53.

According to the dev-kmsg docs, a space is used for subsequent lines to mark them as continuation lines.

> A line starting with ' ', is a continuation line, adding
> key/value pairs to the log message, which provide the machine
> readable context of the message, for reliable processing in
> userspace.

Testing for this patch::

  cd /sys/kernel/config/netconsole && mkdir cmdline0
  cd cmdline0
  mkdir userdata/test && echo "hello" > userdata/test/value
  mkdir userdata/test2 && echo "hello2" > userdata/test2/value
  echo "message" > /dev/kmsg

Outputs::

  6.8.0-rc5-virtme,12,493,231373579,-;message
   test=hello
   test2=hello2

And I confirmed all testing works as expected from the original patchset.

Fixes: df03f830d099 ("net: netconsole: cache userdata formatted string in netconsole_target")
Signed-off-by: Matthew Wood <thepacketgeek@gmail.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240308002525.248672-1-thepacketgeek@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
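The continuation-line convention can be sketched in a few lines of Python. This is an illustration, not the netconsole implementation: the function name is hypothetical, and the one-entry-per-line layout is assumed from the dev-kmsg description quoted in the commit message.

```python
def format_with_userdata(header, message, userdata):
    """Build a message followed by space-prefixed key=value continuation lines.

    header   -- the comma-separated prefix before ';' (taken as given)
    userdata -- mapping of entry name to value
    """
    lines = [f"{header};{message}"]
    for key, value in userdata.items():
        # The leading space marks the line as a continuation line.
        lines.append(f" {key}={value}")
    return "\n".join(lines) + "\n"


text = format_with_userdata(
    "6.8.0-rc5-virtme,12,493,231373579,-",
    "message",
    {"test": "hello", "test2": "hello2"},
)
print(text, end="")
```

A consumer can then tell the message line from its key/value context by checking whether a line starts with a space.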
2024-03-11 r8169: switch to new function phy_support_eee Heiner Kallweit
Switch to the new function phy_support_eee(). This simplifies the code because data->tx_lpi_enabled is now populated by phy_ethtool_get_eee(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/92462328-5c9b-4d82-9ce4-ea974cda4900@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-11 net: phy: simplify a check in phy_check_link_status Heiner Kallweit
Handling the case err == 0 in the other branch simplifies the code. In addition, I assume that "err & phydev->eee_cfg.tx_lpi_enabled" should have used a logical AND operator. It also works as expected with the bitwise AND, but using a bitwise AND with a bool value looks ugly to me. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/de37bf30-61dd-49f9-b645-2d8ea11ddb5d@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-11 net: phy: marvell-88x2222: Remove unused of_gpio.h Andy Shevchenko
of_gpio.h is deprecated and subject to removal. The driver doesn't use it, so simply remove the unused header. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20240307122346.3677534-1-andriy.shevchenko@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-11 net: dsa: mt7530: disable LEDs before reset Justin Swartz
Disable LEDs just before resetting the MT7530 to avoid situations where the ESW_P4_LED_0 and ESW_P3_LED_0 pin states may cause an unintended external crystal frequency to be selected.

The HT_XTAL_FSEL (External Crystal Frequency Selection) field of HWTRAP (the Hardware Trap register) stores a 2-bit value that represents the state of the ESW_P4_LED_0 and ESW_P3_LED_0 pins (seemingly) sampled just after the MT7530 has been reset, as:

  ESW_P4_LED_0    ESW_P3_LED_0    Frequency
  -----------------------------------------
  0               1               20MHz
  1               0               40MHz
  1               1               25MHz

The value of HT_XTAL_FSEL is bootstrapped by pulling ESW_P4_LED_0 and ESW_P3_LED_0 up or down accordingly, but:

  if a 40MHz crystal has been selected and the ESW_P3_LED_0 pin is high during reset, or

  a 20MHz crystal has been selected and the ESW_P4_LED_0 pin is high during reset,

then the value of HT_XTAL_FSEL will indicate that a 25MHz crystal is present.

By default, the state of the LED pins is PHY controlled to reflect the link state.

To illustrate, if a board has:

  5 ports with active low LED control, and
  HT_XTAL_FSEL bootstrapped for 40MHz.

When the MT7530 is powered up without any external connection, only the LED associated with Port 3 is illuminated as ESW_P3_LED_0 is low.

In this state, directly after mt7530_setup()'s reset is performed, the HWTRAP register (0x7800) reflects the intended HT_XTAL_FSEL (HWTRAP bits 10:9) of 40MHz:

  mt7530-mdio mdio-bus:1f: mt7530_read: 00007800 == 00007dcf

  >>> bin(0x7dcf >> 9 & 0b11)
  '0b10'

But if a cable is connected to Port 3 and the link is active before mt7530_setup()'s reset takes place, then HT_XTAL_FSEL seems to be set for 25MHz:

  mt7530-mdio mdio-bus:1f: mt7530_read: 00007800 == 00007fcf

  >>> bin(0x7fcf >> 9 & 0b11)
  '0b11'

Once HT_XTAL_FSEL reflects 25MHz, none of the ports are functional until the MT7621 (or MT7530 itself) is reset.

By disabling the LED pins just before reset, the chance of an unintended HT_XTAL_FSEL value is reduced.

Signed-off-by: Justin Swartz <justin.swartz@risingedge.co.za>
Link: https://lore.kernel.org/r/20240305043952.21590-1-justin.swartz@risingedge.co.za
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
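The two register dumps in this commit message can be decoded with a short helper. This is a sketch built only from the table and the "bits 10:9" note in the message: the function and table names are hypothetical, and it assumes ESW_P4_LED_0 is the high bit of the 2-bit field, which is what the 40MHz example implies.

```python
# HT_XTAL_FSEL value -> crystal frequency in MHz, per the table in the commit
# message (assumed bit order: ESW_P4_LED_0 is bit 10, ESW_P3_LED_0 is bit 9).
XTAL_FREQ_MHZ = {0b01: 20, 0b10: 40, 0b11: 25}


def ht_xtal_fsel(hwtrap):
    """Extract HT_XTAL_FSEL (HWTRAP bits 10:9)."""
    return (hwtrap >> 9) & 0b11


def crystal_mhz(hwtrap):
    """Return the crystal frequency the field indicates, or None if unlisted."""
    return XTAL_FREQ_MHZ.get(ht_xtal_fsel(hwtrap))


for value in (0x7DCF, 0x7FCF):
    print(f"{value:#06x}: fsel={ht_xtal_fsel(value):#04b} -> {crystal_mhz(value)}MHz")
```

The first value decodes to the intended 40MHz setting and the second to 25MHz, matching the failure described when the Port 3 link is up before the reset.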
2024-03-11 net: mdio_bus: Remove unused of_gpio.h Andy Shevchenko
of_gpio.h is deprecated and subject to removal. The driver doesn't use it, so simply remove the unused header. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20240307122231.3677241-1-andriy.shevchenko@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-11 ptp: make ptp_class constant Ricardo B. Marliere
Since commit 43a7206b0963 ("driver core: class: make class_register() take a const *"), the driver core allows for struct class to be in read-only memory, so move the ptp_class structure to be declared at build time placing it into read-only memory, instead of having to be dynamically allocated at boot time. Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240305-ptp-v1-1-ed253eb33c20@marliere.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-11 netlink: specs: support unterminated-ok Hangbin Liu
ynl-gen-c.py supports the unterminated-ok check, but the yaml schemas don't have this key. Add it to the yaml files. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://lore.kernel.org/r/20240308081239.3281710-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-11 tools: ynl-gen: support using pre-defined values in attr checks Hangbin Liu
Support using pre-defined values in checks so we don't need to hard-code numbers for the string and binary lengths. E.g. we have a definition like

  #define TEAM_STRING_MAX_LEN 32

which is defined in yaml like:

  definitions:
    - name: string-max-len
      type: const
      value: 32

It can be used in the attribute-sets like

  attribute-sets:
    - name: attr-option
      name-prefix: team-attr-option-
      attributes:
        - name: name
          type: string
          checks:
            len: string-max-len

With this patch it will be converted to

  [TEAM_ATTR_OPTION_NAME] = { .type = NLA_STRING, .len = TEAM_STRING_MAX_LEN, }

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240311140727.109562-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
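The lookup this commit describes can be sketched as follows. It is a simplified illustration and not the code in ynl-gen-c.py: the helper names, the `family` argument, and the dict-based spec are assumptions made for the example.

```python
def c_name(name):
    """Turn a dashed spec name into an upper-case C identifier."""
    return name.upper().replace("-", "_")


def resolve_check(value, definitions, family):
    """Return the C expression to emit for a check value.

    Integers are emitted as-is; a string is looked up among the spec's
    'const' definitions and emitted as the matching #define name.
    """
    if isinstance(value, int):
        return str(value)
    for definition in definitions:
        if definition["name"] == value and definition["type"] == "const":
            return c_name(f"{family}-{value}")
    raise KeyError(f"no const definition named {value!r}")


definitions = [{"name": "string-max-len", "type": "const", "value": 32}]
print(resolve_check("string-max-len", definitions, "team"))
print(resolve_check(16, definitions, "team"))
```

With the spec fragment from the commit message, the string check resolves to the define name rather than to the literal 32, which is the effect the patch describes.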
2024-03-11 net: page_pool: factor out page_pool recycle check Mina Almasry
The check is duplicated in 2 places, factor it out into a common helper. Signed-off-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Link: https://lore.kernel.org/r/20240308204500.1112858-1-almasrymina@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-11 selftests/bpf: Add fexit and kretprobe triggering benchmarks Andrii Nakryiko
We already have kprobe and fentry benchmarks. Let's add kretprobe and fexit ones for completeness. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240309005124.3004446-1-andrii@kernel.org
2024-03-11 mm: Introduce vmap_page_range() to map pages in PCI address space Alexei Starovoitov
ioremap_page_range() should be used for ranges within vmalloc range only. The vmalloc ranges are allocated by get_vm_area(). PCI has "resource" allocator that manages PCI_IOBASE, IO_SPACE_LIMIT address range, hence introduce vmap_page_range() to be used exclusively to map pages in PCI address space. Fixes: 3e49a866c9dc ("mm: Enforce VM_IOREMAP flag and range in ioremap_page_range.") Reported-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Christoph Hellwig <hch@lst.de> Tested-by: Miguel Ojeda <ojeda@kernel.org> Link: https://lore.kernel.org/bpf/CANiq72ka4rir+RTN2FQoT=Vvprp_Ao-CvoYEkSNqtSY+RZj+AA@mail.gmail.com
2024-03-11 Merge branch 'tcp-wmem-data-races' David S. Miller
Jason Xing says: ==================== annotate data-races around sysctl_tcp_wmem[0] Adding a simple READ_ONCE() avoids racing with someone who is changing the sysctl knob while it is being read. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-11 tcp: annotate a data-race around sysctl_tcp_wmem[0] Jason Xing
When reading wmem[0], it could be changed concurrently without READ_ONCE() protection. So add one annotation here. Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-11 mptcp: annotate a data-race around sysctl_tcp_wmem[0] Jason Xing
It's possible that a writer and a reader access the same sysctl knob concurrently. Use READ_ONCE() to prevent reading an old value. Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-11 ynl: samples: fix recycling rate calculation Jakub Kicinski
Running the page-pool sample on production machines under moderate networking load shows recycling rate higher than 100%:

  $ page-pool
    eth0[2] page pools: 14 (zombies: 0)
            refs: 89088 bytes: 364904448 (refs: 0 bytes: 0)
            recycling: 100.3% (alloc: 1392:2290247724 recycle: 469289484:1828235386)

Note that outstanding refs (89088) == slow alloc * cache size (1392 * 64) which means this machine is recycling page pool pages perfectly, not a single page has been released.

The extra 0.3% is because sample ignores allocations from the ptr_ring. Treat those the same as alloc_fast, the ring vs cache alloc is already captured accurately enough by recycling stats.

With the fix:

  $ page-pool
    eth0[2] page pools: 14 (zombies: 0)
            refs: 89088 bytes: 364904448 (refs: 0 bytes: 0)
            recycling: 100.0% (alloc: 1392:2331141604 recycle: 473625579:1857460661)

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
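The percentages in the two outputs can be reproduced with a short calculation. This is a sketch under assumptions: it treats the `alloc: a:b` pair as slow:fast allocations and the `recycle: c:d` pair as the two recycling paths, and the function name is hypothetical rather than taken from the sample.

```python
def recycling_pct(alloc_slow, alloc_fast, recycle_a, recycle_b):
    """Recycled pages as a percentage of all allocations."""
    allocs = alloc_slow + alloc_fast
    if allocs == 0:
        return 0.0
    return 100.0 * (recycle_a + recycle_b) / allocs


# Numbers copied from the outputs in the commit message.
before = recycling_pct(1392, 2290247724, 469289484, 1828235386)
after = recycling_pct(1392, 2331141604, 473625579, 1857460661)
print(f"before: {before:.1f}%  after: {after:.1f}%")

# The consistency check quoted in the message: refs == slow allocs * cache size.
print(1392 * 64)
```

When allocations served from the ring are left out of the denominator, recycled pages can outnumber counted allocations, which is how a rate above 100% appears.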
2024-03-11 udp: no longer touch sk->sk_refcnt in early demux Eric Dumazet
After commits ca065d0cf80f ("udp: no longer use SLAB_DESTROY_BY_RCU") and 7ae215d23c12 ("bpf: Don't refcount LISTEN sockets in sk_assign()") UDP early demux no longer needs to grab a refcount on the UDP socket. This saves two atomic operations per incoming packet for connected sockets. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Joe Stringer <joe@wand.net.nz> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-11 Merge branch 'getsockopt-parameter-validation' David S. Miller
Gavrilov Ilia says: ==================== fix incorrect parameter validation in the *_get_sockopt() functions This v2 series fixes incorrect parameter validation in *_get_sockopt() functions in several places. version 2 changes: - reword the patch description - add two patches for net/kcm and net/x25 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-11 net/x25: fix incorrect parameter validation in the x25_getsockopt() function Gavrilov Ilia
The 'len' variable can't be negative when assigned the result of 'min_t' because all 'min_t' parameters are cast to unsigned int, and then the minimum one is chosen. To fix the logic, check 'len' as read from 'optlen', where the types of relevant variables are (signed) int. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-11 net: kcm: fix incorrect parameter validation in the kcm_getsockopt() function Gavrilov Ilia
The 'len' variable can't be negative when assigned the result of 'min_t' because all 'min_t' parameters are cast to unsigned int, and then the minimum one is chosen. To fix the logic, check 'len' as read from 'optlen', where the types of relevant variables are (signed) int. Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-11 udp: fix incorrect parameter validation in the udp_lib_getsockopt() function Gavrilov Ilia
The 'len' variable can't be negative when assigned the result of 'min_t' because all 'min_t' parameters are cast to unsigned int, and then the minimum one is chosen. To fix the logic, check 'len' as read from 'optlen', where the types of relevant variables are (signed) int. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-11 l2tp: fix incorrect parameter validation in the pppol2tp_getsockopt() function Gavrilov Ilia
The 'len' variable can't be negative when assigned the result of 'min_t' because all 'min_t' parameters are cast to unsigned int, and then the minimum one is chosen. To fix the logic, check 'len' as read from 'optlen', where the types of relevant variables are (signed) int. Fixes: 3557baabf280 ("[L2TP]: PPP over L2TP driver core") Reviewed-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-11 ipmr: fix incorrect parameter validation in the ip_mroute_getsockopt() function Gavrilov Ilia
The 'olr' variable can't be negative when assigned the result of 'min_t' because all 'min_t' parameters are cast to unsigned int, and then the minimum one is chosen. To fix the logic, check 'olr' as read from 'optlen', where the types of relevant variables are (signed) int. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-11 tcp: fix incorrect parameter validation in the do_tcp_getsockopt() function Gavrilov Ilia
The 'len' variable can't be negative when assigned the result of 'min_t' because all 'min_t' parameters are cast to unsigned int, and then the minimum one is chosen. To fix the logic, check 'len' as read from 'optlen', where the types of relevant variables are (signed) int. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
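The pitfall shared by the getsockopt fixes in this series can be emulated outside the kernel. The sketch below is a Python model, not kernel code: it imitates 32-bit casts with masks, and the function names and the string "EINVAL" are stand-ins chosen for illustration.

```python
def to_u32(value):
    """Emulate a cast to a 32-bit unsigned int."""
    return value & 0xFFFFFFFF


def to_s32(value):
    """Emulate a cast to a 32-bit signed int."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def min_t_uint(a, b):
    """Emulate min_t(unsigned int, a, b): cast both operands, then compare."""
    return min(to_u32(a), to_u32(b))


def validate_buggy(optlen, size=4):
    """Check for a negative length only after clamping: the check never fires."""
    length = to_s32(min_t_uint(optlen, size))
    if length < 0:
        return "EINVAL"
    return length


def validate_fixed(optlen, size=4):
    """Check the signed value as read from the caller, then clamp."""
    if optlen < 0:
        return "EINVAL"
    return to_s32(min_t_uint(optlen, size))


print(validate_buggy(-1), validate_fixed(-1), validate_fixed(2))
```

A length of -1 becomes a very large unsigned value, so the clamp picks the other operand and the later negative check is dead code; testing the value before the clamp restores the intended rejection.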
2024-03-11 Merge branch 'qmc-hdlc' David S. Miller
Herve Codina says:

====================
Add support for QMC HDLC

This series introduces the QMC HDLC support.

Patches were previously sent as part of a full feature series and were previously reviewed in that context: "Add support for QMC HDLC, framer infrastructure and PEF2256 framer" [1]

In order to ease the merge, the full feature series has been split and needed parts were merged in v6.8-rc1:
 - "Prepare the PowerQUICC QMC and TSA for the HDLC QMC driver" [2]
 - "Add support for framer infrastructure and PEF2256 framer" [3]

This series contains patches related to the QMC HDLC part (QMC HDLC driver):
 - Introduce the QMC HDLC driver (patches 1 and 2)
 - Add timeslots change support in QMC HDLC (patch 3)
 - Add framer support as a framer consumer in QMC HDLC (patch 4)

Compared to the original full feature series, a modification was done on patch 3 in order to use a coherent prefix in the commit title.

I kept the patches unsquashed as they were previously sent and reviewed. Of course, I can squash them if needed.

Compared to the previous iteration: https://lore.kernel.org/linux-kernel/20240306080726.167338-1-herve.codina@bootlin.com/ this v7 series mainly:
 - Rename a variable.
 - Fix reverse xmas tree declarations.
 - Add 'Acked-by' tag.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-11 net: wan: fsl_qmc_hdlc: Add framer support Herve Codina
Add framer support in the fsl_qmc_hdlc driver in order to be able to signal carrier changes to the network stack based on the framer status. Also use this framer to provide information related to the E1/T1 line interface on IF_GET_IFACE and configure the line interface according to IF_IFACE_{E1,T1} information. Signed-off-by: Herve Codina <herve.codina@bootlin.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-11 net: wan: fsl_qmc_hdlc: Add runtime timeslots changes support Herve Codina
QMC channels support runtime timeslots changes but nothing is done at the QMC HDLC driver to handle these changes. Use existing IFACE ioctl in order to configure the timeslots to use. Signed-off-by: Herve Codina <herve.codina@bootlin.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-11 lib/bitmap: Introduce bitmap_scatter() and bitmap_gather() helpers Andy Shevchenko
These helpers scatter or gather a bitmap with the help of the mask position bits parameter.

bitmap_scatter() does the following:

  src:  0000000001011010
                  ||||||
           +------+|||||
           |  +----+||||
           |  |+----+|||
           |  ||   +-+||
           |  ||   |  ||
  mask: ...v..vv...v..vv
        ...0..11...0..10
  dst:  0000001100000010

and bitmap_gather() performs this one:

  mask: ...v..vv...v..vv
  src:  0000001100000010
           ^  ^^   ^   0
           |  ||   |  10
           |  ||   > 010
           |  |+--> 1010
           |  +--> 11010
           +----> 011010
  dst:  0000000000011010

bitmap_gather() can be seen as the reverse of the bitmap_scatter() operation.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/lkml/20230926052007.3917389-3-andriy.shevchenko@linux.intel.com/
Co-developed-by: Herve Codina <herve.codina@bootlin.com>
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Acked-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
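The two operations can be modelled on plain Python integers to check the example in the commit message. This is a behavioural sketch only: the kernel helpers operate on unsigned long arrays with an explicit bit count, whereas these functions take and return integers.

```python
def bitmap_scatter(src, mask):
    """Spread the low bits of src onto the set bit positions of mask."""
    dst = 0
    n = 0    # index of the next source bit to place
    pos = 0  # current bit position in mask and dst
    while mask >> pos:
        if (mask >> pos) & 1:
            if (src >> n) & 1:
                dst |= 1 << pos
            n += 1
        pos += 1
    return dst


def bitmap_gather(src, mask):
    """Pack the bits of src found at the set positions of mask into the low bits."""
    dst = 0
    n = 0    # index of the next destination bit to fill
    pos = 0  # current bit position in mask and src
    while mask >> pos:
        if (mask >> pos) & 1:
            if (src >> pos) & 1:
                dst |= 1 << n
            n += 1
        pos += 1
    return dst


MASK = 0b0001001100010011
scattered = bitmap_scatter(0b0000000001011010, MASK)
gathered = bitmap_gather(scattered, MASK)
print(f"{scattered:016b}")
print(f"{gathered:016b}")
```

Note that gathering returns only as many bits as the mask has set positions: the source bit above the sixth position in the example is not part of the round trip.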
2024-03-11 MAINTAINERS: Add the Freescale QMC HDLC driver entry Herve Codina
After contributing the driver, add myself as the maintainer for the Freescale QMC HDLC driver. Signed-off-by: Herve Codina <herve.codina@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-11 net: wan: Add support for QMC HDLC Herve Codina
The QMC HDLC driver provides support for HDLC using the QMC (QUICC Multichannel Controller) to transfer the HDLC data. Signed-off-by: Herve Codina <herve.codina@bootlin.com> Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Acked-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-11 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue David S. Miller
Tony Nguyen says:

====================
ethtool: ice: Support for RSS settings to GTP

Takeru Hayasaka enables RSS functionality for GTP packets on ice driver with ethtool.

A user can include TEID and make RSS work for GTP-U over IPv4 by doing the following: `ethtool -N ens3 rx-flow-hash gtpu4 sde`

In addition to gtpu(4|6), we now support gtpc(4|6), gtpc(4|6)t, gtpu(4|6)e, gtpu(4|6)u, and gtpu(4|6)d.

gtpc(4|6): Used for GTP-C in IPv4 and IPv6, where the GTP header format does not include a TEID.
gtpc(4|6)t: Used for GTP-C in IPv4 and IPv6, with a GTP header format that includes a TEID.
gtpu(4|6): Used for GTP-U in both IPv4 and IPv6 scenarios.
gtpu(4|6)e: Used for GTP-U with extended headers in both IPv4 and IPv6.
gtpu(4|6)u: Used when the PSC (PDU session container) in the GTP-U extended header includes Uplink, applicable to both IPv4 and IPv6.
gtpu(4|6)d: Used when the PSC in the GTP-U extended header includes Downlink, for both IPv4 and IPv6.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-09 arm64, bpf: Use bpf_prog_pack for arm64 bpf trampoline Puranjay Mohan
We used bpf_prog_pack to aggregate bpf programs into huge pages to relieve the iTLB pressure on the system. This was merged for ARM64 [1]. We can apply it to the bpf trampoline as well. This would increase the performance of fentry and struct_ops programs. [1] https://lore.kernel.org/bpf/20240228141824.119877-1-puranjay12@gmail.com/ Signed-off-by: Puranjay Mohan <puranjay12@gmail.com> Reviewed-by: Pu Lehui <pulehui@huawei.com> Message-ID: <20240304202803.31400-1-puranjay12@gmail.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-08 Merge tag 'mlx5-socket-direct-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Jakub Kicinski
Saeed Mahameed says:

====================
Support Multi-PF netdev (Socket Direct)

This series adds support for combining multiple devices (PFs) of the same port under one netdev instance. Passing traffic through different devices belonging to different NUMA sockets saves cross-numa traffic and allows apps running on the same netdev from different numas to still feel a sense of proximity to the device and achieve improved performance.

We achieve this by grouping PFs together, and creating the netdev only once all group members are probed. Symmetrically, we destroy the netdev once any of the PFs is removed.

The channels are distributed between all devices, a proper configuration would utilize the correct close numa when working on a certain app/cpu.

We pick one device to be a primary (leader), and it fills a special role. The other devices (secondaries) are disconnected from the network in the chip level (set to silent mode). All RX/TX traffic is steered through the primary to/from the secondaries.

Currently, we limit the support to PFs only, and up to two devices (sockets).

* tag 'mlx5-socket-direct-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  Documentation: networking: Add description for multi-pf netdev
  net/mlx5: Enable SD feature
  net/mlx5e: Block TLS device offload on combined SD netdev
  net/mlx5e: Support per-mdev queue counter
  net/mlx5e: Support cross-vhca RSS
  net/mlx5e: Let channels be SD-aware
  net/mlx5e: Create EN core HW resources for all secondary devices
  net/mlx5e: Create single netdev per SD group
  net/mlx5: SD, Add debugfs
  net/mlx5: SD, Add informative prints in kernel log
  net/mlx5: SD, Implement steering for primary and secondaries
  net/mlx5: SD, Implement devcom communication and primary election
  net/mlx5: SD, Implement basic query and instantiation
  net/mlx5: SD, Introduce SD lib
  net/mlx5: Add MPIR bit in mcam_access_reg
====================

Link: https://lore.kernel.org/r/20240307084229.500776-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-08 Merge tag 'for-net-next-2024-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next Jakub Kicinski
Luiz Augusto von Dentz says:

====================
bluetooth-next pull request for net-next:

 - hci_conn: Only do ACL connections sequentially
 - hci_core: Cancel request on command timeout
 - Remove CONFIG_BT_HS
 - btrtl: Add the support for RTL8852BT/RTL8852BE-VT
 - btusb: Add support Mediatek MT7920
 - btusb: Add new VID/PID 13d3/3602 for MT7925
 - Add new quirk for broken read key length on ATS2851

* tag 'for-net-next-2024-03-08' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (52 commits)
  Bluetooth: hci_sync: Fix UAF in hci_acl_create_conn_sync
  Bluetooth: Fix eir name length
  Bluetooth: ISO: Align broadcast sync_timeout with connection timeout
  Bluetooth: Add new quirk for broken read key length on ATS2851
  Bluetooth: mgmt: remove NULL check in add_ext_adv_params_complete()
  Bluetooth: mgmt: remove NULL check in mgmt_set_connectable_complete()
  Bluetooth: btusb: Add support Mediatek MT7920
  Bluetooth: btmtk: Add MODULE_FIRMWARE() for MT7922
  Bluetooth: btnxpuart: Fix btnxpuart_close
  Bluetooth: ISO: Clean up returns values in iso_connect_ind()
  Bluetooth: fix use-after-free in accessing skb after sending it
  Bluetooth: af_bluetooth: Fix deadlock
  Bluetooth: bnep: Fix out-of-bound access
  Bluetooth: btusb: Fix memory leak
  Bluetooth: msft: Fix memory leak
  Bluetooth: hci_core: Fix possible buffer overflow
  Bluetooth: btrtl: fix out of bounds memory access
  Bluetooth: hci_h5: Add ability to allocate memory for private data
  Bluetooth: hci_sync: Fix overwriting request callback
  Bluetooth: hci_sync: Use QoS to determine which PHY to scan
  ...
====================

Link: https://lore.kernel.org/r/20240308181056.120547-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-08  Merge tag 'ieee802154-for-net-next-2024-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan-next  Jakub Kicinski

Stefan Schmidt says:

====================
pull-request: ieee802154-next 2024-03-07

Various cross tree patches for ieee802154 drivers and a resource leak fix for ieee802154 llsec.

Andy Shevchenko changed GPIO header usage for at86rf230 and mcr20a to only include needed headers.

Bo Liu converted the at86rf230, mcr20a and mrf24j40 driver regmap support to use the maple tree register cache.

Fedor Pchelkin fixed a resource leak in the llsec key deletion path.

Ricardo B. Marliere made wpan_phy_class const.

Tejun Heo removed WQ_UNBOUND from a workqueue call in ca8210.

* tag 'ieee802154-for-net-next-2024-03-07' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan-next:
  ieee802154: cfg802154: make wpan_phy_class constant
  ieee802154: mcr20a: Remove unused of_gpio.h
  ieee802154: at86rf230: Replace of_gpio.h by proper one
  mac802154: fix llsec key resources release in mac802154_llsec_key_del
  ieee802154: ca8210: Drop spurious WQ_UNBOUND from alloc_ordered_workqueue() call
  net: ieee802154: mrf24j40: convert to use maple tree register cache
  net: ieee802154: mcr20a: convert to use maple tree register cache
  net: ieee802154: at86rf230: convert to use maple tree register cache
====================

Link: https://lore.kernel.org/r/20240307195105.292085-1-stefan@datenfreihafen.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-08  tools: ynl: Fix spelling mistake "Constructred" -> "Constructed"  Colin Ian King

There is a spelling mistake in an error message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20240308084458.2045266-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-08  ipv4: raw: check sk->sk_rcvbuf earlier  Eric Dumazet

There is no point cloning an skb and having to free the clone if the receive queue of the raw socket is full.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20240307163020.2524409-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-08  ipv6: raw: check sk->sk_rcvbuf earlier  Eric Dumazet

There is no point cloning an skb and having to free the clone if the receive queue of the raw socket is full.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20240307162943.2523817-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
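
The reasoning behind the two raw-socket commits above (check the receive buffer limit before cloning, so that a full queue costs no allocation) can be illustrated with a small userspace sketch. This is not the kernel code: `struct pkt`, `struct rx_sock`, `pkt_clone()` and `deliver()` are hypothetical stand-ins for an skb, a raw socket, the clone step and the delivery path, and the field names only loosely mirror the kernel's receive-memory accounting.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a packet buffer (not struct sk_buff). */
struct pkt {
    size_t len;
    unsigned char *data;
};

/* Hypothetical stand-in for a receiving socket (not struct sock). */
struct rx_sock {
    size_t rmem_alloc;    /* bytes currently queued */
    size_t rcvbuf;        /* receive queue limit in bytes */
    unsigned long drops;  /* packets dropped */
    unsigned long clones; /* copies actually made */
};

/* Copy a packet; this is the work we want to skip when the queue is full. */
static struct pkt *pkt_clone(const struct pkt *p)
{
    struct pkt *c = malloc(sizeof(*c));

    if (!c)
        return NULL;
    c->data = malloc(p->len ? p->len : 1);
    if (!c->data) {
        free(c);
        return NULL;
    }
    if (p->len)
        memcpy(c->data, p->data, p->len);
    c->len = p->len;
    return c;
}

static void pkt_free(struct pkt *p)
{
    if (p) {
        free(p->data);
        free(p);
    }
}

/* Deliver a copy of p to sk. Returns true if the copy was accepted. */
static bool deliver(struct rx_sock *sk, const struct pkt *p)
{
    struct pkt *c;

    /* Early check: a full queue is counted as a drop before any cloning. */
    if (sk->rmem_alloc >= sk->rcvbuf) {
        sk->drops++;
        return false;
    }

    c = pkt_clone(p);
    if (!c) {
        sk->drops++;
        return false;
    }
    sk->clones++;
    sk->rmem_alloc += c->len;

    /* A real receive queue would keep c; the sketch only does the accounting. */
    pkt_free(c);
    return true;
}
```

With a 16-byte limit and 8-byte packets, the first two calls to `deliver()` clone and account for the packet, while the third returns early and only increments `drops`, without ever calling `pkt_clone()`.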