summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-10-10x86/bugs: Do not use UNTRAIN_RET with IBPB on entryJohannes Wikner
Since X86_FEATURE_ENTRY_IBPB will invalidate all harmful predictions with IBPB, no software-based untraining of returns is needed anymore. Currently, this change affects retbleed and SRSO mitigations so if either of the mitigations is doing IBPB and the other one does the software sequence, the latter is not needed anymore. [ bp: Massage commit message. ] Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Johannes Wikner <kwikner@ethz.ch> Cc: <stable@kernel.org>
2024-10-10x86/bugs: Skip RSB fill at VMEXITJohannes Wikner
entry_ibpb() is designed to follow Intel's IBPB specification regardless of CPU. This includes invalidating RSB entries. Hence, if IBPB on VMEXIT has been selected, entry_ibpb() as part of the RET untraining in the VMEXIT path will take care of all BTB and RSB clearing so there's no need to explicitly fill the RSB anymore. [ bp: Massage commit message. ] Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Johannes Wikner <kwikner@ethz.ch> Cc: <stable@kernel.org>
2024-10-10x86/entry: Have entry_ibpb() invalidate return predictionsJohannes Wikner
entry_ibpb() should invalidate all indirect predictions, including return target predictions. Not all IBPB implementations do this, in which case the fallback is RSB filling. Prevent SRSO-style hijacks of return predictions following IBPB, as the return target predictor can be corrupted before the IBPB completes. [ bp: Massage. ] Signed-off-by: Johannes Wikner <kwikner@ethz.ch> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org>
2024-10-10x86/cpufeatures: Add a IBPB_NO_RET BUG flagJohannes Wikner
Set this flag if the CPU has an IBPB implementation that does not invalidate return target predictions. Zen generations < 4 do not flush the RSB when executing an IBPB and this bug flag denotes that. [ bp: Massage. ] Signed-off-by: Johannes Wikner <kwikner@ethz.ch> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org>
2024-10-10x86/cpufeatures: Define X86_FEATURE_AMD_IBPB_RETJim Mattson
AMD's initial implementation of IBPB did not clear the return address predictor. Beginning with Zen4, AMD's IBPB *does* clear the return address predictor. This behavior is enumerated by CPUID.80000008H:EBX.IBPB_RET[30]. Define X86_FEATURE_AMD_IBPB_RET for use in KVM_GET_SUPPORTED_CPUID, when determining cross-vendor capabilities. Suggested-by: Venkatesh Srinivas <venkateshs@chromium.org> Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@kernel.org>
2024-10-10drm/fbdev-dma: Only cleanup deferred I/O if necessaryJanne Grunau
Commit 5a498d4d06d6 ("drm/fbdev-dma: Only install deferred I/O if necessary") initializes deferred I/O only if it is used. drm_fbdev_dma_fb_destroy() however calls fb_deferred_io_cleanup() unconditionally with struct fb_info.fbdefio == NULL. KASAN with the out-of-tree Apple silicon display driver posts following warning from __flush_work() of a random struct work_struct instead of the expected NULL pointer derefs. [ 22.053799] ------------[ cut here ]------------ [ 22.054832] WARNING: CPU: 2 PID: 1 at kernel/workqueue.c:4177 __flush_work+0x4d8/0x580 [ 22.056597] Modules linked in: uhid bnep uinput nls_ascii ip6_tables ip_tables i2c_dev loop fuse dm_multipath nfnetlink zram hid_magicmouse btrfs xor xor_neon brcmfmac_wcc raid6_pq hci_bcm4377 bluetooth brcmfmac hid_apple brcmutil nvmem_spmi_mfd simple_mfd_spmi dockchannel_hid cfg80211 joydev regmap_spmi nvme_apple ecdh_generic ecc macsmc_hid rfkill dwc3 appledrm snd_soc_macaudio macsmc_power nvme_core apple_isp phy_apple_atc apple_sart apple_rtkit_helper apple_dockchannel tps6598x macsmc_hwmon snd_soc_cs42l84 videobuf2_v4l2 spmi_apple_controller nvmem_apple_efuses videobuf2_dma_sg apple_z2 videobuf2_memops spi_nor panel_summit videobuf2_common asahi videodev pwm_apple apple_dcp snd_soc_apple_mca apple_admac spi_apple clk_apple_nco i2c_pasemi_platform snd_pcm_dmaengine mc i2c_pasemi_core mux_core ofpart adpdrm drm_dma_helper apple_dart apple_soc_cpufreq leds_pwm phram [ 22.073768] CPU: 2 UID: 0 PID: 1 Comm: systemd-shutdow Not tainted 6.11.2-asahi+ #asahi-dev [ 22.075612] Hardware name: Apple MacBook Pro (13-inch, M2, 2022) (DT) [ 22.077032] pstate: 01400005 (nzcv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) [ 22.078567] pc : __flush_work+0x4d8/0x580 [ 22.079471] lr : __flush_work+0x54/0x580 [ 22.080345] sp : ffffc000836ef820 [ 22.081089] x29: ffffc000836ef880 x28: 0000000000000000 x27: ffff80002ddb7128 [ 22.082678] x26: dfffc00000000000 x25: 1ffff000096f0c57 x24: ffffc00082d3e358 [ 22.084263] x23: ffff80004b7862b8 x22: dfffc00000000000 x21: ffff80005aa1d470 [ 22.085855] x20: ffff80004b786000 x19: ffff80004b7862a0 x18: 0000000000000000 [ 22.087439] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000005 [ 22.089030] x14: 1ffff800106ddf0a x13: 0000000000000000 x12: 0000000000000000 [ 22.090618] x11: ffffb800106ddf0f x10: dfffc00000000000 x9 : 1ffff800106ddf0e [ 22.092206] x8 : 0000000000000000 x7 : aaaaaaaaaaaaaaaa x6 : 0000000000000001 [ 22.093790] x5 : ffffc000836ef728 x4 : 0000000000000000 x3 : 0000000000000020 [ 22.095368] x2 : 0000000000000008 x1 : 00000000000000aa x0 : 0000000000000000 [ 22.096955] Call trace: [ 22.097505] __flush_work+0x4d8/0x580 [ 22.098330] flush_delayed_work+0x80/0xb8 [ 22.099231] fb_deferred_io_cleanup+0x3c/0x130 [ 22.100217] drm_fbdev_dma_fb_destroy+0x6c/0xe0 [drm_dma_helper] [ 22.101559] unregister_framebuffer+0x210/0x2f0 [ 22.102575] drm_fb_helper_unregister_info+0x48/0x60 [ 22.103683] drm_fbdev_dma_client_unregister+0x4c/0x80 [drm_dma_helper] [ 22.105147] drm_client_dev_unregister+0x1cc/0x230 [ 22.106217] drm_dev_unregister+0x58/0x570 [ 22.107125] apple_drm_unbind+0x50/0x98 [appledrm] [ 22.108199] component_del+0x1f8/0x3a8 [ 22.109042] dcp_platform_shutdown+0x24/0x38 [apple_dcp] [ 22.110357] platform_shutdown+0x70/0x90 [ 22.111219] device_shutdown+0x368/0x4d8 [ 22.112095] kernel_restart+0x6c/0x1d0 [ 22.112946] __arm64_sys_reboot+0x1c8/0x328 [ 22.113868] invoke_syscall+0x78/0x1a8 [ 22.114703] do_el0_svc+0x124/0x1a0 [ 22.115498] el0_svc+0x3c/0xe0 [ 22.116181] el0t_64_sync_handler+0x70/0xc0 [ 22.117110] el0t_64_sync+0x190/0x198 [ 22.117931] ---[ end trace 0000000000000000 ]--- Signed-off-by: Janne Grunau <j@jannau.net> Fixes: 5a498d4d06d6 ("drm/fbdev-dma: Only install deferred I/O if necessary") Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/ZwLNuZL-8Gh5UUQb@robin
2024-10-09of: Fix unbalanced of node refcount and memory leaksJinjie Ruan
Got following report when doing overlay_test: OF: ERROR: memory leak, expected refcount 1 instead of 2, of_node_get()/of_node_put() unbalanced - destroy cset entry: attach overlay node /kunit-test OF: ERROR: memory leak before free overlay changeset, /kunit-test In of_overlay_apply_kunit_cleanup(), the "np" should be associated with fake instead of test to call of_node_put(), so the node is put before the overlay is removed. It also fix the following memory leaks: unreferenced object 0xffffff80c7d22800 (size 256): comm "kunit_try_catch", pid 236, jiffies 4294894764 hex dump (first 32 bytes): d0 26 d4 c2 80 ff ff ff 00 00 00 00 00 00 00 00 .&.............. 60 19 75 c1 80 ff ff ff 00 00 00 00 00 00 00 00 `.u............. backtrace (crc ee0a471c): [<0000000058ea1340>] kmemleak_alloc+0x34/0x40 [<00000000c538ac7e>] __kmalloc_cache_noprof+0x26c/0x2f4 [<00000000119f34f3>] __of_node_dup+0x4c/0x328 [<00000000b212ca39>] build_changeset_next_level+0x2cc/0x4c0 [<00000000eb208e87>] of_overlay_fdt_apply+0x930/0x1334 [<000000005bdc53a3>] of_overlay_fdt_apply_kunit+0x54/0x10c [<00000000143acd5d>] of_overlay_apply_kunit_cleanup+0x12c/0x524 [<00000000a813abc8>] kunit_try_run_case+0x13c/0x3ac [<00000000d77ab00c>] kunit_generic_run_threadfn_adapter+0x80/0xec [<000000000b296be1>] kthread+0x2e8/0x374 [<0000000007bd1c51>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80c1751960 (size 16): comm "kunit_try_catch", pid 236, jiffies 4294894764 hex dump (first 16 bytes): 6b 75 6e 69 74 2d 74 65 73 74 00 c1 80 ff ff ff kunit-test...... backtrace (crc 18196259): [<0000000058ea1340>] kmemleak_alloc+0x34/0x40 [<0000000071006e2c>] __kmalloc_node_track_caller_noprof+0x300/0x3e0 [<00000000b16ac6cb>] kstrdup+0x48/0x84 [<0000000050e3373b>] __of_node_dup+0x60/0x328 [<00000000b212ca39>] build_changeset_next_level+0x2cc/0x4c0 [<00000000eb208e87>] of_overlay_fdt_apply+0x930/0x1334 [<000000005bdc53a3>] of_overlay_fdt_apply_kunit+0x54/0x10c [<00000000143acd5d>] of_overlay_apply_kunit_cleanup+0x12c/0x524 [<00000000a813abc8>] kunit_try_run_case+0x13c/0x3ac [<00000000d77ab00c>] kunit_generic_run_threadfn_adapter+0x80/0xec [<000000000b296be1>] kthread+0x2e8/0x374 [<0000000007bd1c51>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80c2e96e00 (size 192): comm "kunit_try_catch", pid 236, jiffies 4294894764 hex dump (first 32 bytes): 80 19 75 c1 80 ff ff ff 0b 00 00 00 00 00 00 00 ..u............. a0 19 75 c1 80 ff ff ff 00 6f e9 c2 80 ff ff ff ..u......o...... backtrace (crc 1924cba4): [<0000000058ea1340>] kmemleak_alloc+0x34/0x40 [<00000000c538ac7e>] __kmalloc_cache_noprof+0x26c/0x2f4 [<000000009fdd35ad>] __of_prop_dup+0x7c/0x2ec [<00000000aa4e0111>] add_changeset_property+0x548/0x9e0 [<000000004777e25b>] build_changeset_next_level+0xd4/0x4c0 [<00000000a9c93f8a>] build_changeset_next_level+0x3a8/0x4c0 [<00000000eb208e87>] of_overlay_fdt_apply+0x930/0x1334 [<000000005bdc53a3>] of_overlay_fdt_apply_kunit+0x54/0x10c [<00000000143acd5d>] of_overlay_apply_kunit_cleanup+0x12c/0x524 [<00000000a813abc8>] kunit_try_run_case+0x13c/0x3ac [<00000000d77ab00c>] kunit_generic_run_threadfn_adapter+0x80/0xec [<000000000b296be1>] kthread+0x2e8/0x374 [<0000000007bd1c51>] ret_from_fork+0x10/0x20 unreferenced object 0xffffff80c1751980 (size 16): comm "kunit_try_catch", pid 236, jiffies 4294894764 hex dump (first 16 bytes): 63 6f 6d 70 61 74 69 62 6c 65 00 c1 80 ff ff ff compatible...... backtrace (crc 42df3c87): [<0000000058ea1340>] kmemleak_alloc+0x34/0x40 [<0000000071006e2c>] __kmalloc_node_track_caller_noprof+0x300/0x3e0 [<00000000b16ac6cb>] kstrdup+0x48/0x84 [<00000000a8888fd8>] __of_prop_dup+0xb0/0x2ec [<00000000aa4e0111>] add_changeset_property+0x548/0x9e0 [<000000004777e25b>] build_changeset_next_level+0xd4/0x4c0 [<00000000a9c93f8a>] build_changeset_next_level+0x3a8/0x4c0 [<00000000eb208e87>] of_overlay_fdt_apply+0x930/0x1334 [<000000005bdc53a3>] of_overlay_fdt_apply_kunit+0x54/0x10c [<00000000143acd5d>] of_overlay_apply_kunit_cleanup+0x12c/0x524 [<00000000a813abc8>] kunit_try_run_case+0x13c/0x3ac [<00000000d77ab00c>] kunit_generic_run_threadfn_adapter+0x80/0xec [<000000000b296be1>] kthread+0x2e8/0x374 unreferenced object 0xffffff80c2e96f00 (size 192): comm "kunit_try_catch", pid 236, jiffies 4294894764 hex dump (first 32 bytes): 40 f7 bb c6 80 ff ff ff 0b 00 00 00 00 00 00 00 @............... c0 19 75 c1 80 ff ff ff 00 00 00 00 00 00 00 00 ..u............. backtrace (crc f2f57ea7): [<0000000058ea1340>] kmemleak_alloc+0x34/0x40 [<00000000c538ac7e>] __kmalloc_cache_noprof+0x26c/0x2f4 [<000000009fdd35ad>] __of_prop_dup+0x7c/0x2ec [<00000000aa4e0111>] add_changeset_property+0x548/0x9e0 [<000000004777e25b>] build_changeset_next_level+0xd4/0x4c0 [<00000000a9c93f8a>] build_changeset_next_level+0x3a8/0x4c0 [<00000000eb208e87>] of_overlay_fdt_apply+0x930/0x1334 [<000000005bdc53a3>] of_overlay_fdt_apply_kunit+0x54/0x10c [<00000000143acd5d>] of_overlay_apply_kunit_cleanup+0x12c/0x524 [<00000000a813abc8>] kunit_try_run_case+0x13c/0x3ac [<00000000d77ab00c>] kunit_generic_run_threadfn_adapter+0x80/0xec [<000000000b296be1>] kthread+0x2e8/0x374 [<0000000007bd1c51>] ret_from_fork+0x10/0x20 ...... How to reproduce: CONFIG_OF_OVERLAY_KUNIT_TEST=y, CONFIG_DEBUG_KMEMLEAK=y and CONFIG_DEBUG_KMEMLEAK_AUTO_SCAN=y, launch the kernel. Fixes: 5c9dd72d8385 ("of: Add a KUnit test for overlays and test managed APIs") Reviewed-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Link: https://lore.kernel.org/r/20241010034416.2324196-1-ruanjinjie@huawei.com Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
2024-10-09Merge branch 'ipv4-namespacify-ipv4-address-hash-table'Jakub Kicinski
Kuniyuki Iwashima says: ==================== ipv4: Namespacify IPv4 address hash table. This is a prep of per-net RTNL conversion for RTM_(NEW|DEL|SET)ADDR. Currently, each IPv4 address is linked to the global hash table, and this needs to be protected by another global lock or namespacified to support per-net RTNL. Adding a global lock will cause deadlock in the rtnetlink path and GC, rtnetlink check_lifetime |- rtnl_net_lock(net) |- acquire the global lock |- acquire the global lock |- check ifa's netns `- put ifa into hash table `- rtnl_net_lock(net) so we need to namespacify the hash table. The IPv6 one is already namespacified, let's follow that. v2: https://lore.kernel.org/netdev/20241004195958.64396-1-kuniyu@amazon.com/ v1: https://lore.kernel.org/netdev/20241001024837.96425-1-kuniyu@amazon.com/ ==================== Link: https://patch.msgid.link/20241008172906.1326-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09ipv4: Retire global IPv4 hash table inet_addr_lst.Kuniyuki Iwashima
No one uses inet_addr_lst anymore, so let's remove it. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241008172906.1326-5-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09ipv4: Namespacify IPv4 address GC.Kuniyuki Iwashima
Each IPv4 address could have a lifetime, which is useful for DHCP, and GC is periodically executed as check_lifetime_work. check_lifetime() does the actual GC under RTNL. 1. Acquire RTNL 2. Iterate inet_addr_lst 3. Remove IPv4 address if expired 4. Release RTNL Namespacifying the GC is required for per-netns RTNL, but using the per-netns hash table will shorten the time on the hash bucket iteration under RTNL. Let's add per-netns GC work and use the per-netns hash table. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241008172906.1326-4-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09ipv4: Use per-netns hash table in inet_lookup_ifaddr_rcu().Kuniyuki Iwashima
Now, all IPv4 addresses are put in the per-netns hash table. Let's use it in inet_lookup_ifaddr_rcu(). Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241008172906.1326-3-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09ipv4: Link IPv4 address to per-netns hash table.Kuniyuki Iwashima
As a prep for per-netns RTNL conversion, we want to namespacify the IPv4 address hash table and the GC work. Let's allocate the per-netns IPv4 address hash table to net->ipv4.inet_addr_lst and link IPv4 addresses into it. The actual users will be converted later. Note that the IPv6 address hash table is already namespacified. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241008172906.1326-2-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-10-08 (ice, iavf, igb, e1000e, e1000) This series contains updates to ice, iavf, igb, e1000e, and e1000 drivers. For ice: Wojciech adds support for ethtool reset. Paul adds support for hardware based VF mailbox limits for E830 devices. Jake adjusts to a common iterator in ice_vc_cfg_qs_msg() and moves storing of max_frame and rx_buf_len from VSI struct to the ring structure. Hongbo Li uses assign_bit() to replace an open-coded instance. Markus Elfring adjusts a couple of PTP error paths to use a common, shared exit point. Yue Haibing removes unused declarations. For iavf: Yue Haibing removes unused declarations. For igb: Yue Haibing removes unused declarations. For e1000e: Takamitsu Iwai removes unneccessary writel() calls. Joe Damato adds support for netdev-genl support to query IRQ, NAPI, and queue information. For e1000: Joe Damato adds support for netdev-genl support to query IRQ, NAPI, and queue information. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: e1000: Link NAPI instances to queues and IRQs e1000e: Link NAPI instances to queues and IRQs e1000e: Remove duplicated writel() in e1000_configure_tx/rx() igb: Cleanup unused declarations iavf: Remove unused declarations ice: Cleanup unused declarations ice: Use common error handling code in two functions ice: Make use of assign_bit() API ice: store max_frame and rx_buf_len only in ice_rx_ring ice: consistently use q_idx in ice_vc_cfg_qs_msg() ice: add E830 HW VF mailbox message limit support ice: Implement ethtool reset support ==================== Link: https://patch.msgid.link/20241008233441.928802-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-10-08 (ice, i40e, igb, e1000e) This series contains updates to ice, i40e, igb, and e1000e drivers. For ice: Marcin allows driver to load, into safe mode, when DDP package is missing or corrupted and adjusts the netif_is_ice() check to account for when the device is in safe mode. He also fixes an out-of-bounds issue when MSI-X are increased for VFs. Wojciech clears FDB entries on reset to match the hardware state. For i40e: Aleksandr adds locking around MACVLAN filters to prevent memory leaks due to concurrency issues. For igb: Mohamed Khalfella adds a check to not attempt to bring up an already running interface on non-fatal PCIe errors. For e1000e: Vitaly changes board type for I219 to more closely match the hardware and stop PHY issues. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: e1000e: change I219 (19) devices to ADP igb: Do not bring the device up after non-fatal error i40e: Fix macvlan leak by synchronizing access to mac_filter_hash ice: Fix increasing MSI-X on VF ice: Flush FDB entries before reset ice: Fix netif_is_ice() in Safe Mode ice: Fix entering Safe Mode ==================== Link: https://patch.msgid.link/20241008230050.928245-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09Fix misspelling of "accept*" in netAlexander Zubkov
Several files have "accept*" misspelled as "accpet*" in the comments. Fix all such occurrences. Signed-off-by: Alexander Zubkov <green@qrator.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241008162756.22618-2-green@qrator.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09net_sched: sch_sfq: handle bigger packetsEric Dumazet
SFQ has an assumption on dealing with packets smaller than 64KB. Even before BIG TCP, TCA_STAB can provide arbitrary big values in qdisc_pkt_len(skb) It is time to switch (struct sfq_slot)->allot to a 32bit field. sizeof(struct sfq_slot) is now 64 bytes, giving better cache locality. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20241008111603.653140-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09net: stmmac: Add DW QoS Eth v4/v5 ip payload error statisticsMinda Chen
Add DW QoS Eth v4/v5 ip payload error statistics, and rename descriptor bit macro because v4/v5 descriptor IPCE bit claims ip checksum error or TCP/UDP/ICMP segment length error. Here is bit description from DW QoS Eth data book(Part 19.6.2.2) bit7 IPCE: IP Payload Error When this bit is programmed, it indicates either of the following: 1).The 16-bit IP payload checksum (that is, the TCP, UDP, or ICMP checksum) calculated by the MAC does not match the corresponding checksum field in the received segment. 2).The TCP, UDP, or ICMP segment length does not match the payload length value in the IP Header field. 3).The TCP, UDP, or ICMP segment length is less than minimum allowed segment length for TCP, UDP, or ICMP. Signed-off-by: Minda Chen <minda.chen@starfivetech.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Link: https://patch.msgid.link/20241008111443.81467-1-minda.chen@starfivetech.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09Merge branch 'mptcp-misc-fixes-involving-fallback-to-tcp'Jakub Kicinski
Matthieu Baerts says: ==================== mptcp: misc. fixes involving fallback to TCP - Patch 1: better handle DSS corruptions from a bugged peer: reducing warnings, doing a fallback or a reset depending on the subflow state. For >= v5.7. - Patch 2: fix DSS corruption due to large pmtu xmit, where MPTCP was not taken into account. For >= v5.6. - Patch 3: fallback when MPTCP opts are dropped after the first data packet, instead of resetting the connection. For >= v5.6. - Patch 4: restrict the removal of a subflow to other closing states, a better fix, for a recent one. For >= v5.10. ==================== Link: https://patch.msgid.link/20241008-net-mptcp-fallback-fixes-v1-0-c6fb8e93e551@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09mptcp: pm: do not remove closing subflowsMatthieu Baerts (NGI0)
In a previous fix, the in-kernel path-manager has been modified not to retrigger the removal of a subflow if it was already closed, e.g. when the initial subflow is removed, but kept in the subflows list. To be complete, this fix should also skip the subflows that are in any closing state: mptcp_close_ssk() will initiate the closure, but the switch to the TCP_CLOSE state depends on the other peer. Fixes: 58e1b66b4e4b ("mptcp: pm: do not remove already closed subflows") Cc: stable@vger.kernel.org Suggested-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241008-net-mptcp-fallback-fixes-v1-4-c6fb8e93e551@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09mptcp: fallback when MPTCP opts are dropped after 1st dataMatthieu Baerts (NGI0)
As reported by Christoph [1], before this patch, an MPTCP connection was wrongly reset when a host received a first data packet with MPTCP options after the 3wHS, but got the next ones without. According to the MPTCP v1 specs [2], a fallback should happen in this case, because the host didn't receive a DATA_ACK from the other peer, nor receive data for more than the initial window which implies a DATA_ACK being received by the other peer. The patch here re-uses the same logic as the one used in other places: by looking at allow_infinite_fallback, which is disabled at the creation of an additional subflow. It's not looking at the first DATA_ACK (or implying one received from the other side) as suggested by the RFC, but it is in continuation with what was already done, which is safer, and it fixes the reported issue. The next step, looking at this first DATA_ACK, is tracked in [4]. This patch has been validated using the following Packetdrill script: 0 socket(..., SOCK_STREAM, IPPROTO_MPTCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 // 3WHS is OK +0.0 < S 0:0(0) win 65535 <mss 1460, sackOK, nop, nop, nop, wscale 6, mpcapable v1 flags[flag_h] nokey> +0.0 > S. 0:0(0) ack 1 <mss 1460, nop, nop, sackOK, nop, wscale 8, mpcapable v1 flags[flag_h] key[skey]> +0.1 < . 1:1(0) ack 1 win 2048 <mpcapable v1 flags[flag_h] key[ckey=2, skey]> +0 accept(3, ..., ...) = 4 // Data from the client with valid MPTCP options (no DATA_ACK: normal) +0.1 < P. 1:501(500) ack 1 win 2048 <mpcapable v1 flags[flag_h] key[skey, ckey] mpcdatalen 500, nop, nop> // From here, the MPTCP options will be dropped by a middlebox +0.0 > . 1:1(0) ack 501 <dss dack8=501 dll=0 nocs> +0.1 read(4, ..., 500) = 500 +0 write(4, ..., 100) = 100 // The server replies with data, still thinking MPTCP is being used +0.0 > P. 1:101(100) ack 501 <dss dack8=501 dsn8=1 ssn=1 dll=100 nocs, nop, nop> // But the client already did a fallback to TCP, because the two previous packets have been received without MPTCP options +0.1 < . 501:501(0) ack 101 win 2048 +0.0 < P. 501:601(100) ack 101 win 2048 // The server should fallback to TCP, not reset: it didn't get a DATA_ACK, nor data for more than the initial window +0.0 > . 101:101(0) ack 601 Note that this script requires Packetdrill with MPTCP support, see [3]. Fixes: dea2b1ea9c70 ("mptcp: do not reset MP_CAPABLE subflow on mapping errors") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/518 [1] Link: https://datatracker.ietf.org/doc/html/rfc8684#name-fallback [2] Link: https://github.com/multipath-tcp/packetdrill [3] Link: https://github.com/multipath-tcp/mptcp_net-next/issues/519 [4] Reviewed-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241008-net-mptcp-fallback-fixes-v1-3-c6fb8e93e551@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09tcp: fix mptcp DSS corruption due to large pmtu xmitPaolo Abeni
Syzkaller was able to trigger a DSS corruption: TCP: request_sock_subflow_v4: Possible SYN flooding on port [::]:20002. Sending cookies. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 5227 at net/mptcp/protocol.c:695 __mptcp_move_skbs_from_subflow+0x20a9/0x21f0 net/mptcp/protocol.c:695 Modules linked in: CPU: 0 UID: 0 PID: 5227 Comm: syz-executor350 Not tainted 6.11.0-syzkaller-08829-gaf9c191ac2a0 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024 RIP: 0010:__mptcp_move_skbs_from_subflow+0x20a9/0x21f0 net/mptcp/protocol.c:695 Code: 0f b6 dc 31 ff 89 de e8 b5 dd ea f5 89 d8 48 81 c4 50 01 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 98 da ea f5 90 <0f> 0b 90 e9 47 ff ff ff e8 8a da ea f5 90 0f 0b 90 e9 99 e0 ff ff RSP: 0018:ffffc90000006db8 EFLAGS: 00010246 RAX: ffffffff8ba9df18 RBX: 00000000000055f0 RCX: ffff888030023c00 RDX: 0000000000000100 RSI: 00000000000081e5 RDI: 00000000000055f0 RBP: 1ffff110062bf1ae R08: ffffffff8ba9cf12 R09: 1ffff110062bf1b8 R10: dffffc0000000000 R11: ffffed10062bf1b9 R12: 0000000000000000 R13: dffffc0000000000 R14: 00000000700cec61 R15: 00000000000081e5 FS: 000055556679c380(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000020287000 CR3: 0000000077892000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <IRQ> move_skbs_to_msk net/mptcp/protocol.c:811 [inline] mptcp_data_ready+0x29c/0xa90 net/mptcp/protocol.c:854 subflow_data_ready+0x34a/0x920 net/mptcp/subflow.c:1490 tcp_data_queue+0x20fd/0x76c0 net/ipv4/tcp_input.c:5283 tcp_rcv_established+0xfba/0x2020 net/ipv4/tcp_input.c:6237 tcp_v4_do_rcv+0x96d/0xc70 net/ipv4/tcp_ipv4.c:1915 tcp_v4_rcv+0x2dc0/0x37f0 net/ipv4/tcp_ipv4.c:2350 ip_protocol_deliver_rcu+0x22e/0x440 net/ipv4/ip_input.c:205 ip_local_deliver_finish+0x341/0x5f0 net/ipv4/ip_input.c:233 NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314 NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314 __netif_receive_skb_one_core net/core/dev.c:5662 [inline] __netif_receive_skb+0x2bf/0x650 net/core/dev.c:5775 process_backlog+0x662/0x15b0 net/core/dev.c:6107 __napi_poll+0xcb/0x490 net/core/dev.c:6771 napi_poll net/core/dev.c:6840 [inline] net_rx_action+0x89b/0x1240 net/core/dev.c:6962 handle_softirqs+0x2c5/0x980 kernel/softirq.c:554 do_softirq+0x11b/0x1e0 kernel/softirq.c:455 </IRQ> <TASK> __local_bh_enable_ip+0x1bb/0x200 kernel/softirq.c:382 local_bh_enable include/linux/bottom_half.h:33 [inline] rcu_read_unlock_bh include/linux/rcupdate.h:919 [inline] __dev_queue_xmit+0x1764/0x3e80 net/core/dev.c:4451 dev_queue_xmit include/linux/netdevice.h:3094 [inline] neigh_hh_output include/net/neighbour.h:526 [inline] neigh_output include/net/neighbour.h:540 [inline] ip_finish_output2+0xd41/0x1390 net/ipv4/ip_output.c:236 ip_local_out net/ipv4/ip_output.c:130 [inline] __ip_queue_xmit+0x118c/0x1b80 net/ipv4/ip_output.c:536 __tcp_transmit_skb+0x2544/0x3b30 net/ipv4/tcp_output.c:1466 tcp_transmit_skb net/ipv4/tcp_output.c:1484 [inline] tcp_mtu_probe net/ipv4/tcp_output.c:2547 [inline] tcp_write_xmit+0x641d/0x6bf0 net/ipv4/tcp_output.c:2752 __tcp_push_pending_frames+0x9b/0x360 net/ipv4/tcp_output.c:3015 tcp_push_pending_frames include/net/tcp.h:2107 [inline] tcp_data_snd_check net/ipv4/tcp_input.c:5714 [inline] tcp_rcv_established+0x1026/0x2020 net/ipv4/tcp_input.c:6239 tcp_v4_do_rcv+0x96d/0xc70 net/ipv4/tcp_ipv4.c:1915 sk_backlog_rcv include/net/sock.h:1113 [inline] __release_sock+0x214/0x350 net/core/sock.c:3072 release_sock+0x61/0x1f0 net/core/sock.c:3626 mptcp_push_release net/mptcp/protocol.c:1486 [inline] __mptcp_push_pending+0x6b5/0x9f0 net/mptcp/protocol.c:1625 mptcp_sendmsg+0x10bb/0x1b10 net/mptcp/protocol.c:1903 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0x1a6/0x270 net/socket.c:745 ____sys_sendmsg+0x52a/0x7e0 net/socket.c:2603 ___sys_sendmsg net/socket.c:2657 [inline] __sys_sendmsg+0x2aa/0x390 net/socket.c:2686 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7fb06e9317f9 Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffe2cfd4f98 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007fb06e97f468 RCX: 00007fb06e9317f9 RDX: 0000000000000000 RSI: 0000000020000080 RDI: 0000000000000005 RBP: 00007fb06e97f446 R08: 0000555500000000 R09: 0000555500000000 R10: 0000555500000000 R11: 0000000000000246 R12: 00007fb06e97f406 R13: 0000000000000001 R14: 00007ffe2cfd4fe0 R15: 0000000000000003 </TASK> Additionally syzkaller provided a nice reproducer. The repro enables pmtu on the loopback device, leading to tcp_mtu_probe() generating very large probe packets. tcp_can_coalesce_send_queue_head() currently does not check for mptcp-level invariants, and allowed the creation of cross-DSS probes, leading to the mentioned corruption. Address the issue teaching tcp_can_coalesce_send_queue_head() about mptcp using the tcp_skb_can_collapse(), also reducing the code duplication. Fixes: 85712484110d ("tcp: coalesce/collapse must respect MPTCP extensions") Cc: stable@vger.kernel.org Reported-by: syzbot+d1bff73460e33101f0e7@syzkaller.appspotmail.com Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/513 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241008-net-mptcp-fallback-fixes-v1-2-c6fb8e93e551@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09mptcp: handle consistently DSS corruptionPaolo Abeni
Bugged peer implementation can send corrupted DSS options, consistently hitting a few warning in the data path. Use DEBUG_NET assertions, to avoid the splat on some builds and handle consistently the error, dumping related MIBs and performing fallback and/or reset according to the subflow type. Fixes: 6771bfd9ee24 ("mptcp: update mptcp ack sequence from work queue") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20241008-net-mptcp-fallback-fixes-v1-1-c6fb8e93e551@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09net: netconsole: fix wrong warningBreno Leitao
A warning is triggered when there is insufficient space in the buffer for userdata. However, this is not an issue since userdata will be sent in the next iteration. Current warning message: ------------[ cut here ]------------ WARNING: CPU: 13 PID: 3013042 at drivers/net/netconsole.c:1122 write_ext_msg+0x3b6/0x3d0 ? write_ext_msg+0x3b6/0x3d0 console_flush_all+0x1e9/0x330 The code incorrectly issues a warning when this_chunk is zero, which is a valid scenario. The warning should only be triggered when this_chunk is negative. Fixes: 1ec9daf95093 ("net: netconsole: append userdata to fragmented netconsole messages") Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241008094325.896208-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09net: dsa: refuse cross-chip mirroring operationsVladimir Oltean
In case of a tc mirred action from one switch to another, the behavior is not correct. We simply tell the source switch driver to program a mirroring entry towards mirror->to_local_port = to_dp->index, but it is not even guaranteed that the to_dp belongs to the same switch as dp. For proper cross-chip support, we would need to go through the cross-chip notifier layer in switch.c, program the entry on cascade ports, and introduce new, explicit API for cross-chip mirroring, given that intermediary switches should have introspection into the DSA tags passed through the cascade port (and not just program a port mirror on the entire cascade port). None of that exists today. Reject what is not implemented so that user space is not misled into thinking it works. Fixes: f50f212749e8 ("net: dsa: Add plumbing for port mirroring") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241008094320.3340980-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09ipv6: Remove redundant unlikely()Tobias Klauser
IS_ERR_OR_NULL() already implies unlikely(). Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241008085454.8087-1-tklauser@distanz.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09net: fec: don't save PTP state if PTP is unsupportedWei Fang
Some platforms (such as i.MX25 and i.MX27) do not support PTP, so on these platforms fec_ptp_init() is not called and the related members in fep are not initialized. However, fec_ptp_save_state() is called unconditionally, which causes the kernel to panic. Therefore, add a condition so that fec_ptp_save_state() is not called if PTP is not supported. Fixes: a1477dc87dc4 ("net: fec: Restart PPS after link state change") Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/lkml/353e41fe-6bb4-4ee9-9980-2da2a9c1c508@roeck-us.net/ Signed-off-by: Wei Fang <wei.fang@nxp.com> Reviewed-by: Csókás, Bence <csokas.bence@prolan.hu> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Guenter Roeck <linux@roeck-us.net> Link: https://patch.msgid.link/20241008061153.1977930-1-wei.fang@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09ipv6: switch inet6_acaddr_hash() to less predictable hashEric Dumazet
commit 2384d02520ff ("net/ipv6: Add anycast addresses to a global hashtable") added inet6_acaddr_hash(), using ipv6_addr_hash() and net_hash_mix() to get hash spreading for typical users. However ipv6_addr_hash() is highly predictable and a malicious user could abuse a specific hash bucket. Switch to __ipv6_addr_jhash(). We could use a dedicated secret, or reuse net_hash_mix() as I did in this patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241008121307.800040-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09ipv6: switch inet6_addr_hash() to less predictable hashEric Dumazet
In commit 3f27fb23219e ("ipv6: addrconf: add per netns perturbation in inet6_addr_hash()"), I added net_hash_mix() in inet6_addr_hash() to get better hash dispersion, at a time all netns were sharing the hash table. Since then, commit 21a216a8fc63 ("ipv6/addrconf: allocate a per netns hash table") made the hash table per netns. We could remove the net_hash_mix() from inet6_addr_hash(), but there is still an issue with ipv6_addr_hash(). It is highly predictable and a malicious user can easily create thousands of IPv6 addresses all stored in the same hash bucket. Switch to __ipv6_addr_jhash(). We could use a dedicated secret, or reuse net_hash_mix() as I did in this patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20241008120101.734521-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09net: airoha: Fix EGRESS_RATE_METER_EN_MASK definitionLorenzo Bianconi
Fix typo in EGRESS_RATE_METER_EN_MASK mask definition. This bus in not introducing any user visible problem since, even if we are setting EGRESS_RATE_METER_EN_MASK bit in REG_EGRESS_RATE_METER_CFG register, egress QoS metering is not supported yet since we are missing some other hw configurations (e.g token bucket rate, token bucket size). Introduced by commit 23020f049327 ("net: airoha: Introduce ethernet support for EN7581 SoC") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241009-airoha-fixes-v2-1-18af63ec19bf@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09net: ibm: emac: mal: add dcr_unmap to _removeRosen Penev
It's done in probe so it should be undone here. Fixes: 1d3bb996481e ("Device tree aware EMAC driver") Signed-off-by: Rosen Penev <rosenp@gmail.com> Reviewed-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20241008233050.9422-1-rosenp@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09net: liquidio: Remove unused cn23xx_dump_pf_initialized_regsDr. David Alan Gilbert
cn23xx_dump_pf_initialized_regs() was added in 2016's commit 72c0091293c0 ("liquidio: CN23XX device init and sriov config") but hasn't been used. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241009003841.254853-1-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09ksmbd: fix user-after-free from session log offNamjae Jeon
There is racy issue between smb2 session log off and smb2 session setup. It will cause user-after-free from session log off. This add session_lock when setting SMB2_SESSION_EXPIRED and referece count to session struct not to free session while it is being used. Cc: stable@vger.kernel.org # v5.15+ Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-25282 Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2024-10-09selftests/bpf: Fix error compiling cgroup_ancestor.c with musl libcTony Ambardar
Existing code calls connect() with a 'struct sockaddr_in6 *' argument where a 'struct sockaddr *' argument is declared, yielding compile errors when building for mips64el/musl-libc: In file included from cgroup_ancestor.c:3: cgroup_ancestor.c: In function 'send_datagram': cgroup_ancestor.c:38:38: error: passing argument 2 of 'connect' from incompatible pointer type [-Werror=incompatible-pointer-types] 38 | if (!ASSERT_OK(connect(sock, &addr, sizeof(addr)), "connect")) { | ^~~~~ | | | struct sockaddr_in6 * ./test_progs.h:343:29: note: in definition of macro 'ASSERT_OK' 343 | long long ___res = (res); \ | ^~~ In file included from .../netinet/in.h:10, from .../arpa/inet.h:9, from ./test_progs.h:17: .../sys/socket.h:386:19: note: expected 'const struct sockaddr *' but argument is of type 'struct sockaddr_in6 *' 386 | int connect (int, const struct sockaddr *, socklen_t); | ^~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors This only compiles because of a glibc extension allowing declaration of the argument as a "transparent union" which includes both types above. Explicitly cast the argument to allow compiling for both musl and glibc. Cc: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Fixes: f957c230e173 ("selftests/bpf: convert test_skb_cgroup_id_user to test_progs") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Reviewed-by: Alexis Lothoré <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20241008231232.634047-1-tony.ambardar@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-09riscv, bpf: Fix possible infinite tailcall when CONFIG_CFI_CLANG is enabledPu Lehui
When CONFIG_CFI_CLANG is enabled, the number of prologue instructions skipped by tailcall needs to include the kcfi instruction, otherwise the TCC will be initialized every tailcall is called, which may result in infinite tailcalls. Fixes: e63985ecd226 ("bpf, riscv64/cfi: Support kCFI + BPF on riscv64") Signed-off-by: Pu Lehui <pulehui@huawei.com> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/r/20241008124544.171161-1-pulehui@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-09selftests/bpf: fix perf_event link info name_len assertionTyrone Wu
Fix `name_len` field assertions in `bpf_link_info.perf_event` for kprobe/uprobe/tracepoint to validate correct name size instead of 0. Fixes: 23cf7aa539dc ("selftests/bpf: Add selftest for fill_link_info") Signed-off-by: Tyrone Wu <wudevelops@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20241008164312.46269-2-wudevelops@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-09bpf: fix unpopulated name_len field in perf_event link infoTyrone Wu
Previously when retrieving `bpf_link_info.perf_event` for kprobe/uprobe/tracepoint, the `name_len` field was not populated by the kernel, leaving it to reflect the value initially set by the user. This behavior was inconsistent with how other input/output string buffer fields function (e.g. `raw_tracepoint.tp_name_len`). This patch fills `name_len` with the actual size of the string name. Fixes: 1b715e1b0ec5 ("bpf: Support ->fill_link_info for perf_event") Signed-off-by: Tyrone Wu <wudevelops@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/r/20241008164312.46269-1-wudevelops@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-09bpf: use kvzmalloc to allocate BPF verifier environmentRik van Riel
The kzmalloc call in bpf_check can fail when memory is very fragmented, which in turn can lead to an OOM kill. Use kvzmalloc to fall back to vmalloc when memory is too fragmented to allocate an order 3 sized bpf verifier environment. Admittedly this is not a very common case, and only happens on systems where memory has already been squeezed close to the limit, but this does not seem like much of a hot path, and it's a simple enough fix. Signed-off-by: Rik van Riel <riel@surriel.com> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev> Link: https://lore.kernel.org/r/20241008170735.16766766@imladris.surriel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-10-09Merge branch 'qca_spi-improvements-to-qca7000-sync'Jakub Kicinski
Stefan Wahren says: ==================== qca_spi: Improvements to QCA7000 sync This series contains patches which improve the QCA7000 sync behavior. ==================== Link: https://patch.msgid.link/20241007113312.38728-1-wahrenst@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09qca_spi: Improve reset mechanismStefan Wahren
The commit 92717c2356cb ("net: qca_spi: Avoid high load if QCA7000 is not available") fixed the high load in case the QCA7000 is not available but introduced sync delays for some corner cases like buffer errors. So add the reset requests to the atomics flags, which are polled by the SPI thread. As a result reset requests and sync state are now separated. This has the nice benefit to make the code easier to understand. Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Link: https://patch.msgid.link/20241007113312.38728-3-wahrenst@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09qca_spi: Count unexpected WRBUF_SPC_AVA after resetStefan Wahren
After a reset of the QCA7000, the amount of available write buffer space should match QCASPI_HW_BUF_LEN. If this is not the case this error should be counted as such. Signed-off-by: Stefan Wahren <wahrenst@gmx.net> Link: https://patch.msgid.link/20241007113312.38728-2-wahrenst@gmx.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09tcp: remove unnecessary update for tp->write_seq in tcp_connect()xin.guo
Commit 783237e8daf13 ("net-tcp: Fast Open client - sending SYN-data") introduces tcp_connect_queue_skb() and it would overwrite tcp->write_seq, so it is no need to update tp->write_seq before invoking tcp_connect_queue_skb(). Signed-off-by: xin.guo <guoxin0309@gmail.com> Link: https://patch.msgid.link/1728289544-4611-1-git-send-email-guoxin0309@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09net: ftgmac100: fixed not check status from fixed phyJacky Chou
Add error handling from calling fixed_phy_register. It may return some error, therefore, need to check the status. And fixed_phy_register needs to bind a device node for mdio. Add the mac device node for fixed_phy_register function. This is a reference to this function, of_phy_register_fixed_link(). Fixes: e24a6c874601 ("net: ftgmac100: Get link speed and duplex for NC-SI") Signed-off-by: Jacky Chou <jacky_chou@aspeedtech.com> Link: https://patch.msgid.link/20241007032435.787892-1-jacky_chou@aspeedtech.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10wifi: rtw89: wow: do not configure CPU IO to receive packets for old firmwareChin-Yen Lee
The older firmware of 8852A and 8852B can't receive packets via CPU IO function and will lead to WoWLAN fail if calling it. So use firmware feature to distinguish. Signed-off-by: Chin-Yen Lee <timlee@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241004065408.10261-1-pkshih@realtek.com
2024-10-10wifi: rtw89: coex: Add function to reorder Wi-Fi firmware report indexChing-Te Ku
To parsing firmware report correctly, driver need to re-order the report index to match with different chips and different Wi-Fi firmware version. Use wrong index to parse the report will lead the coexistence run into wrong mechanism. Signed-off-by: Ching-Te Ku <ku920601@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241003105140.10867-5-pkshih@realtek.com
2024-10-10wifi: rtw89: coex: Solved BT PAN profile idle decrease Wi-Fi throughputChing-Te Ku
Some Bluetooth device will make up connection as PAN link, though the connection is idle, it will still report the PAN link is active. The coexistence mechanism will enable TDMA to protect the PAN, it makes Wi-Fi throughput degrade at least 50%. But the link is idle, don't need so much bandwidth. Add TDMA case to let Wi-Fi can do traffic 80% bandwidth. Signed-off-by: Ching-Te Ku <ku920601@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241003105140.10867-4-pkshih@realtek.com
2024-10-10wifi: rtw89: coex: Reorder Bluetooth info related logicChing-Te Ku
Reorder Bluetooth firmware related event index, it should be the same with Wi-Fi firmware definition. To fix coexistence can not recognize Bluetooth PAN(Personal area network) profile correctly, modified the related logic. Signed-off-by: Ching-Te Ku <ku920601@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241003105140.10867-3-pkshih@realtek.com
2024-10-09doc: net: Fix .rst rendering of net_cachelines pagesDonald Hunter
The doc pages under /networking/net_cachelines are unreadable because they lack .rst formatting for the tabular text. Add simple table markup and tidy up the table contents: - remove dashes that represent empty cells because they render as bullets and are not needed - replace 'struct_*' with 'struct *' in the first column so that sphinx can render links for any structs that appear in the docs Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241008165329.45647-1-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10wifi: rtw89: coex: Update priority setting for Wi-Fi is scanningChing-Te Ku
Update coexistence priority setting for Wi-Fi scanning channel, the new setting will allow Wi-Fi do RX while Bluetooth audio is not busy. Forced to set new TDMA policy while RF calibration request come, to make sure the calibration can do well, and switch to normal setting while the calibration is done. Remove the code that no longer use. Signed-off-by: Ching-Te Ku <ku920601@realtek.com> Signed-off-by: Ping-Ke Shih <pkshih@realtek.com> Link: https://patch.msgid.link/20241003105140.10867-2-pkshih@realtek.com
2024-10-09Merge branch 'ipv4-convert-__fib_validate_source-and-its-callers-to-dscp_t'Jakub Kicinski
Guillaume Nault says: ==================== ipv4: Convert __fib_validate_source() and its callers to dscp_t. This patch series continues to prepare users of ->flowi4_tos to a future conversion of this field (__u8 to dscp_t). This time, we convert __fib_validate_source() and its call chain. The objective is to eventually make all users of ->flowi4_tos use a dscp_t value. Making ->flowi4_tos a dscp_t field will help avoiding regressions where ECN bits are erroneously interpreted as DSCP bits. ==================== Link: https://patch.msgid.link/cover.1728302212.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09ipv4: Convert __fib_validate_source() to dscp_t.Guillaume Nault
Pass a dscp_t variable to __fib_validate_source(), instead of a plain u8, to prevent accidental setting of ECN bits in ->flowi4_tos. Only fib_validate_source() actually calls __fib_validate_source(). Since it already has a dscp_t variable to pass as parameter, we only need to remove the inet_dscp_to_dsfield() conversion. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/8206b0a64a21a208ed94774e261a251c8d7bc251.1728302212.git.gnault@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>