summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-01-06ovl: pass realinode to ovl_encode_real_fh() instead of realdentryAmir Goldstein
We want to be able to encode an fid from an inode with no alias. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20250105162404.357058-2-amir73il@gmail.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-06Merge tag 'exfat-for-6.13-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat Pull exfat fixes from Namjae Jeon: "All fixes are for issues reported by syzbot: - Fix wrong error return in exfat_find_empty_entry() - Fix a endless loop by self-linked chain - fix a KMSAN uninit-value issue in exfat_extend_valid_size()" * tag 'exfat-for-6.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat: exfat: fix the infinite loop in __exfat_free_cluster() exfat: fix the new buffer was not zeroed before writing exfat: fix the infinite loop in exfat_readdir() exfat: fix exfat_find_empty_entry() not returning error on failure
2025-01-06Revert "vmstat: disable vmstat_work on vmstat_cpu_down_prep()"Linus Torvalds
This reverts commit adcfb264c3ed51fbbf5068ddf10d309a63683868. It turns out this just causes a different warning splat instead that seems to be much easier to trigger, so let's revert ASAP. Reported-and-bisected-by: Borislav Petkov <bp@alien8.de> Tested-by: Breno Leitao <leitao@debian.org> Reported-by: Alexander Gordeev <agordeev@linux.ibm.com> Link: https://lore.kernel.org/all/20250106131817.GAZ3vYGVr3-hWFFPLj@fat_crate.local/ Cc: Koichiro Den <koichiro.den@canonical.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-01-06btrfs: don't read from userspace twice in btrfs_uring_encoded_read()Mark Harmstone
If we return -EAGAIN the first time because we need to block, btrfs_uring_encoded_read() will get called twice. Take a copy of args, the iovs, and the iter the first time, as by the time we are called the second time these may have gone out of scope. Reported-by: Jens Axboe <axboe@kernel.dk> Fixes: 34310c442e17 ("btrfs: add io_uring command for encoded reads (ENCODED_READ ioctl)") Signed-off-by: Mark Harmstone <maharmstone@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-01-06io_uring: add io_uring_cmd_get_async_data helperMark Harmstone
Add a helper function in include/linux/io_uring/cmd.h to read the async_data pointer from a struct io_uring_cmd. Signed-off-by: Mark Harmstone <maharmstone@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-01-06io_uring/cmd: add per-op data to struct io_uring_cmd_dataJens Axboe
In case an op handler for ->uring_cmd() needs stable storage for user data, it can allocate io_uring_cmd_data->op_data and use it for the duration of the request. When the request gets cleaned up, uring_cmd will free it automatically. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: David Sterba <dsterba@suse.com>
2025-01-06io_uring/cmd: rename struct uring_cache to io_uring_cmd_dataJens Axboe
In preparation for making this more generically available for ->uring_cmd() usage that needs stable command data, rename it and move it to io_uring/cmd.h instead. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: David Sterba <dsterba@suse.com>
2025-01-06selftests/bpf: Extend netkit tests to validate set {head,tail}roomDaniel Borkmann
Extend the netkit selftests to specify and validate the {head,tail}room on the netdevice: # ./vmtest.sh -- ./test_progs -t netkit [...] ./test_progs -t netkit [ 1.174147] bpf_testmod: loading out-of-tree module taints kernel. [ 1.174585] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel [ 1.422307] tsc: Refined TSC clocksource calibration: 3407.983 MHz [ 1.424511] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fc3e5084, max_idle_ns: 440795359833 ns [ 1.428092] clocksource: Switched to clocksource tsc #363 tc_netkit_basic:OK #364 tc_netkit_device:OK #365 tc_netkit_multi_links:OK #366 tc_netkit_multi_opts:OK #367 tc_netkit_neigh_links:OK #368 tc_netkit_pkt_type:OK #369 tc_netkit_scrub:OK Summary: 7/0 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/bpf/20241220234658.490686-3-daniel@iogearbox.net
2025-01-06netkit: Add add netkit {head,tail}room to rt_link.yamlDaniel Borkmann
Add netkit {head,tail}room attribute support to the rt_link.yaml spec file. Example: # ./tools/net/ynl/cli.py --spec Documentation/netlink/specs/rt_link.yaml \ --do getlink --json '{"ifname": "nk0"}' --output-json | jq [...] "linkinfo": { "kind": "netkit", "data": { "primary": 0, "policy": "forward", "mode": "l3", "scrub": "default", "headroom": 0, "tailroom": 0, "peer-policy": "forward", "peer-scrub": "default" } }, [...] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/bpf/20241220234658.490686-2-daniel@iogearbox.net
2025-01-06netkit: Allow for configuring needed_{head,tail}roomDaniel Borkmann
Allow the user to configure needed_{head,tail}room for both netkit devices. The idea is similar to 163e529200af ("veth: implement ndo_set_rx_headroom") with the difference that the two parameters can be specified upon device creation. By default the current behavior stays as is which is needed_{head,tail}room is 0. In case of Cilium, for example, the netkit devices are not enslaved into a bridge or openvswitch device (rather, BPF-based redirection is used out of tcx), and as such these parameters are not propagated into the Pod's netns via peer device. Given Cilium can run in vxlan/geneve tunneling mode (needed_headroom) and/or be used in combination with WireGuard (needed_{head,tail}room), allow the Cilium CNI plugin to specify these two upon netkit device creation. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://lore.kernel.org/bpf/20241220234658.490686-1-daniel@iogearbox.net
2025-01-05Linux 6.13-rc6v6.13-rc6Linus Torvalds
2025-01-05Merge tag 'kbuild-fixes-v6.13-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: - Fix escaping of '$' in scripts/mksysmap - Fix a modpost crash observed with the latest binutils - Fix 'provides' in the linux-api-headers pacman package * tag 'kbuild-fixes-v6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kbuild: pacman-pkg: provide versioned linux-api-headers package modpost: work around unaligned data access error modpost: refactor do_vmbus_entry() modpost: fix the missed iteration for the max bit in do_input() scripts/mksysmap: Fix escape chars '$'
2025-01-05Merge tag 'mm-hotfixes-stable-2025-01-04-18-02' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull hotfixes from Andrew Morton: "25 hotfixes. 16 are cc:stable. 18 are MM and 7 are non-MM. The usual bunch of singletons and two doubletons - please see the relevant changelogs for details" * tag 'mm-hotfixes-stable-2025-01-04-18-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (25 commits) MAINTAINERS: change Arınç _NAL's name and email address scripts/sorttable: fix orc_sort_cmp() to maintain symmetry and transitivity mm/util: make memdup_user_nul() similar to memdup_user() mm, madvise: fix potential workingset node list_lru leaks mm/damon/core: fix ignored quota goals and filters of newly committed schemes mm/damon/core: fix new damon_target objects leaks on damon_commit_targets() mm/list_lru: fix false warning of negative counter vmstat: disable vmstat_work on vmstat_cpu_down_prep() mm: shmem: fix the update of 'shmem_falloc->nr_unswapped' mm: shmem: fix incorrect index alignment for within_size policy percpu: remove intermediate variable in PERCPU_PTR() mm: zswap: fix race between [de]compression and CPU hotunplug ocfs2: fix slab-use-after-free due to dangling pointer dqi_priv fs/proc/task_mmu: fix pagemap flags with PMD THP entries on 32bit kcov: mark in_softirq_really() as __always_inline docs: mm: fix the incorrect 'FileHugeMapped' field mailmap: modify the entry for Mathieu Othacehe mm/kmemleak: fix sleeping function called from invalid context at print message mm: hugetlb: independent PMD page table shared count maple_tree: reload mas before the second call for mas_empty_area ...
2025-01-05Merge tag 'clk-fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "A randconfig build fix and a performance fix: - Fix the CONFIG_RESET_CONTROLLER=n path signature of clk_imx8mp_audiomix_reset_controller_register() to appease randconfig - Speed up the sdhci clk on TH1520 by a factor of 4 by adding a fixed factor clk" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: clk-imx8mp-audiomix: fix function signature clk: thead: Fix TH1520 emmc and shdci clock rate
2025-01-05netfilter: xt_hashlimit: htable_selective_cleanup() optimizationEric Dumazet
I have seen syzbot reports hinting at xt_hashlimit abuse: [ 105.783066][ T4331] xt_hashlimit: max too large, truncated to 1048576 [ 105.811405][ T4331] xt_hashlimit: size too large, truncated to 1048576 And worker threads using up to 1 second per htable_selective_cleanup() invocation. [ 269.734496][ C1] [<ffffffff81547180>] ? __local_bh_enable_ip+0x1a0/0x1a0 [ 269.734513][ C1] [<ffffffff817d75d0>] ? lockdep_hardirqs_on_prepare+0x740/0x740 [ 269.734533][ C1] [<ffffffff852e71ff>] ? htable_selective_cleanup+0x25f/0x310 [ 269.734549][ C1] [<ffffffff817dcd30>] ? __lock_acquire+0x2060/0x2060 [ 269.734567][ C1] [<ffffffff817f058a>] ? do_raw_spin_lock+0x14a/0x370 [ 269.734583][ C1] [<ffffffff852e71ff>] ? htable_selective_cleanup+0x25f/0x310 [ 269.734599][ C1] [<ffffffff81547147>] __local_bh_enable_ip+0x167/0x1a0 [ 269.734616][ C1] [<ffffffff81546fe0>] ? _local_bh_enable+0xa0/0xa0 [ 269.734634][ C1] [<ffffffff852e71ff>] ? htable_selective_cleanup+0x25f/0x310 [ 269.734651][ C1] [<ffffffff852e71ff>] htable_selective_cleanup+0x25f/0x310 [ 269.734670][ C1] [<ffffffff815b3cc9>] ? process_one_work+0x7a9/0x1170 [ 269.734685][ C1] [<ffffffff852e57db>] htable_gc+0x1b/0xa0 [ 269.734700][ C1] [<ffffffff815b3cc9>] ? process_one_work+0x7a9/0x1170 [ 269.734714][ C1] [<ffffffff815b3dc9>] process_one_work+0x8a9/0x1170 [ 269.734733][ C1] [<ffffffff815b3520>] ? worker_detach_from_pool+0x260/0x260 [ 269.734749][ C1] [<ffffffff810201c7>] ? _raw_spin_lock_irq+0xb7/0xf0 [ 269.734763][ C1] [<ffffffff81020110>] ? _raw_spin_lock_irqsave+0x100/0x100 [ 269.734777][ C1] [<ffffffff8159d3df>] ? wq_worker_sleeping+0x5f/0x270 [ 269.734800][ C1] [<ffffffff815b53c7>] worker_thread+0xa47/0x1200 [ 269.734815][ C1] [<ffffffff81020010>] ? _raw_spin_lock+0x40/0x40 [ 269.734835][ C1] [<ffffffff815c9f2a>] kthread+0x25a/0x2e0 [ 269.734853][ C1] [<ffffffff815b4980>] ? worker_clr_flags+0x190/0x190 [ 269.734866][ C1] [<ffffffff815c9cd0>] ? kthread_blkcg+0xd0/0xd0 [ 269.734885][ C1] [<ffffffff81027b1a>] ret_from_fork+0x3a/0x50 We can skip over empty buckets, avoiding the lockdep penalty for debug kernels, and avoid atomic operations on non debug ones. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-05ipvs: speed up reads from ip_vs_conn proc fileFlorian Westphal
Reading is very slow because ->start() performs a linear re-scan of the entire hash table until it finds the successor to the last dumped element. The current implementation uses 'pos' as the 'number of elements to skip, then does linear iteration until it has skipped 'pos' entries. Store the last bucket and the number of elements to skip in that bucket instead, so we can resume from bucket b directly. before this patch, its possible to read ~35k entries in one second, but each read() gets slower as the number of entries to skip grows: time timeout 60 cat /proc/net/ip_vs_conn > /tmp/all; wc -l /tmp/all real 1m0.007s user 0m0.003s sys 0m59.956s 140386 /tmp/all Only ~100k more got read in remaining the remaining 59s, and did not get nowhere near the 1m entries that are stored at the time. after this patch, dump completes very quickly: time cat /proc/net/ip_vs_conn > /tmp/all; wc -l /tmp/all real 0m2.286s user 0m0.004s sys 0m2.281s 1000001 /tmp/all Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-05kbuild: pacman-pkg: provide versioned linux-api-headers packageThomas Weißschuh
The Arch Linux glibc package contains a versioned dependency on "linux-api-headers". If the linux-api-headers package provided by pacman-pkg does not specify an explicit version this dependency is not satisfied. Fix the dependency by providing an explicit version. Fixes: c8578539deba ("kbuild: add script and target to generate pacman package") Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2025-01-05netfilter: nf_tables: remove the genmask parametertuqiang
The genmask parameter is not used within the nf_tables_addchain function body. It should be removed to simplify the function parameter list. Signed-off-by: tuqiang <tu.qiang35@zte.com.cn> Signed-off-by: Jiang Kun <jiang.kun2@zte.com.cn> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-01-04selinux: match extended permissions to their base permissionsThiébaud Weksteen
In commit d1d991efaf34 ("selinux: Add netlink xperm support") a new extended permission was added ("nlmsg"). This was the second extended permission implemented in selinux ("ioctl" being the first one). Extended permissions are associated with a base permission. It was found that, in the access vector cache (avc), the extended permission did not keep track of its base permission. This is an issue for a domain that is using both extended permissions (i.e., a domain calling ioctl() on a netlink socket). In this case, the extended permissions were overlapping. Keep track of the base permission in the cache. A new field "base_perm" is added to struct extended_perms_decision to make sure that the extended permission refers to the correct policy permission. A new field "base_perms" is added to struct extended_perms to quickly decide if extended permissions apply. While it is in theory possible to retrieve the base permission from the access vector, the same base permission may not be mapped to the same bit for each class (e.g., "nlmsg" is mapped to a different bit for "netlink_route_socket" and "netlink_audit_socket"). Instead, use a constant (AVC_EXT_IOCTL or AVC_EXT_NLMSG) provided by the caller. Fixes: d1d991efaf34 ("selinux: Add netlink xperm support") Signed-off-by: Thiébaud Weksteen <tweek@google.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2025-01-04net: libwx: fix firmware mailbox abnormal returnJiawen Wu
The existing SW-FW interaction flow on the driver is wrong. Follow this wrong flow, driver would never return error if there is a unknown command. Since firmware writes back 'firmware ready' and 'unknown command' in the mailbox message if there is an unknown command sent by driver. So reading 'firmware ready' does not timeout. Then driver would mistakenly believe that the interaction has completed successfully. It tends to happen with the use of custom firmware. Move the check for 'unknown command' out of the poll timeout for 'firmware ready'. And adjust the debug log so that mailbox messages are always printed when commands timeout. Fixes: 1efa9bfe58c5 ("net: libwx: Implement interaction with firmware") Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com> Link: https://patch.msgid.link/20250103081013.1995939-1-jiawenwu@trustnetic.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-04Merge tag 'ieee802154-for-net-next-2025-01-03' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan-next Stefan Schmidt says: ==================== pull-request: ieee802154-next 2025-01-03 Leo Stone provided a documatation fix to improve the grammar. David Gilbert spotted a non-used fucntion we can safely remove. * tag 'ieee802154-for-net-next-2025-01-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan-next: net: mac802154: Remove unused ieee802154_mlme_tx_one Documentation: ieee802154: fix grammar ==================== Link: https://patch.msgid.link/20250103154605.440478-1-stefan@datenfreihafen.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-04Merge tag 'ieee802154-for-net-2025-01-03' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan Stefan Schmidt says: ==================== pull-request: ieee802154 for net 2025-01-03 Keisuke Nishimura provided a fix to check for kfifo_alloc() in the ca8210 driver. Lizhi Xu provided a fix a corrupted list, found by syzkaller, by checking local interfaces first. * tag 'ieee802154-for-net-2025-01-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan: mac802154: check local interfaces before deleting sdata list ieee802154: ca8210: Add missing check for kfifo_alloc() in ca8210_probe() ==================== Link: https://patch.msgid.link/20250103160046.469363-1-stefan@datenfreihafen.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-04Merge tag 'linux-watchdog-6.13-rc6' of ↵Linus Torvalds
git://www.linux-watchdog.org/linux-watchdog Pull watchdog fix from Wim Van Sebroeck: - fix error message during stm32 driver probe * tag 'linux-watchdog-6.13-rc6' of git://www.linux-watchdog.org/linux-watchdog: watchdog: stm32_iwdg: fix error message during driver probe
2025-01-04selftests: tc-testing: reduce rshift valueJakub Kicinski
After previous change rshift >= 32 is no longer allowed. Modify the test to use 31, the test doesn't seem to send any traffic so the exact value shouldn't matter. Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250103182458.1213486-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-04net_sched: cls_flow: validate TCA_FLOW_RSHIFT attributeEric Dumazet
syzbot found that TCA_FLOW_RSHIFT attribute was not validated. Right shitfing a 32bit integer is undefined for large shift values. UBSAN: shift-out-of-bounds in net/sched/cls_flow.c:329:23 shift exponent 9445 is too large for 32-bit type 'u32' (aka 'unsigned int') CPU: 1 UID: 0 PID: 54 Comm: kworker/u8:3 Not tainted 6.13.0-rc3-syzkaller-00180-g4f619d518db9 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Workqueue: ipv6_addrconf addrconf_dad_work Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 ubsan_epilogue lib/ubsan.c:231 [inline] __ubsan_handle_shift_out_of_bounds+0x3c8/0x420 lib/ubsan.c:468 flow_classify+0x24d5/0x25b0 net/sched/cls_flow.c:329 tc_classify include/net/tc_wrapper.h:197 [inline] __tcf_classify net/sched/cls_api.c:1771 [inline] tcf_classify+0x420/0x1160 net/sched/cls_api.c:1867 sfb_classify net/sched/sch_sfb.c:260 [inline] sfb_enqueue+0x3ad/0x18b0 net/sched/sch_sfb.c:318 dev_qdisc_enqueue+0x4b/0x290 net/core/dev.c:3793 __dev_xmit_skb net/core/dev.c:3889 [inline] __dev_queue_xmit+0xf0e/0x3f50 net/core/dev.c:4400 dev_queue_xmit include/linux/netdevice.h:3168 [inline] neigh_hh_output include/net/neighbour.h:523 [inline] neigh_output include/net/neighbour.h:537 [inline] ip_finish_output2+0xd41/0x1390 net/ipv4/ip_output.c:236 iptunnel_xmit+0x55d/0x9b0 net/ipv4/ip_tunnel_core.c:82 udp_tunnel_xmit_skb+0x262/0x3b0 net/ipv4/udp_tunnel_core.c:173 geneve_xmit_skb drivers/net/geneve.c:916 [inline] geneve_xmit+0x21dc/0x2d00 drivers/net/geneve.c:1039 __netdev_start_xmit include/linux/netdevice.h:5002 [inline] netdev_start_xmit include/linux/netdevice.h:5011 [inline] xmit_one net/core/dev.c:3590 [inline] dev_hard_start_xmit+0x27a/0x7d0 net/core/dev.c:3606 __dev_queue_xmit+0x1b73/0x3f50 net/core/dev.c:4434 Fixes: e5dfb815181f ("[NET_SCHED]: Add flow classifier") Reported-by: syzbot+1dbb57d994e54aaa04d2@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/6777bf49.050a0220.178762.0040.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250103104546.3714168-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-04net: stmmac: TSO: Simplify the code flow of DMA descriptor allocationsFurong Xu
The TCP Segmentation Offload (TSO) engine is an optional function in DWMAC cores, it is implemented for dwmac4 and dwxgmac2 only, ancient dwmac100 and dwmac1000 are not supported by hardware. Current driver code checks priv->dma_cap.tsoen which is read from MAC_HW_Feature1 register to determine if TSO is enabled in hardware configurations, if (!priv->dma_cap.tsoen) driver never sets NETIF_F_TSO for net_device. This patch never affects dwmac100/dwmac1000 and their stmmac_desc_ops: ndesc_ops/enh_desc_ops, since TSO is never supported by them two. The DMA AXI address width of DWMAC cores can be configured to 32-bit/40-bit/48-bit, then the format of DMA transmit descriptors get a little different between 32-bit and 40-bit/48-bit. Current driver code checks priv->dma_cap.addr64 to use certain format with certain configuration. This patch converts the format of DMA transmit descriptors on dwmac4 and dwxgmac2 that the DMA AXI address width is configured to 32-bit (as described by function comments of stmmac_tso_xmit() in current code) to a more generic format (see updated function comments after this patch) which is actually already used on 40-bit/48-bit platforms to provide better compatibility and make code flow cleaner in TSO TX routine. Another interesting finding, struct stmmac_desc_ops is a common abstract interface to maintain descriptors, we should avoid the direct assignment of descriptor members (e.g. desc->des0), stmmac_set_desc_addr() is the proper method yet. This patch tries to improve this by the way. Tested and verified on: DWMAC CORE 5.00a with 32-bit DMA AXI address width DWMAC CORE 5.10a with 32-bit DMA AXI address width DWXGMAC CORE 3.20a with 40-bit DMA AXI address width Signed-off-by: Furong Xu <0x1207@gmail.com> Link: https://patch.msgid.link/20241220080726.1733837-1-0x1207@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-04net: pcs: pcs-mtk-lynxi: correctly report in-band status capabilitiesDaniel Golle
Neither does the LynxI PCS support QSGMII, nor is in-band-status supported in 2500Base-X mode. Fix the pcs_inband_caps() method accordingly. Fixes: 520d29bdda86 ("net: pcs: pcs-mtk-lynxi: implement pcs_inband_caps() method") Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/Z3aJccb1vW14aukg@pidgin.makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-04tcp/dccp: allow a connection when sk_max_ack_backlog is zeroZhongqiu Duan
If the backlog of listen() is set to zero, sk_acceptq_is_full() allows one connection to be made, but inet_csk_reqsk_queue_is_full() does not. When the net.ipv4.tcp_syncookies is zero, inet_csk_reqsk_queue_is_full() will cause an immediate drop before the sk_acceptq_is_full() check in tcp_conn_request(), resulting in no connection can be made. This patch tries to keep consistent with 64a146513f8f ("[NET]: Revert incorrect accept queue backlog changes."). Link: https://lore.kernel.org/netdev/20250102080258.53858-1-kuniyu@amazon.com/ Fixes: ef547f2ac16b ("tcp: remove max_qlen_log") Signed-off-by: Zhongqiu Duan <dzq.aishenghu0@gmail.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250102171426.915276-1-dzq.aishenghu0@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-04net: 802: LLC+SNAP OID:PID lookup on start of skb dataAntonio Pastor
802.2+LLC+SNAP frames received by napi_complete_done() with GRO and DSA have skb->transport_header set two bytes short, or pointing 2 bytes before network_header & skb->data. This was an issue as snap_rcv() expected offset to point to SNAP header (OID:PID), causing packet to be dropped. A fix at llc_fixup_skb() (a024e377efed) resets transport_header for any LLC consumers that may care about it, and stops SNAP packets from being dropped, but doesn't fix the problem which is that LLC and SNAP should not use transport_header offset. Ths patch eliminates the use of transport_header offset for SNAP lookup of OID:PID so that SNAP does not rely on the offset at all. The offset is reset after pull for any SNAP packet consumers that may (but shouldn't) use it. Fixes: fda55eca5a33 ("net: introduce skb_transport_header_was_set()") Signed-off-by: Antonio Pastor <antonio.pastor@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250103012303.746521-1-antonio.pastor@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-04fuse: respect FOPEN_KEEP_CACHE on opendirAmir Goldstein
The re-factoring of fuse_dir_open() missed the need to invalidate directory inode page cache with open flag FOPEN_KEEP_CACHE. Fixes: 7de64d521bf92 ("fuse: break up fuse_open_common()") Reported-by: Prince Kumar <princer@google.com> Closes: https://lore.kernel.org/linux-fsdevel/CAEW=TRr7CYb4LtsvQPLj-zx5Y+EYBmGfM24SuzwyDoGVNoKm7w@mail.gmail.com/ Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://lore.kernel.org/r/20250101130037.96680-1-amir73il@gmail.com Reviewed-by: Bernd Schubert <bernd.schubert@fastmail.fm> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-03Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.13-rc6). No conflicts. Adjacent changes: include/linux/if_vlan.h f91a5b808938 ("af_packet: fix vlan_get_protocol_dgram() vs MSG_PEEK") 3f330db30638 ("net: reformat kdoc return statements") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-03Merge tag 'sched_ext-for-6.13-rc5-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fixes from Tejun Heo: - Fix a bug where bpf_iter_scx_dsq_new() was not initializing the iterator's flags and could inadvertently enable e.g. reverse iteration - Fix a bug where scx_ops_bypass() could call irq_restore twice - Add Andrea and Changwoo as maintainers for better review coverage - selftests and tools/sched_ext build and other fixes * tag 'sched_ext-for-6.13-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: Fix dsq_local_on selftest sched_ext: initialize kit->cursor.flags sched_ext: Fix invalid irq restore in scx_ops_bypass() MAINTAINERS: add me as reviewer for sched_ext MAINTAINERS: add self as reviewer for sched_ext scx: Fix maximal BPF selftest prog sched_ext: fix application of sizeof to pointer selftests/sched_ext: fix build after renames in sched_ext API sched_ext: Add __weak to fix the build errors
2025-01-03Merge tag 'wq-for-6.13-rc5-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fixes from Tejun Heo: - Suppress a corner case spurious flush dependency warning - Two trivial changes * tag 'wq-for-6.13-rc5-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: add printf attribute to __alloc_workqueue() workqueue: Do not warn when cancelling WQ_MEM_RECLAIM work from !WQ_MEM_RECLAIM worker rust: add safety comment in workqueue traits
2025-01-03Merge tag 'block-6.13-20250103' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: "Collection of fixes for block. Particularly the target name overflow has been a bit annoying, as it results in overwriting random memory and hence shows up as triggering various other bugs. - NVMe pull request via Keith: - Fix device specific quirk for PRP list alignment (Robert) - Fix target name overflow (Leo) - Fix target write granularity (Luis) - Fix target sleeping in atomic context (Nilay) - Remove unnecessary tcp queue teardown (Chunguang) - Simple cdrom typo fix" * tag 'block-6.13-20250103' of git://git.kernel.dk/linux: cdrom: Fix typo, 'devicen' to 'device' nvme-tcp: remove nvme_tcp_destroy_io_queues() nvmet-loop: avoid using mutex in IO hotpath nvmet: propagate npwg topology nvmet: Don't overflow subsysnqn nvme-pci: 512 byte aligned dma pool segment quirk
2025-01-03Merge tag 'io_uring-6.13-20250103' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring fixes from Jens Axboe: - Fix an issue with the read multishot support and posting of CQEs from io-wq context - Fix a regression introduced in this cycle, where making the timeout lock a raw one uncovered another locking dependency. As a result, move the timeout flushing outside of the timeout lock, punting them to a local list first - Fix use of an uninitialized variable in io_async_msghdr. Doesn't really matter functionally, but silences a valid KMSAN complaint that it's not always initialized - Fix use of incrementally provided buffers for read on non-pollable files, where the buffer always gets committed upfront. Unfortunately the buffer address isn't resolved first, so the read ends up using the updated rather than the current value * tag 'io_uring-6.13-20250103' of git://git.kernel.dk/linux: io_uring/kbuf: use pre-committed buffer address for non-pollable file io_uring/net: always initialize kmsg->msg.msg_inq upfront io_uring/timeout: flush timeouts outside of the timeout lock io_uring/rw: fix downgraded mshot read
2025-01-03Merge tag 'net-6.13-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from wireles and netfilter. Nothing major here. Over the last two weeks we gathered only around two-thirds of our normal weekly fix count, but delaying sending these until -rc7 seemed like a really bad idea. AFAIK we have no bugs under investigation. One or two reverts for stuff for which we haven't gotten a proper fix will likely come in the next PR. Current release - fix to a fix: - netfilter: nft_set_hash: unaligned atomic read on struct nft_set_ext - eth: gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeup Previous releases - regressions: - net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets - mptcp: - fix sleeping rcvmsg sleeping forever after bad recvbuffer adjust - fix TCP options overflow - prevent excessive coalescing on receive, fix throughput - net: fix memory leak in tcp_conn_request() if map insertion fails - wifi: cw1200: fix potential NULL dereference after conversion to GPIO descriptors - phy: micrel: dynamically control external clock of KSZ PHY, fix suspend behavior Previous releases - always broken: - af_packet: fix VLAN handling with MSG_PEEK - net: restrict SO_REUSEPORT to inet sockets - netdev-genl: avoid empty messages in NAPI get - dsa: microchip: fix set_ageing_time function on KSZ9477 and LAN937X - eth: - gve: XDP fixes around transmit, queue wakeup etc. - ti: icssg-prueth: fix firmware load sequence to prevent time jump which breaks timesync related operations Misc: - netlink: specs: mptcp: add missing attr and improve documentation" * tag 'net-6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits) net: ti: icssg-prueth: Fix clearing of IEP_CMP_CFG registers during iep_init net: ti: icssg-prueth: Fix firmware load sequence. mptcp: prevent excessive coalescing on receive mptcp: don't always assume copied data in mptcp_cleanup_rbuf() mptcp: fix recvbuffer adjust on sleeping rcvmsg ila: serialize calls to nf_register_net_hooks() af_packet: fix vlan_get_protocol_dgram() vs MSG_PEEK af_packet: fix vlan_get_tci() vs MSG_PEEK net: wwan: iosm: Properly check for valid exec stage in ipc_mmio_init() net: restrict SO_REUSEPORT to inet sockets net: reenable NETIF_F_IPV6_CSUM offload for BIG TCP packets net: sfc: Correct key_len for efx_tc_ct_zone_ht_params net: wwan: t7xx: Fix FSM command timeout issue sky2: Add device ID 11ab:4373 for Marvell 88E8075 mptcp: fix TCP options overflow. net: mv643xx_eth: fix an OF node reference leak gve: trigger RX NAPI instead of TX NAPI in gve_xsk_wakeup eth: bcmsysport: fix call balance of priv->clk handling routines net: llc: reset skb->transport_header netlink: specs: mptcp: fix missing doc ...
2025-01-03Merge tag 'nios2_update_for_v6.14' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux Pull nios2 fixlet from Dinh Nguyen: - Use str_yes_no() helper function * tag 'nios2_update_for_v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux: nios2: Use str_yes_no() helper in show_cpuinfo()
2025-01-03Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: "A lot of fixes accumulated over the holiday break: - Static tool fixes, value is already proven to be NULL, possible integer overflow - Many bnxt_re fixes: - Crashes due to a mismatch in the maximum SGE list size - Don't waste memory for user QPs by creating kernel-only structures - Fix compatability issues with older HW in some of the new HW features recently introduced: RTS->RTS feature, work around 9096 - Do not allow destroy_qp to fail - Validate QP MTU against device limits - Add missing validation on madatory QP attributes for RTR->RTS - Report port_num in query_qp as required by the spec - Fix creation of QPs of the maximum queue size, and in the variable mode - Allow all QPs to be used on newer HW by limiting a work around only to HW it affects - Use the correct MSN table size for variable mode QPs - Add missing locking in create_qp() accessing the qp_tbl - Form WQE buffers correctly when some of the buffers are 0 hop - Don't crash on QP destroy if the userspace doesn't setup the dip_ctx - Add the missing QP flush handler call on the DWQE path to avoid hanging on error recovery - Consistently use ENXIO for return codes if the devices is fatally errored - Try again to fix VLAN support on iwarp, previous fix was reverted due to breaking other cards - Correct error path return code for rdma netlink events - Remove the seperate net_device pointer in siw and rxe which syzkaller found a way to UAF - Fix a UAF of a stack ib_sge in rtrs - Fix a regression where old mlx5 devices and FW were wrongly activing new device features and failing" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (28 commits) RDMA/mlx5: Enable multiplane mode only when it is supported RDMA/bnxt_re: Fix error recovery sequence RDMA/rtrs: Ensure 'ib_sge list' is accessible RDMA/rxe: Remove the direct link to net_device RDMA/hns: Fix missing flush CQE for DWQE RDMA/hns: Fix warning storm caused by invalid input in IO path RDMA/hns: Fix accessing invalid dip_ctx during destroying QP RDMA/hns: Fix mapping error of zero-hop WQE buffer RDMA/bnxt_re: Fix the locking while accessing the QP table RDMA/bnxt_re: Fix MSN table size for variable wqe mode RDMA/bnxt_re: Add send queue size check for variable wqe RDMA/bnxt_re: Disable use of reserved wqes RDMA/bnxt_re: Fix max_qp_wrs reported RDMA/siw: Remove direct link to net_device RDMA/nldev: Set error code in rdma_nl_notify_event RDMA/bnxt_re: Fix reporting hw_ver in query_device RDMA/bnxt_re: Fix to export port num to ib_query_qp RDMA/bnxt_re: Fix setting mandatory attributes for modify_qp RDMA/bnxt_re: Add check for path mtu in modify_qp RDMA/bnxt_re: Fix the check for 9060 condition ...
2025-01-03Merge tag 'pinctrl-v6.13-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: - A small Kconfig fixup for the i.MX. In principle this could come in from the SoC tree but the bug was introduced from the pin control tree so let's fix it from here. - Fix a sleep in atomic context in the MCP23xxx GPIO expander by disabling the regmap locking and using explicit mutex locks. * tag 'pinctrl-v6.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: mcp23s08: Fix sleeping in atomic context due to regmap locking ARM: imx: Re-introduce the PINCTRL selection
2025-01-03Merge tag 'sound-6.13-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "The first new year pull request: no surprises, all small fixes, including: - Follow-up fixes for the new compress-offload API extension - A couple of fixes for MIDI 2.0 UMP handling - A trivial race fix for OSS sequencer emulation ioctls - USB-audio and HD-audio fixes / quirks" * tag 'sound-6.13-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: seq: Check UMP support for midi_version change ALSA hda/realtek: Add quirk for Framework F111:000C Revert "ALSA: ump: Don't enumeration invalid groups for legacy rawmidi" ALSA: seq: oss: Fix races at processing SysEx messages ALSA: compress_offload: fix remaining descriptor races in sound/core/compress_offload.c ALSA: compress_offload: Drop unneeded no_free_ptr() ALSA: hda/tas2781: Ignore SUBSYS_ID not found for tas2563 projects ALSA: usb-audio: US16x08: Initialize array before use
2025-01-03Merge tag 'drm-fixes-2025-01-03' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds
Pull drm fixes from Dave Airlie: "Happy New Year. It was fairly quiet for holidays period, certainly nothing that worth getting off the couch before I needed to, this is for the past two weeks, i915, xe and some adv7511, I expect we will see some amdgpu etc happening next week, but otherwise all quiet. i915: - Fix C10 pll programming sequence [cx0_phy] - Fix power gate sequence. [dg1] xe: - uapi: Revert some devcoredump file format changes breaking a mesa debug tool - Fixes around waits when moving to system - Fix a typo when checking for LMEM provisioning - Fix a fault on fd close after unbind - A couple of OA fixes squashed for stable backporting adv7511: - fix UAF - drop single lane support - audio infoframe fix" * tag 'drm-fixes-2025-01-03' of https://gitlab.freedesktop.org/drm/kernel: xe/oa: Fix query mode of operation for OAR/OAC drm/i915/dg1: Fix power gate sequence. drm/i915/cx0_phy: Fix C10 pll programming sequence drm/xe: Fix fault on fd close after unbind drm/xe/pf: Use correct function to check LMEM provisioning drm/xe: Wait for migration job before unmapping pages drm/xe: Use non-interruptible wait when moving BO to system drm/xe: Revert some changes that break a mesa debug tool drm: adv7511: Drop dsi single lane support dt-bindings: display: adi,adv7533: Drop single lane support drm: adv7511: Fix use-after-free in adv7533_attach_dsi() drm/bridge: adv7511_audio: Update Audio InfoFrame properly
2025-01-03Merge tag 'ftrace-v6.13-rc5-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ftrace fixes from Steven Rostedt: - Add needed READ_ONCE() around access to the fgraph array element The updates to the fgraph array can happen when callbacks are registered and unregistered. The __ftrace_return_to_handler() can handle reading either the old value or the new value. But once it reads that value it must stay consistent otherwise the check that looks to see if the value is a stub may show false, but if the compiler decides to re-read after that check, it can be true which can cause the code to crash later on. - Make function profiler use the top level ops for filtering again When function graph became available for instances, its filter ops became independent from the top level set_ftrace_filter. In the process the function profiler received its own filter ops as well. But the function profiler uses the top level set_ftrace_filter file and does not have one of its own. In giving it its own filter ops, it lost any user interface it once had. Make it use the top level set_ftrace_filter file again. This fixes a regression. * tag 'ftrace-v6.13-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ftrace: Fix function profiler's filtering functionality fgraph: Add READ_ONCE() when accessing fgraph_array[]
2025-01-03io_uring/kbuf: use pre-committed buffer address for non-pollable fileJens Axboe
For non-pollable files, buffer ring consumption will commit upfront. This is fine, but io_ring_buffer_select() will return the address of the buffer after having committed it. For incrementally consumed buffers, this is incorrect as it will modify the buffer address. Store the pre-committed value and return that. If that isn't done, then the initial part of the buffer is not used and the application will correctly assume the content arrived at the start of the userspace buffer, but the kernel will have put it later in the buffer. Or it can cause a spurious -EFAULT returned in the CQE, depending on the buffer size. As bounds are suitably checked for doing the actual IO, no adverse side effects are possible - it's just a data misplacement within the existing buffer. Reported-by: Gwendal Fernet <gwendalfernet@gmail.com> Cc: stable@vger.kernel.org Fixes: ae98dbf43d75 ("io_uring/kbuf: add support for incremental buffer consumption") Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-03dm-verity FEC: Avoid copying RS parity bytes twice.Milan Broz
Caching RS parity bytes is already done in fec_decode_bufs() now, no need to use yet another buffer for conversion to uint16_t. This patch removes that double copy of RS parity bytes. Signed-off-by: Milan Broz <gmazyland@gmail.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-01-03dm-verity FEC: Fix RS FEC repair for roots unaligned to block size (take 2)Milan Broz
This patch fixes an issue that was fixed in the commit df7b59ba9245 ("dm verity: fix FEC for RS roots unaligned to block size") but later broken again in the commit 8ca7cab82bda ("dm verity fec: fix misaligned RS roots IO") If the Reed-Solomon roots setting spans multiple blocks, the code does not use proper parity bytes and randomly fails to repair even trivial errors. This bug cannot happen if the sector size is multiple of RS roots setting (Android case with roots 2). The previous solution was to find a dm-bufio block size that is multiple of the device sector size and roots size. Unfortunately, the optimization in commit 8ca7cab82bda ("dm verity fec: fix misaligned RS roots IO") is incorrect and uses data block size for some roots (for example, it uses 4096 block size for roots = 20). This patch uses a different approach: - It always uses a configured data block size for dm-bufio to avoid possible misaligned IOs. - and it caches the processed parity bytes, so it can join it if it spans two blocks. As the RS calculation is called only if an error is detected and the process is computationally intensive, copying a few more bytes should not introduce performance issues. The issue was reported to cryptsetup with trivial reproducer https://gitlab.com/cryptsetup/cryptsetup/-/issues/923 Reproducer (with roots=20): # create verity device with RS FEC dd if=/dev/urandom of=data.img bs=4096 count=8 status=none veritysetup format data.img hash.img --fec-device=fec.img --fec-roots=20 | \ awk '/^Root hash/{ print $3 }' >roothash # create an erasure that should always be repairable with this roots setting dd if=/dev/zero of=data.img conv=notrunc bs=1 count=4 seek=4 status=none # try to read it through dm-verity veritysetup open data.img test hash.img --fec-device=fec.img --fec-roots=20 $(cat roothash) dd if=/dev/mapper/test of=/dev/null bs=4096 status=noxfer Even now the log says it cannot repair it: : verity-fec: 7:1: FEC 0: failed to correct: -74 : device-mapper: verity: 7:1: data block 0 is corrupted ... With this fix, errors are properly repaired. : verity-fec: 7:1: FEC 0: corrected 4 errors Signed-off-by: Milan Broz <gmazyland@gmail.com> Fixes: 8ca7cab82bda ("dm verity fec: fix misaligned RS roots IO") Cc: stable@vger.kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-01-03vfio/pci: Fallback huge faults for unaligned pfnAlex Williamson
The PFN must also be aligned to the fault order to insert a huge pfnmap. Test the alignment and fallback when unaligned. Fixes: f9e54c3a2f5b ("vfio/pci: implement huge_fault support") Link: https://bugzilla.kernel.org/show_bug.cgi?id=219619 Reported-by: Athul Krishna <athul.krishna.kr@protonmail.com> Reported-by: Precific <precification@posteo.de> Reviewed-by: Peter Xu <peterx@redhat.com> Tested-by: Precific <precification@posteo.de> Link: https://lore.kernel.org/r/20250102183416.1841878-1-alex.williamson@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2025-01-03RDMA/mlx5: Enable multiplane mode only when it is supportedMark Zhang
Driver queries vport_cxt.num_plane and enables multiplane when it is greater then 0, but some old FWs (versions from x.40.1000 till x.42.1000), report vport_cxt.num_plane = 1 unexpectedly. Fix it by querying num_plane only when HCA_CAP2.multiplane bit is set. Fixes: 2a5db20fa532 ("RDMA/mlx5: Add support to multi-plane device and port") Link: https://patch.msgid.link/r/1ef901acdf564716fcf550453cf5e94f343777ec.1734610916.git.leon@kernel.org Cc: stable@vger.kernel.org Reported-by: Francesco Poli <invernomuto@paranoici.org> Closes: https://lore.kernel.org/all/nvs4i2v7o6vn6zhmtq4sgazy2hu5kiulukxcntdelggmznnl7h@so3oul6uwgbl/ Signed-off-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-01-03team: prevent adding a device which is already a team device lowerOctavian Purdila
Prevent adding a device which is already a team device lower, e.g. adding veth0 if vlan1 was already added and veth0 is a lower of vlan1. This is not useful in practice and can lead to recursive locking: $ ip link add veth0 type veth peer name veth1 $ ip link set veth0 up $ ip link set veth1 up $ ip link add link veth0 name veth0.1 type vlan protocol 802.1Q id 1 $ ip link add team0 type team $ ip link set veth0.1 down $ ip link set veth0.1 master team0 team0: Port device veth0.1 added $ ip link set veth0 down $ ip link set veth0 master team0 ============================================ WARNING: possible recursive locking detected 6.13.0-rc2-virtme-00441-ga14a429069bb #46 Not tainted -------------------------------------------- ip/7684 is trying to acquire lock: ffff888016848e00 (team->team_lock_key){+.+.}-{4:4}, at: team_device_event (drivers/net/team/team_core.c:2928 drivers/net/team/team_core.c:2951 drivers/net/team/team_core.c:2973) but task is already holding lock: ffff888016848e00 (team->team_lock_key){+.+.}-{4:4}, at: team_add_slave (drivers/net/team/team_core.c:1147 drivers/net/team/team_core.c:1977) other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(team->team_lock_key); lock(team->team_lock_key); *** DEADLOCK *** May be due to missing lock nesting notation 2 locks held by ip/7684: stack backtrace: CPU: 3 UID: 0 PID: 7684 Comm: ip Not tainted 6.13.0-rc2-virtme-00441-ga14a429069bb #46 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:122) print_deadlock_bug.cold (kernel/locking/lockdep.c:3040) __lock_acquire (kernel/locking/lockdep.c:3893 kernel/locking/lockdep.c:5226) ? netlink_broadcast_filtered (net/netlink/af_netlink.c:1548) lock_acquire.part.0 (kernel/locking/lockdep.c:467 kernel/locking/lockdep.c:5851) ? team_device_event (drivers/net/team/team_core.c:2928 drivers/net/team/team_core.c:2951 drivers/net/team/team_core.c:2973) ? trace_lock_acquire (./include/trace/events/lock.h:24 (discriminator 2)) ? team_device_event (drivers/net/team/team_core.c:2928 drivers/net/team/team_core.c:2951 drivers/net/team/team_core.c:2973) ? lock_acquire (kernel/locking/lockdep.c:5822) ? team_device_event (drivers/net/team/team_core.c:2928 drivers/net/team/team_core.c:2951 drivers/net/team/team_core.c:2973) __mutex_lock (kernel/locking/mutex.c:587 kernel/locking/mutex.c:735) ? team_device_event (drivers/net/team/team_core.c:2928 drivers/net/team/team_core.c:2951 drivers/net/team/team_core.c:2973) ? team_device_event (drivers/net/team/team_core.c:2928 drivers/net/team/team_core.c:2951 drivers/net/team/team_core.c:2973) ? fib_sync_up (net/ipv4/fib_semantics.c:2167) ? team_device_event (drivers/net/team/team_core.c:2928 drivers/net/team/team_core.c:2951 drivers/net/team/team_core.c:2973) team_device_event (drivers/net/team/team_core.c:2928 drivers/net/team/team_core.c:2951 drivers/net/team/team_core.c:2973) notifier_call_chain (kernel/notifier.c:85) call_netdevice_notifiers_info (net/core/dev.c:1996) __dev_notify_flags (net/core/dev.c:8993) ? __dev_change_flags (net/core/dev.c:8975) dev_change_flags (net/core/dev.c:9027) vlan_device_event (net/8021q/vlan.c:85 net/8021q/vlan.c:470) ? br_device_event (net/bridge/br.c:143) notifier_call_chain (kernel/notifier.c:85) call_netdevice_notifiers_info (net/core/dev.c:1996) dev_open (net/core/dev.c:1519 net/core/dev.c:1505) team_add_slave (drivers/net/team/team_core.c:1219 drivers/net/team/team_core.c:1977) ? __pfx_team_add_slave (drivers/net/team/team_core.c:1972) do_set_master (net/core/rtnetlink.c:2917) do_setlink.isra.0 (net/core/rtnetlink.c:3117) Reported-by: syzbot+3c47b5843403a45aef57@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=3c47b5843403a45aef57 Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device") Signed-off-by: Octavian Purdila <tavip@google.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-03Merge branch 'net-iep-clock-module-fixes'David S. Miller
Meghana Malladi says: ==================== IEP clock module bug fixes This series has some bug fixes for IEP module needed by PPS and timesync operations. Patch 1/2 fixes firmware load sequence to run all the firmwares when either of the ethernet interfaces is up. Move all the code common for firmware bringup under common functions. Patch 2/2 fixes distorted PPS signal when the ethernet interfaces are brough down and up. This patch also fixes enabling PPS signal after bringing the interface up, without disabling PPS. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-03net: ti: icssg-prueth: Fix clearing of IEP_CMP_CFG registers during iep_initMeghana Malladi
When ICSSG interfaces are brought down and brought up again, the pru cores are shut down and booted again, flushing out all the memories and start again in a clean state. Hence it is expected that the IEP_CMP_CFG register needs to be flushed during iep_init() to ensure that the existing residual configuration doesn't cause any unusual behavior. If the register is not cleared, existing IEP_CMP_CFG set for CMP1 will result in SYNC0_OUT signal based on the SYNC_OUT register values. After bringing the interface up, calling PPS enable doesn't work as the driver believes PPS is already enabled, (iep->pps_enabled is not cleared during interface bring down) and driver will just return true even though there is no signal. Fix this by disabling pps and perout. Fixes: c1e0230eeaab ("net: ti: icss-iep: Add IEP driver") Signed-off-by: Meghana Malladi <m-malladi@ti.com> Reviewed-by: Roger Quadros <rogerq@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>