summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-09-04docs: netdev: update the netdev infra URLsJakub Kicinski
Some corporate proxies block our current NIPA URLs because they use a free / shady DNS domain. As suggested by Jesse we got a new DNS entry from Konstantin - netdev.bots.linux.dev, use it. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-04Merge branch 'rework/misc-cleanups' into for-linusPetr Mladek
2023-09-04Merge branch 'for-6.6-vsprintf-doc' into for-linusPetr Mladek
2023-09-04drm/i915: Handle dma fences in dirtyfb callbackJouni Högander
Take into account dma fences in dirtyfb callback. If there is no unsignaled dma fences perform flush immediately. If there are unsignaled dma fences perform invalidate and add callback which will queue flush when the fence gets signaled. v4: - Move invalidate before callback is added v3: - Check frontbuffer bits before adding any fence fb - Flush only when adding fence cb succeeds v2: Use dma_resv_get_singleton Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230901093500.3463046-5-jouni.hogander@intel.com
2023-09-04drm/i915: Add new frontbuffer tracking interface to queue flushJouni Högander
We want to wait dma fences in dirtyfb ioctl. As we don't want to make dirtyfb ioctl as blocking call we need to use dma_fence_add_callback. Callback used for dma_fence_add_callback is called from atomic context. Due to this we need to add a new frontbuffer tracking interface to queue flush. v3: - Check schedule work success rather than work being pending - Init flush work when frontbuffer struct is initialized v2: Check if flush work is already pending Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230901093500.3463046-4-jouni.hogander@intel.com
2023-09-04drm/i915/psr: Clear frontbuffer busy bits on flipJouni Högander
We are planning to move flush performed from work queue. This means it is possible to have invalidate -> flip -> flush sequence. Handle this by clearing possible busy bits on flip. Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230901093500.3463046-3-jouni.hogander@intel.com
2023-09-04drm/i915/fbc: Clear frontbuffer busy bits on flipJouni Högander
We are planning to move flush performed from work queue. This means it is possible to have invalidate -> flip -> flush sequence. Handle this by clearing possible busy bits on flip. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Luca Coelho <luciano.coelho@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230901093500.3463046-2-jouni.hogander@intel.com
2023-09-04accel/ivpu: Move MMU register definitions to ivpu_mmu.cJacek Lawrynowicz
MMU registers are not platform specific so they should be defined separate to platform regs. Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230901094957.168898-12-stanislaw.gruszka@linux.intel.com
2023-09-04accel/ivpu/37xx: White space cleanupStanislaw Gruszka
No functional change, adjust code formatting so that defines line up nicely to improve code readability. Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230901094957.168898-11-stanislaw.gruszka@linux.intel.com
2023-09-04accel/ivpu/37xx: Change register rename leftoversStanislaw Gruszka
Change remaining MTL_VPU_ register names to generation based names. Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230901094957.168898-10-stanislaw.gruszka@linux.intel.com
2023-09-04accel/ivpu: Move ivpu_fw_load() to ivpu_fw_init()Jacek Lawrynowicz
ivpu_fw_load() doesn't have to be called separately in ivpu_dev_init(). Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230901094957.168898-8-stanislaw.gruszka@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230901094957.168898-7-stanislaw.gruszka@linux.intel.com
2023-09-04accel/ivpu: Initialize context with SSID = 1Karol Wachowski
Context with SSID = 1 is reserved and accesses on that context happen only when context is uninitialized on the VPU side. Such access triggers MMU fault (0xa) "Invalid CD Fetch", which doesn't contain any useful information besides context ID. This commit will change that state, now (0x10) "Translation fault" will be triggered and accessed address will shown in the log. Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230901094957.168898-7-stanislaw.gruszka@linux.intel.com
2023-09-04accel/ivpu: Add information about context on failureStanislaw Gruszka
Identify the mmu context that failed to initialize in the error messages. This allows the error to be correlated with a specific user during debug. Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230901094957.168898-5-stanislaw.gruszka@linux.intel.com
2023-09-04accel/ivpu: Make ivpu_pm_init() voidStanislaw Gruszka
ivpu_pm_init() does not return any error, make it void. Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230901094957.168898-4-stanislaw.gruszka@linux.intel.com
2023-09-04accel/ivpu: Remove duplicated error messagesJacek Lawrynowicz
Reduce the number of error messages per single failure in ivpu_dev_init() and ivpu_probe(). Most error messages are already printed by functions called from ivpu_dev_init(). Add missed error prints in ivpu_ipc_init() and ivpu_mmu_context_init(). Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230901094957.168898-3-stanislaw.gruszka@linux.intel.com
2023-09-04accel/ivpu: Move set autosuspend delay to HW specific codeKrystian Pradzynski
Configure autosuspend values per HW generation and per platform. For non silicon platforms disable autosuspend for now, for silicon reduce it to 10 ms. Signed-off-by: Krystian Pradzynski <krystian.pradzynski@linux.intel.com> Reviewed-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230901094957.168898-2-stanislaw.gruszka@linux.intel.com
2023-09-04docs: netdev: document patchwork patch statesJakub Kicinski
The patchwork states are largely self-explanatory but small ambiguities may still come up. Document how we interpret the states in networking. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-04drm: gm12u320: Fix the timeout usage for usb_bulk_msg()Jinjie Ruan
The timeout arg of usb_bulk_msg() is ms already, which has been converted to jiffies by msecs_to_jiffies() in usb_start_wait_urb(). So fix the usage by removing the redundant msecs_to_jiffies() in the macros. And as Hans suggested, also remove msecs_to_jiffies() for the IDLE_TIMEOUT macro to make it consistent here and so change IDLE_TIMEOUT to msecs_to_jiffies(IDLE_TIMEOUT) where it is used. Fixes: e4f86e437164 ("drm: Add Grain Media GM12U320 driver v2") Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Suggested-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230904021421.1663892-1-ruanjinjie@huawei.com
2023-09-04bpf, sockmap: Fix skb refcnt race after locking changesJohn Fastabend
There is a race where skb's from the sk_psock_backlog can be referenced after userspace side has already skb_consumed() the sk_buff and its refcnt dropped to zer0 causing use after free. The flow is the following: while ((skb = skb_peek(&psock->ingress_skb)) sk_psock_handle_Skb(psock, skb, ..., ingress) if (!ingress) ... sk_psock_skb_ingress sk_psock_skb_ingress_enqueue(skb) msg->skb = skb sk_psock_queue_msg(psock, msg) skb_dequeue(&psock->ingress_skb) The sk_psock_queue_msg() puts the msg on the ingress_msg queue. This is what the application reads when recvmsg() is called. An application can read this anytime after the msg is placed on the queue. The recvmsg hook will also read msg->skb and then after user space reads the msg will call consume_skb(skb) on it effectively free'ing it. But, the race is in above where backlog queue still has a reference to the skb and calls skb_dequeue(). If the skb_dequeue happens after the user reads and free's the skb we have a use after free. The !ingress case does not suffer from this problem because it uses sendmsg_*(sk, msg) which does not pass the sk_buff further down the stack. The following splat was observed with 'test_progs -t sockmap_listen': [ 1022.710250][ T2556] general protection fault, ... [...] [ 1022.712830][ T2556] Workqueue: events sk_psock_backlog [ 1022.713262][ T2556] RIP: 0010:skb_dequeue+0x4c/0x80 [ 1022.713653][ T2556] Code: ... [...] [ 1022.720699][ T2556] Call Trace: [ 1022.720984][ T2556] <TASK> [ 1022.721254][ T2556] ? die_addr+0x32/0x80^M [ 1022.721589][ T2556] ? exc_general_protection+0x25a/0x4b0 [ 1022.722026][ T2556] ? asm_exc_general_protection+0x22/0x30 [ 1022.722489][ T2556] ? skb_dequeue+0x4c/0x80 [ 1022.722854][ T2556] sk_psock_backlog+0x27a/0x300 [ 1022.723243][ T2556] process_one_work+0x2a7/0x5b0 [ 1022.723633][ T2556] worker_thread+0x4f/0x3a0 [ 1022.723998][ T2556] ? __pfx_worker_thread+0x10/0x10 [ 1022.724386][ T2556] kthread+0xfd/0x130 [ 1022.724709][ T2556] ? __pfx_kthread+0x10/0x10 [ 1022.725066][ T2556] ret_from_fork+0x2d/0x50 [ 1022.725409][ T2556] ? __pfx_kthread+0x10/0x10 [ 1022.725799][ T2556] ret_from_fork_asm+0x1b/0x30 [ 1022.726201][ T2556] </TASK> To fix we add an skb_get() before passing the skb to be enqueued in the engress queue. This bumps the skb->users refcnt so that consume_skb() and kfree_skb will not immediately free the sk_buff. With this we can be sure the skb is still around when we do the dequeue. Then we just need to decrement the refcnt or free the skb in the backlog case which we do by calling kfree_skb() on the ingress case as well as the sendmsg case. Before locking change from fixes tag we had the sock locked so we couldn't race with user and there was no issue here. Fixes: 799aa7f98d53e ("skmsg: Avoid lock_sock() in sk_psock_backlog()") Reported-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Xu Kuohai <xukuohai@huawei.com> Tested-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20230901202137.214666-1-john.fastabend@gmail.com
2023-09-04fbdev/g364fb: fix build failure with mipsSudip Mukherjee
Fix the typo which resulted in the driver using FB_DEFAULT_IOMEM_HELPERS instead of FB_DEFAULT_IOMEM_OPS as the fbdev I/O helpers. Fixes: 501126083855 ("fbdev/g364fb: Use fbdev I/O helpers") Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20230902095102.5908-1-sudip.mukherjee@codethink.co.uk
2023-09-04net: phy: micrel: Correct bit assignments for phy_device flagsOleksij Rempel
Previously, the defines for phy_device flags in the Micrel driver were ambiguous in their representation. They were intended to be bit masks but were mistakenly defined as bit positions. This led to the following issues: - MICREL_KSZ8_P1_ERRATA, designated for KSZ88xx switches, overlapped with MICREL_PHY_FXEN and MICREL_PHY_50MHZ_CLK. - Due to this overlap, the code path for MICREL_PHY_FXEN, tailored for the KSZ8041 PHY, was not executed for KSZ88xx PHYs. - Similarly, the code associated with MICREL_PHY_50MHZ_CLK wasn't triggered for KSZ88xx. To rectify this, all three flags have now been explicitly converted to use the `BIT()` macro, ensuring they are defined as bit masks and preventing potential overlaps in the future. Fixes: 49011e0c1555 ("net: phy: micrel: ksz886x/ksz8081: add cabletest support") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-04net: ipv6/addrconf: avoid integer underflow in ipv6_create_tempaddrAlex Henrie
The existing code incorrectly casted a negative value (the result of a subtraction) to an unsigned value without checking. For example, if /proc/sys/net/ipv6/conf/*/temp_prefered_lft was set to 1, the preferred lifetime would jump to 4 billion seconds. On my machine and network the shortest lifetime that avoided underflow was 3 seconds. Fixes: 76506a986dc3 ("IPv6: fix DESYNC_FACTOR") Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-04veth: Fixing transmit return status for dropped packetsLiang Chen
The veth_xmit function returns NETDEV_TX_OK even when packets are dropped. This behavior leads to incorrect calculations of statistics counts, as well as things like txq->trans_start updates. Fixes: e314dbdc1c0d ("[NET]: Virtual ethernet device driver.") Signed-off-by: Liang Chen <liangchen.linux@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-04gve: fix frag_list chainingEric Dumazet
gve_rx_append_frags() is able to build skbs chained with frag_list, like GRO engine. Problem is that shinfo->frag_list should only be used for the head of the chain. All other links should use skb->next pointer. Otherwise, built skbs are not valid and can cause crashes. Equivalent code in GRO (skb_gro_receive()) is: if (NAPI_GRO_CB(p)->last == p) skb_shinfo(p)->frag_list = skb; else NAPI_GRO_CB(p)->last->next = skb; NAPI_GRO_CB(p)->last = skb; Fixes: 9b8dd5e5ea48 ("gve: DQO: Add RX path") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Bailey Forrest <bcf@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Catherine Sullivan <csully@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-04net: deal with integer overflows in kmalloc_reserve()Eric Dumazet
Blamed commit changed: ptr = kmalloc(size); if (ptr) size = ksize(ptr); to: size = kmalloc_size_roundup(size); ptr = kmalloc(size); This allowed various crash as reported by syzbot [1] and Kyle Zeng. Problem is that if @size is bigger than 0x80000001, kmalloc_size_roundup(size) returns 2^32. kmalloc_reserve() uses a 32bit variable (obj_size), so 2^32 is truncated to 0. kmalloc(0) returns ZERO_SIZE_PTR which is not handled by skb allocations. Following trace can be triggered if a netdev->mtu is set close to 0x7fffffff We might in the future limit netdev->mtu to more sensible limit (like KMALLOC_MAX_SIZE). This patch is based on a syzbot report, and also a report and tentative fix from Kyle Zeng. [1] BUG: KASAN: user-memory-access in __build_skb_around net/core/skbuff.c:294 [inline] BUG: KASAN: user-memory-access in __alloc_skb+0x3c4/0x6e8 net/core/skbuff.c:527 Write of size 32 at addr 00000000fffffd10 by task syz-executor.4/22554 CPU: 1 PID: 22554 Comm: syz-executor.4 Not tainted 6.1.39-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/03/2023 Call trace: dump_backtrace+0x1c8/0x1f4 arch/arm64/kernel/stacktrace.c:279 show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:286 __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x120/0x1a0 lib/dump_stack.c:106 print_report+0xe4/0x4b4 mm/kasan/report.c:398 kasan_report+0x150/0x1ac mm/kasan/report.c:495 kasan_check_range+0x264/0x2a4 mm/kasan/generic.c:189 memset+0x40/0x70 mm/kasan/shadow.c:44 __build_skb_around net/core/skbuff.c:294 [inline] __alloc_skb+0x3c4/0x6e8 net/core/skbuff.c:527 alloc_skb include/linux/skbuff.h:1316 [inline] igmpv3_newpack+0x104/0x1088 net/ipv4/igmp.c:359 add_grec+0x81c/0x1124 net/ipv4/igmp.c:534 igmpv3_send_cr net/ipv4/igmp.c:667 [inline] igmp_ifc_timer_expire+0x1b0/0x1008 net/ipv4/igmp.c:810 call_timer_fn+0x1c0/0x9f0 kernel/time/timer.c:1474 expire_timers kernel/time/timer.c:1519 [inline] __run_timers+0x54c/0x710 kernel/time/timer.c:1790 run_timer_softirq+0x28/0x4c kernel/time/timer.c:1803 _stext+0x380/0xfbc ____do_softirq+0x14/0x20 arch/arm64/kernel/irq.c:79 call_on_irq_stack+0x24/0x4c arch/arm64/kernel/entry.S:891 do_softirq_own_stack+0x20/0x2c arch/arm64/kernel/irq.c:84 invoke_softirq kernel/softirq.c:437 [inline] __irq_exit_rcu+0x1c0/0x4cc kernel/softirq.c:683 irq_exit_rcu+0x14/0x78 kernel/softirq.c:695 el0_interrupt+0x7c/0x2e0 arch/arm64/kernel/entry-common.c:717 __el0_irq_handler_common+0x18/0x24 arch/arm64/kernel/entry-common.c:724 el0t_64_irq_handler+0x10/0x1c arch/arm64/kernel/entry-common.c:729 el0t_64_irq+0x1a0/0x1a4 arch/arm64/kernel/entry.S:584 Fixes: 12d6c1d3a2ad ("skbuff: Proactively round up to kmalloc bucket size") Reported-by: syzbot <syzkaller@googlegroups.com> Reported-by: Kyle Zeng <zengyhkyle@gmail.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-09-03ksmbd: remove experimental warningSteve French
ksmbd has made significant improvements over the past two years and is regularly tested and used. Remove the experimental warning. Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2023-09-03virtio_ring: fix avail_wrap_counter in virtqueue_add_packedYuan Yao
In current packed virtqueue implementation, the avail_wrap_counter won't flip, in the case when the driver supplies a descriptor chain with a length equals to the queue size; total_sg == vq->packed.vring.num. Let’s assume the following situation: vq->packed.vring.num=4 vq->packed.next_avail_idx: 1 vq->packed.avail_wrap_counter: 0 Then the driver adds a descriptor chain containing 4 descriptors. We expect the following result with avail_wrap_counter flipped: vq->packed.next_avail_idx: 1 vq->packed.avail_wrap_counter: 1 But, the current implementation gives the following result: vq->packed.next_avail_idx: 1 vq->packed.avail_wrap_counter: 0 To reproduce the bug, you can set a packed queue size as small as possible, so that the driver is more likely to provide a descriptor chain with a length equal to the packed queue size. For example, in qemu run following commands: sudo qemu-system-x86_64 \ -enable-kvm \ -nographic \ -kernel "path/to/kernel_image" \ -m 1G \ -drive file="path/to/rootfs",if=none,id=disk \ -device virtio-blk,drive=disk \ -drive file="path/to/disk_image",if=none,id=rwdisk \ -device virtio-blk,drive=rwdisk,packed=on,queue-size=4,\ indirect_desc=off \ -append "console=ttyS0 root=/dev/vda rw init=/bin/bash" Inside the VM, create a directory and mount the rwdisk device on it. The rwdisk will hang and mount operation will not complete. This commit fixes the wrap counter error by flipping the packed.avail_wrap_counter, when start of descriptor chain equals to the end of descriptor chain (head == i). Fixes: 1ce9e6055fa0 ("virtio_ring: introduce packed ring support") Signed-off-by: Yuan Yao <yuanyaogoog@chromium.org> Message-Id: <20230808051110.3492693-1-yuanyaogoog@chromium.org> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03virtio_vdpa: build affinity masks conditionallyJason Wang
We try to build affinity mask via create_affinity_masks() unconditionally which may lead several issues: - the affinity mask is not used for parent without affinity support (only VDUSE support the affinity now) - the logic of create_affinity_masks() might not work for devices other than block. For example it's not rare in the networking device where the number of queues could exceed the number of CPUs. Such case breaks the current affinity logic which is based on group_cpus_evenly() who assumes the number of CPUs are not less than the number of groups. This can trigger a warning[1]: if (ret >= 0) WARN_ON(nr_present + nr_others < numgrps); Fixing this by only build the affinity masks only when - Driver passes affinity descriptor, driver like virtio-blk can make sure to limit the number of queues when it exceeds the number of CPUs - Parent support affinity setting config ops This help to avoid the warning. More optimizations could be done on top. [1] [ 682.146655] WARNING: CPU: 6 PID: 1550 at lib/group_cpus.c:400 group_cpus_evenly+0x1aa/0x1c0 [ 682.146668] CPU: 6 PID: 1550 Comm: vdpa Not tainted 6.5.0-rc5jason+ #79 [ 682.146671] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014 [ 682.146673] RIP: 0010:group_cpus_evenly+0x1aa/0x1c0 [ 682.146676] Code: 4c 89 e0 5b 5d 41 5c 41 5d 41 5e c3 cc cc cc cc e8 1b c4 74 ff 48 89 ef e8 13 ac 98 ff 4c 89 e7 45 31 e4 e8 08 ac 98 ff eb c2 <0f> 0b eb b6 e8 fd 05 c3 00 45 31 e4 eb e5 cc cc cc cc cc cc cc cc [ 682.146679] RSP: 0018:ffffc9000215f498 EFLAGS: 00010293 [ 682.146682] RAX: 000000000001f1e0 RBX: 0000000000000041 RCX: 0000000000000000 [ 682.146684] RDX: ffff888109922058 RSI: 0000000000000041 RDI: 0000000000000030 [ 682.146686] RBP: ffff888109922058 R08: ffffc9000215f498 R09: ffffc9000215f4a0 [ 682.146687] R10: 00000000000198d0 R11: 0000000000000030 R12: ffff888107e02800 [ 682.146689] R13: 0000000000000030 R14: 0000000000000030 R15: 0000000000000041 [ 682.146692] FS: 00007fef52315740(0000) GS:ffff888237380000(0000) knlGS:0000000000000000 [ 682.146695] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 682.146696] CR2: 00007fef52509000 CR3: 0000000110dbc004 CR4: 0000000000370ee0 [ 682.146698] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 682.146700] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 682.146701] Call Trace: [ 682.146703] <TASK> [ 682.146705] ? __warn+0x7b/0x130 [ 682.146709] ? group_cpus_evenly+0x1aa/0x1c0 [ 682.146712] ? report_bug+0x1c8/0x1e0 [ 682.146717] ? handle_bug+0x3c/0x70 [ 682.146721] ? exc_invalid_op+0x14/0x70 [ 682.146723] ? asm_exc_invalid_op+0x16/0x20 [ 682.146727] ? group_cpus_evenly+0x1aa/0x1c0 [ 682.146729] ? group_cpus_evenly+0x15c/0x1c0 [ 682.146731] create_affinity_masks+0xaf/0x1a0 [ 682.146735] virtio_vdpa_find_vqs+0x83/0x1d0 [ 682.146738] ? __pfx_default_calc_sets+0x10/0x10 [ 682.146742] virtnet_find_vqs+0x1f0/0x370 [ 682.146747] virtnet_probe+0x501/0xcd0 [ 682.146749] ? vp_modern_get_status+0x12/0x20 [ 682.146751] ? get_cap_addr.isra.0+0x10/0xc0 [ 682.146754] virtio_dev_probe+0x1af/0x260 [ 682.146759] really_probe+0x1a5/0x410 Fixes: 3dad56823b53 ("virtio-vdpa: Support interrupt affinity spreading mechanism") Signed-off-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230811091539.1359865-1-jasowang@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03virtio_net: merge dma operations when filling mergeable buffersXuan Zhuo
Currently, the virtio core will perform a dma operation for each buffer. Although, the same page may be operated multiple times. This patch, the driver does the dma operation and manages the dma address based the feature premapped of virtio core. This way, we can perform only one dma operation for the pages of the alloc frag. This is beneficial for the iommu device. kernel command line: intel_iommu=on iommu.passthrough=0 | strict=0 | strict=1 Before | 775496pps | 428614pps After | 1109316pps | 742853pps Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Message-Id: <20230810123057.43407-13-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03virtio_ring: introduce dma sync api for virtqueueXuan Zhuo
These API has been introduced: * virtqueue_dma_need_sync * virtqueue_dma_sync_single_range_for_cpu * virtqueue_dma_sync_single_range_for_device These APIs can be used together with the premapped mechanism to sync the DMA address. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Message-Id: <20230810123057.43407-12-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03virtio_ring: introduce dma map api for virtqueueXuan Zhuo
Added virtqueue_dma_map_api* to map DMA addresses for virtual memory in advance. The purpose is to keep memory mapped across multiple add/get buf operations. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Message-Id: <20230810123057.43407-11-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03virtio_ring: introduce virtqueue_reset()Xuan Zhuo
Introduce virtqueue_reset() to release all buffer inside vq. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230810123057.43407-10-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03virtio_ring: separate the logic of reset/enable from virtqueue_resizeXuan Zhuo
The subsequent reset function will reuse these logic. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230810123057.43407-9-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03virtio_ring: correct the expression of the description of virtqueue_resize()Xuan Zhuo
Modify the "useless" to a more accurate "unused". Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230810123057.43407-8-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03virtio_ring: skip unmap for premappedXuan Zhuo
Now we add a case where we skip dma unmap, the vq->premapped is true. We can't just rely on use_dma_api to determine whether to skip the dma operation. For convenience, I introduced the "do_unmap". By default, it is the same as use_dma_api. If the driver is configured with premapped, then do_unmap is false. So as long as do_unmap is false, for addr of desc, we should skip dma unmap operation. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Message-Id: <20230810123057.43407-7-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03virtio_ring: introduce virtqueue_dma_dev()Xuan Zhuo
Added virtqueue_dma_dev() to get DMA device for virtio. Then the caller can do dma operation in advance. The purpose is to keep memory mapped across multiple add/get buf operations. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230810123057.43407-6-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03virtio_ring: support add premapped bufXuan Zhuo
If the vq is the premapped mode, use the sg_dma_address() directly. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Message-Id: <20230810123057.43407-5-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03virtio_ring: introduce virtqueue_set_dma_premapped()Xuan Zhuo
This helper allows the driver change the dma mode to premapped mode. Under the premapped mode, the virtio core do not do dma mapping internally. This just work when the use_dma_api is true. If the use_dma_api is false, the dma options is not through the DMA APIs, that is not the standard way of the linux kernel. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Message-Id: <20230810123057.43407-4-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03virtio_ring: put mapping error check in vring_map_one_sgXuan Zhuo
This patch put the dma addr error check in vring_map_one_sg(). The benefits of doing this: 1. reduce one judgment of vq->use_dma_api. 2. make vring_map_one_sg more simple, without calling vring_mapping_error to check the return value. simplifies subsequent code Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230810123057.43407-3-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03virtio_ring: check use_dma_api before unmap desc for indirectXuan Zhuo
Inside detach_buf_split(), if use_dma_api is false, vring_unmap_one_split_indirect will be called many times, but actually nothing is done. So this patch check use_dma_api firstly. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Message-Id: <20230810123057.43407-2-xuanzhuo@linux.alibaba.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03vdpa_sim: offer VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OKEugenio Pérez
Start offering the feature in the simulator. Other parent drivers can follow this code to offer it too. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Shannon Nelson <shannon.nelson@amd.com> Message-Id: <20230609092127.170673-5-eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03vdpa: add get_backend_features vdpa operationEugenio Pérez
This operation allow vdpa parent to expose its own backend feature bits. Next patches introduce a feature not compatible with all parent drivers: the ability to enable vq after driver_ok. Each parent must declare if it allows it or not. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Shannon Nelson <shannon.nelson@amd.com> Message-Id: <20230609092127.170673-4-eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03vdpa: accept VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend featureEugenio Pérez
Accepting VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK backend feature if userland sets it. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Shannon Nelson <shannon.nelson@amd.com> Message-Id: <20230609092127.170673-3-eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03vdpa: add VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK flagEugenio Pérez
This feature flag allows the driver enabling virtqueues both before and after DRIVER_OK. This is needed for software assisted live migration, so userland can restore the device status in devices with control virtqueue before the dataplane is enabled. Signed-off-by: Eugenio Pérez <eperezma@redhat.com> Acked-by: Shannon Nelson <shannon.nelson@amd.com> Message-Id: <20230609092127.170673-2-eperezma@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03vdpa/mlx5: Remove unused function declarationsYue Haibing
Commit 29064bfdabd5 ("vdpa/mlx5: Add support library for mlx5 VDPA implementation") declared but never implemented these. Signed-off-by: Yue Haibing <yuehaibing@huawei.com> Message-Id: <20230803143041.23388-1-yuehaibing@huawei.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-09-03Merge tag 'dmaengine-6.6-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine updates from Vinod Koul: "New controller support and updates to drivers. New support: - Qualcomm SM6115 and QCM2290 dmaengine support - at_xdma support for microchip,sam9x7 controller Updates: - idxd updates for wq simplification and ats knob updates - fsl edma updates for v3 support - Xilinx AXI4-Stream control support - Yaml conversion for bcm dma binding" * tag 'dmaengine-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (53 commits) dmaengine: fsl-edma: integrate v3 support dt-bindings: fsl-dma: fsl-edma: add edma3 compatible string dmaengine: fsl-edma: move tcd into struct fsl_dma_chan dmaengine: fsl-edma: refactor chan_name setup and safety dmaengine: fsl-edma: move clearing of register interrupt into setup_irq function dmaengine: fsl-edma: refactor using devm_clk_get_enabled dmaengine: fsl-edma: simply ATTR_DSIZE and ATTR_SSIZE by using ffs() dmaengine: fsl-edma: move common IRQ handler to common.c dmaengine: fsl-edma: Remove enum edma_version dmaengine: fsl-edma: transition from bool fields to bitmask flags in drvdata dmaengine: fsl-edma: clean up EXPORT_SYMBOL_GPL in fsl-edma-common.c dmaengine: fsl-edma: fix build error when arch is s390 dmaengine: idxd: Fix issues with PRS disable sysfs knob dmaengine: idxd: Allow ATS disable update only for configurable devices dmaengine: xilinx_dma: Program interrupt delay timeout dmaengine: xilinx_dma: Use tasklet_hi_schedule for timing critical usecase dmaengine: xilinx_dma: Freeup active list based on descriptor completion bit dmaengine: xilinx_dma: Increase AXI DMA transaction segment count dmaengine: xilinx_dma: Pass AXI4-Stream control words to dma client dt-bindings: dmaengine: xilinx_dma: Add xlnx,irq-delay property ...
2023-09-03Merge tag 'phy-for-6.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy updates from Vinod Koul: "As usual a couple of new drivers, a bunch of new device support and few updates to existing drivers New Support: - Starfive dphy rx, JH7110 usb and pcie support - Rockchip rv1126 inno-dsi phy, rk3588 usb and pcie support - Qualcomm sa8775p PCIe support, M31 USB PHY driver - Samsung Exynos850 usb support Updates: - Mediatek dsi driver clock updates - Qualcomm sm8150 combo phy with reworking of qmp pcie driver - Xilinx zynqmp runtime PM support" * tag 'phy-for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (83 commits) phy: exynos5-usbdrd: Add Exynos850 support phy: exynos5-usbdrd: Add 26MHz ref clk support phy: exynos5-usbdrd: Make it possible to pass custom phy ops dt-bindings: phy: samsung,usb3-drd-phy: Add Exynos850 support phy: qcom-qmp-combo: fix clock probing phy: qcom-qmp-pcie: support SM8150 PCIe QMP PHYs phy: qcom-qmp-pcie: populate offsets configuration phy: qcom-qmp-pcie: simplify clock handling phy: qcom-qmp-pcie: keep offset tables sorted phy: qcom-qmp-pcie: drop ln_shrd from v5_20 config dt-bindings: phy: qcom,qmp-pcie: describe SM8150 PCIe PHYs dt-bindings: phy: migrate QMP PCIe PHY bindings to qcom,sc8280xp-qmp-pcie-phy.yaml phy: fsl-imx8mq-usb: add dev_err_probe if getting vbus failed phy: qcom: Introduce M31 USB PHY driver dt-bindings: phy: qcom,m31: Document qcom,m31 USB phy phy: rockchip: inno-dsidphy: Add rv1126 support dt-bindings: phy: rockchip-inno-dsidphy: Document rv1126 dt-bindings: phy: mediatek,tphy: allow simple nodename pattern phy: amlogic: meson-g12a-usb2: fix Wvoid-pointer-to-enum-cast warning phy: marvell pxa-usb: fix Wvoid-pointer-to-enum-cast warning ...
2023-09-03Merge tag 'soundwire-6.6-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire Pull soundwire updates from Vinod Koul: "Device numbering and intel driver changes are main features: - Core support for soundwire device number allocation - intel driver updates for adding hw_params for DAI ops, hybrid number allocation and power managemnt callback updates - DT header include changes for subsystem" * tag 'soundwire-6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: soundwire: intel_ace2x: add DAI hw_params/prepare/hw_free callbacks soundwire: intel_auxdevice: add hybrid IDA-based device_number allocation soundwire: bus: add callbacks for device_number allocation soundwire: extend parameters of new_peripheral_assigned() callback soundWire: intel_auxdevice: resume 'sdw-master' on startup and system resume soundwire: intel_auxdevice: enable pm_runtime earlier on startup soundwire: Explicitly include correct DT includes
2023-09-04kbuild: Show marked Kconfig fragments in "help"Kees Cook
Currently the Kconfig fragments in kernel/configs and arch/*/configs that aren't used internally aren't discoverable through "make help", which consists of hard-coded lists of config fragments. Instead, list all the fragment targets that have a "# Help: " comment prefix so the targets can be generated dynamically. Add logic to the Makefile to search for and display the fragment and comment. Add comments to fragments that are intended to be direct targets. Signed-off-by: Kees Cook <keescook@chromium.org> Co-developed-by: Masahiro Yamada <masahiroy@kernel.org> Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Reviewed-by: Nicolas Schier <nicolas@fjasle.eu> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2023-09-03Merge tag 'mtd/for-6.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull MTD updates from Miquel Raynal: "Core MTD changes: - Use refcount to prevent corruption - Call external _get and _put in right order - Fix use-after-free in mtd release - Explicitly include correct DT includes - Clean refcounting with MTD_PARTITIONED_MASTER - mtdblock: make warning messages ratelimited - dt-bindings: Add SEAMA partition bindings Device driver changes: - Use devm helper functions - Fix questionable cast, remove pointless ones. - error handling fixes - add support for new chip versions - update DT bindings - misc cleanups - fix typos, whitespace, indentation" * tag 'mtd/for-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (105 commits) dt-bindings: mtd: amlogic,meson-nand: drop unneeded quotes mtd: spear_smi: Use helper function devm_clk_get_enabled() mtd: rawnand: orion: Use helper function devm_clk_get_optional_enabled() mtd: rawnand: vf610_nfc: Use helper function devm_clk_get_enabled() mtd: rawnand: sunxi: Use helper function devm_clk_get_enabled() mtd: rawnand: stm32_fmc2: Use helper function devm_clk_get_enabled() mtd: rawnand: mtk: Use helper function devm_clk_get_enabled() mtd: rawnand: mpc5121: Use helper function devm_clk_get_enabled() mtd: rawnand: lpc32xx_slc: Use helper function devm_clk_get_enabled() mtd: rawnand: intel: Use helper function devm_clk_get_enabled() mtd: rawnand: fsmc: Use helper function devm_clk_get_enabled() mtd: rawnand: arasan: Use helper function devm_clk_get_enabled() mtd: rawnand: qcom: Add read/read_start ops in exec_op path mtd: rawnand: qcom: Clear buf_count and buf_start in raw read mtd: maps: fix -Wvoid-pointer-to-enum-cast warning mtd: rawnand: fix -Wvoid-pointer-to-enum-cast warning mtd: rawnand: fsmc: handle clk prepare error in fsmc_nand_resume() mtd: rawnand: Propagate error and simplify ternary operators for brcmstb_nand_wait_for_completion() mtd: rawnand: qcom: Sort includes alphabetically mtd: rawnand: qcom: Do not override the error no of submit_descs() ...