summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-01-08riscv: use local label names instead of global ones in assemblyClément Léger
Local labels should be prefix by '.L' or they'll be exported in the symbol table. Additionally, this messes up the backtrace by displaying an incorrect symbol: ... [ 12.751810] [<ffffffff80441628>] _copy_from_user+0x28/0xc2 [ 12.752035] [<ffffffff800152ca>] handle_misaligned_load+0x1ca/0x2fc [ 12.752310] [<ffffffff80a033e8>] do_trap_load_misaligned+0x24/0xee [ 12.752596] [<ffffffff80a0dcae>] _new_vmalloc_restore_context_a0+0xc2/0xce After: ... [ 10.243916] [<ffffffff804415e4>] _copy_from_user+0x28/0xc2 [ 10.244026] [<ffffffff800152ca>] handle_misaligned_load+0x1ca/0x2fc [ 10.244150] [<ffffffff80a033a0>] do_trap_load_misaligned+0x24/0xee [ 10.244268] [<ffffffff80a0dc66>] handle_exception+0x146/0x152 Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Fixes: 503638e0babf3 ("riscv: Stop emitting preventive sfence.vma for new vmalloc mappings") Link: https://lore.kernel.org/r/20250103141814.508865-1-cleger@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08riscv: qspinlock: Fixup _Q_PENDING_LOOPS definitionGuo Ren
When CONFIG_RISCV_QUEUED_SPINLOCKS=y, the _Q_PENDING_LOOPS definition is missing. Add the _Q_PENDING_LOOPS definition for pure qspinlock usage. Fixes: ab83647fadae ("riscv: Add qspinlock support") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20241215135252.201983-1-guoren@kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08riscv: stacktrace: fix backtracing through exceptionsClément Léger
Prior to commit 5d5fc33ce58e ("riscv: Improve exception and system call latency"), backtrace through exception worked since ra was filled with ret_from_exception symbol address and the stacktrace code checked 'pc' to be equal to that symbol. Now that handle_exception uses regular 'call' instructions, this isn't working anymore and backtrace stops at handle_exception(). Since there are multiple call site to C code in the exception handling path, rather than checking multiple potential return addresses, add a new symbol at the end of exception handling and check pc to be in that range. Fixes: 5d5fc33ce58e ("riscv: Improve exception and system call latency") Signed-off-by: Clément Léger <cleger@rivosinc.com> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20241209155714.1239665-1-cleger@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08riscv: mm: Fix the out of bound issue of vmemmap addressXu Lu
In sparse vmemmap model, the virtual address of vmemmap is calculated as: ((struct page *)VMEMMAP_START - (phys_ram_base >> PAGE_SHIFT)). And the struct page's va can be calculated with an offset: (vmemmap + (pfn)). However, when initializing struct pages, kernel actually starts from the first page from the same section that phys_ram_base belongs to. If the first page's physical address is not (phys_ram_base >> PAGE_SHIFT), then we get an va below VMEMMAP_START when calculating va for it's struct page. For example, if phys_ram_base starts from 0x82000000 with pfn 0x82000, the first page in the same section is actually pfn 0x80000. During init_unavailable_range(), we will initialize struct page for pfn 0x80000 with virtual address ((struct page *)VMEMMAP_START - 0x2000), which is below VMEMMAP_START as well as PCI_IO_END. This commit fixes this bug by introducing a new variable 'vmemmap_start_pfn' which is aligned with memory section size and using it to calculate vmemmap address instead of phys_ram_base. Fixes: a11dd49dcb93 ("riscv: Sparse-Memory/vmemmap out-of-bounds fix") Signed-off-by: Xu Lu <luxu.kernel@bytedance.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Link: https://lore.kernel.org/r/20241209122617.53341-1-luxu.kernel@bytedance.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08cpuidle: riscv-sbi: fix device node release in early exit of ↵Javier Carrasco
for_each_possible_cpu The 'np' device_node is initialized via of_cpu_device_node_get(), which requires explicit calls to of_node_put() when it is no longer required to avoid leaking the resource. Instead of adding the missing calls to of_node_put() in all execution paths, use the cleanup attribute for 'np' by means of the __free() macro, which automatically calls of_node_put() when the variable goes out of scope. Given that 'np' is only used within the for_each_possible_cpu(), reduce its scope to release the nood after every iteration of the loop. Fixes: 6abf32f1d9c5 ("cpuidle: Add RISC-V SBI CPU idle driver") Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com> Link: https://lore.kernel.org/r/20241116-cpuidle-riscv-sbi-cleanup-v3-1-a3a46372ce08@gmail.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08riscv: kprobes: Fix incorrect address calculationNam Cao
p->ainsn.api.insn is a pointer to u32, therefore arithmetic operations are multiplied by four. This is clearly undesirable for this case. Cast it to (void *) first before any calculation. Below is a sample before/after. The dumped memory is two kprobe slots, the first slot has - c.addiw a0, 0x1c (0x7125) - ebreak (0x00100073) and the second slot has: - c.addiw a0, -4 (0x7135) - ebreak (0x00100073) Before this patch: (gdb) x/16xh 0xff20000000135000 0xff20000000135000: 0x7125 0x0000 0x0000 0x0000 0x7135 0x0010 0x0000 0x0000 0xff20000000135010: 0x0073 0x0010 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 After this patch: (gdb) x/16xh 0xff20000000125000 0xff20000000125000: 0x7125 0x0073 0x0010 0x0000 0x7135 0x0073 0x0010 0x0000 0xff20000000125010: 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 Fixes: b1756750a397 ("riscv: kprobes: Use patch_text_nosync() for insn slots") Signed-off-by: Nam Cao <namcao@linutronix.de> Cc: stable@vger.kernel.org Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20241119111056.2554419-1-namcao@linutronix.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08Merge branch 'there-are-some-bugfix-for-the-hns3-ethernet-driver'Jakub Kicinski
Jijie Shao says: ==================== There are some bugfix for the HNS3 ethernet driver There's a series of bugfix that's been accepted: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=d80a3091308491455b6501b1c4b68698c4a7cd24 However, The series is making the driver poke into IOMMU internals instead of implementing appropriate IOMMU workarounds. After discussion, the series was reverted: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=249cfa318fb1b77eb726c2ff4f74c9685f04e568 But only two patches are related to the IOMMU. Other patches involve only the modification of the driver. This series resends other patches. v2*: https://lore.kernel.org/20241217010839.1742227-1-shaojijie@huawei.com v2: https://lore.kernel.org/20241216132346.1197079-1-shaojijie@huawei.com v1: https://lore.kernel.org/20241107133023.3813095-1-shaojijie@huawei.com ==================== Link: https://patch.msgid.link/20250106143642.539698-1-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08net: hns3: fix kernel crash when 1588 is sent on HIP08 devicesJie Wang
Currently, HIP08 devices does not register the ptp devices, so the hdev->ptp is NULL. But the tx process would still try to set hardware time stamp info with SKBTX_HW_TSTAMP flag and cause a kernel crash. [ 128.087798] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000018 ... [ 128.280251] pc : hclge_ptp_set_tx_info+0x2c/0x140 [hclge] [ 128.286600] lr : hclge_ptp_set_tx_info+0x20/0x140 [hclge] [ 128.292938] sp : ffff800059b93140 [ 128.297200] x29: ffff800059b93140 x28: 0000000000003280 [ 128.303455] x27: ffff800020d48280 x26: ffff0cb9dc814080 [ 128.309715] x25: ffff0cb9cde93fa0 x24: 0000000000000001 [ 128.315969] x23: 0000000000000000 x22: 0000000000000194 [ 128.322219] x21: ffff0cd94f986000 x20: 0000000000000000 [ 128.328462] x19: ffff0cb9d2a166c0 x18: 0000000000000000 [ 128.334698] x17: 0000000000000000 x16: ffffcf1fc523ed24 [ 128.340934] x15: 0000ffffd530a518 x14: 0000000000000000 [ 128.347162] x13: ffff0cd6bdb31310 x12: 0000000000000368 [ 128.353388] x11: ffff0cb9cfbc7070 x10: ffff2cf55dd11e02 [ 128.359606] x9 : ffffcf1f85a212b4 x8 : ffff0cd7cf27dab0 [ 128.365831] x7 : 0000000000000a20 x6 : ffff0cd7cf27d000 [ 128.372040] x5 : 0000000000000000 x4 : 000000000000ffff [ 128.378243] x3 : 0000000000000400 x2 : ffffcf1f85a21294 [ 128.384437] x1 : ffff0cb9db520080 x0 : ffff0cb9db500080 [ 128.390626] Call trace: [ 128.393964] hclge_ptp_set_tx_info+0x2c/0x140 [hclge] [ 128.399893] hns3_nic_net_xmit+0x39c/0x4c4 [hns3] [ 128.405468] xmit_one.constprop.0+0xc4/0x200 [ 128.410600] dev_hard_start_xmit+0x54/0xf0 [ 128.415556] sch_direct_xmit+0xe8/0x634 [ 128.420246] __dev_queue_xmit+0x224/0xc70 [ 128.425101] dev_queue_xmit+0x1c/0x40 [ 128.429608] ovs_vport_send+0xac/0x1a0 [openvswitch] [ 128.435409] do_output+0x60/0x17c [openvswitch] [ 128.440770] do_execute_actions+0x898/0x8c4 [openvswitch] [ 128.446993] ovs_execute_actions+0x64/0xf0 [openvswitch] [ 128.453129] ovs_dp_process_packet+0xa0/0x224 [openvswitch] [ 128.459530] ovs_vport_receive+0x7c/0xfc [openvswitch] [ 128.465497] internal_dev_xmit+0x34/0xb0 [openvswitch] [ 128.471460] xmit_one.constprop.0+0xc4/0x200 [ 128.476561] dev_hard_start_xmit+0x54/0xf0 [ 128.481489] __dev_queue_xmit+0x968/0xc70 [ 128.486330] dev_queue_xmit+0x1c/0x40 [ 128.490856] ip_finish_output2+0x250/0x570 [ 128.495810] __ip_finish_output+0x170/0x1e0 [ 128.500832] ip_finish_output+0x3c/0xf0 [ 128.505504] ip_output+0xbc/0x160 [ 128.509654] ip_send_skb+0x58/0xd4 [ 128.513892] udp_send_skb+0x12c/0x354 [ 128.518387] udp_sendmsg+0x7a8/0x9c0 [ 128.522793] inet_sendmsg+0x4c/0x8c [ 128.527116] __sock_sendmsg+0x48/0x80 [ 128.531609] __sys_sendto+0x124/0x164 [ 128.536099] __arm64_sys_sendto+0x30/0x5c [ 128.540935] invoke_syscall+0x50/0x130 [ 128.545508] el0_svc_common.constprop.0+0x10c/0x124 [ 128.551205] do_el0_svc+0x34/0xdc [ 128.555347] el0_svc+0x20/0x30 [ 128.559227] el0_sync_handler+0xb8/0xc0 [ 128.563883] el0_sync+0x160/0x180 Fixes: 0bf5eb788512 ("net: hns3: add support for PTP") Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20250106143642.539698-8-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08net: hns3: fixed hclge_fetch_pf_reg accesses bar space out of bounds issueHao Lan
The TQP BAR space is divided into two segments. TQPs 0-1023 and TQPs 1024-1279 are in different BAR space addresses. However, hclge_fetch_pf_reg does not distinguish the tqp space information when reading the tqp space information. When the number of TQPs is greater than 1024, access bar space overwriting occurs. The problem of different segments has been considered during the initialization of tqp.io_base. Therefore, tqp.io_base is directly used when the queue is read in hclge_fetch_pf_reg. The error message: Unable to handle kernel paging request at virtual address ffff800037200000 pc : hclge_fetch_pf_reg+0x138/0x250 [hclge] lr : hclge_get_regs+0x84/0x1d0 [hclge] Call trace: hclge_fetch_pf_reg+0x138/0x250 [hclge] hclge_get_regs+0x84/0x1d0 [hclge] hns3_get_regs+0x2c/0x50 [hns3] ethtool_get_regs+0xf4/0x270 dev_ethtool+0x674/0x8a0 dev_ioctl+0x270/0x36c sock_do_ioctl+0x110/0x2a0 sock_ioctl+0x2ac/0x530 __arm64_sys_ioctl+0xa8/0x100 invoke_syscall+0x4c/0x124 el0_svc_common.constprop.0+0x140/0x15c do_el0_svc+0x30/0xd0 el0_svc+0x1c/0x2c el0_sync_handler+0xb0/0xb4 el0_sync+0x168/0x180 Fixes: 939ccd107ffc ("net: hns3: move dump regs function to a separate file") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20250106143642.539698-7-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08net: hns3: initialize reset_timer before hclgevf_misc_irq_init()Jian Shen
Currently the misc irq is initialized before reset_timer setup. But it will access the reset_timer in the irq handler. So initialize the reset_timer earlier. Fixes: ff200099d271 ("net: hns3: remove unnecessary work in hclgevf_main") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20250106143642.539698-6-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08net: hns3: don't auto enable misc vectorJian Shen
Currently, there is a time window between misc irq enabled and service task inited. If an interrupte is reported at this time, it will cause warning like below: [ 16.324639] Call trace: [ 16.324641] __queue_delayed_work+0xb8/0xe0 [ 16.324643] mod_delayed_work_on+0x78/0xd0 [ 16.324655] hclge_errhand_task_schedule+0x58/0x90 [hclge] [ 16.324662] hclge_misc_irq_handle+0x168/0x240 [hclge] [ 16.324666] __handle_irq_event_percpu+0x64/0x1e0 [ 16.324667] handle_irq_event+0x80/0x170 [ 16.324670] handle_fasteoi_edge_irq+0x110/0x2bc [ 16.324671] __handle_domain_irq+0x84/0xfc [ 16.324673] gic_handle_irq+0x88/0x2c0 [ 16.324674] el1_irq+0xb8/0x140 [ 16.324677] arch_cpu_idle+0x18/0x40 [ 16.324679] default_idle_call+0x5c/0x1bc [ 16.324682] cpuidle_idle_call+0x18c/0x1c4 [ 16.324684] do_idle+0x174/0x17c [ 16.324685] cpu_startup_entry+0x30/0x6c [ 16.324687] secondary_start_kernel+0x1a4/0x280 [ 16.324688] ---[ end trace 6aa0bff672a964aa ]--- So don't auto enable misc vector when request irq.. Fixes: 7be1b9f3e99f ("net: hns3: make hclge_service use delayed workqueue") Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20250106143642.539698-5-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08net: hns3: Resolved the issue that the debugfs query result is inconsistent.Hao Lan
This patch modifies the implementation of debugfs: When the user process stops unexpectedly, not all data of the file system is read. In this case, the save_buf pointer is not released. When the user process is called next time, save_buf is used to copy the cached data to the user space. As a result, the queried data is stale. To solve this problem, this patch implements .open() and .release() handler for debugfs file_operations. moving allocation buffer and execution of the cmd to the .open() handler and freeing in to the .release() handler. Allocate separate buffer for each reader and associate the buffer with the file pointer. When different user read processes no longer share the buffer, the stale data problem is fixed. Fixes: 5e69ea7ee2a6 ("net: hns3: refactor the debugfs process") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Guangwei Zhang <zhangwangwei6@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20250106143642.539698-4-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08net: hns3: fix missing features due to dev->features configuration too earlyHao Lan
Currently, the netdev->features is configured in hns3_nic_set_features. As a result, __netdev_update_features considers that there is no feature difference, and the procedures of the real features are missing. Fixes: 2a7556bb2b73 ("net: hns3: implement ndo_features_check ops for hns3 driver") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20250106143642.539698-3-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08net: hns3: fixed reset failure issues caused by the incorrect reset typeHao Lan
When a reset type that is not supported by the driver is input, a reset pending flag bit of the HNAE3_NONE_RESET type is generated in reset_pending. The driver does not have a mechanism to clear this type of error. As a result, the driver considers that the reset is not complete. This patch provides a mechanism to clear the HNAE3_NONE_RESET flag and the parameter of hnae3_ae_ops.set_default_reset_request is verified. The error message: hns3 0000:39:01.0: cmd failed -16 hns3 0000:39:01.0: hclge device re-init failed, VF is disabled! hns3 0000:39:01.0: failed to reset VF stack hns3 0000:39:01.0: failed to reset VF(4) hns3 0000:39:01.0: prepare reset(2) wait done hns3 0000:39:01.0 eth4: already uninitialized Use the crash tool to view struct hclgevf_dev: struct hclgevf_dev { ... default_reset_request = 0x20, reset_level = HNAE3_NONE_RESET, reset_pending = 0x100, reset_type = HNAE3_NONE_RESET, ... }; Fixes: 720bd5837e37 ("net: hns3: add set_default_reset_request in the hnae3_ae_ops") Signed-off-by: Hao Lan <lanhao@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20250106143642.539698-2-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08riscv: Fix sleeping in invalid context in die()Nam Cao
die() can be called in exception handler, and therefore cannot sleep. However, die() takes spinlock_t which can sleep with PREEMPT_RT enabled. That causes the following warning: BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 285, name: mutex preempt_count: 110001, expected: 0 RCU nest depth: 0, expected: 0 CPU: 0 UID: 0 PID: 285 Comm: mutex Not tainted 6.12.0-rc7-00022-ge19049cf7d56-dirty #234 Hardware name: riscv-virtio,qemu (DT) Call Trace: dump_backtrace+0x1c/0x24 show_stack+0x2c/0x38 dump_stack_lvl+0x5a/0x72 dump_stack+0x14/0x1c __might_resched+0x130/0x13a rt_spin_lock+0x2a/0x5c die+0x24/0x112 do_trap_insn_illegal+0xa0/0xea _new_vmalloc_restore_context_a0+0xcc/0xd8 Oops - illegal instruction [#1] Switch to use raw_spinlock_t, which does not sleep even with PREEMPT_RT enabled. Fixes: 76d2a0493a17 ("RISC-V: Init and Halt Code") Signed-off-by: Nam Cao <namcao@linutronix.de> Cc: stable@vger.kernel.org Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20241118091333.1185288-1-namcao@linutronix.de Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08riscv: module: remove relocation_head rel_entry member allocationClément Léger
relocation_head's list_head member, rel_entry, doesn't need to be allocated, its storage can just be part of the allocated relocation_head. Remove the pointer which allows to get rid of the allocation as well as an existing memory leak found by Kai Zhang using kmemleak. Fixes: 8fd6c5142395 ("riscv: Add remaining module relocations") Reported-by: Kai Zhang <zhangkai@iscas.ac.cn> Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Tested-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20241128081636.3620468-1-cleger@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2025-01-08tcp: Annotate data-race around sk->sk_mark in tcp_v4_send_resetDaniel Borkmann
This is a follow-up to 3c5b4d69c358 ("net: annotate data-races around sk->sk_mark"). sk->sk_mark can be read and written without holding the socket lock. IPv6 equivalent is already covered with READ_ONCE() annotation in tcp_v6_send_response(). Fixes: 3c5b4d69c358 ("net: annotate data-races around sk->sk_mark") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/f459d1fc44f205e13f6d8bdca2c8bfb9902ffac9.1736244569.git.daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08netdev: prevent accessing NAPI instances from another namespaceJakub Kicinski
The NAPI IDs were not fully exposed to user space prior to the netlink API, so they were never namespaced. The netlink API must ensure that at the very least NAPI instance belongs to the same netns as the owner of the genl sock. napi_by_id() can become static now, but it needs to move because of dev_get_by_napi_id(). Cc: stable@vger.kernel.org Fixes: 1287c1ae0fc2 ("netdev-genl: Support setting per-NAPI config values") Fixes: 27f91aaf49b3 ("netdev-genl: Add netlink framework functions for napi") Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250106180137.1861472-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-08Merge tag 'for-6.13/dm-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mikulas Patocka: - dm-array fixes - dm-verity forward error correction fixes - remove the flag DM_TARGET_PASSES_INTEGRITY from dm-ebs - dm-thin RCU list fix * tag 'for-6.13/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm thin: make get_first_thin use rcu-safe list first function dm-ebs: don't set the flag DM_TARGET_PASSES_INTEGRITY dm-verity FEC: Avoid copying RS parity bytes twice. dm-verity FEC: Fix RS FEC repair for roots unaligned to block size (take 2) dm array: fix cursor index when skipping across block boundaries dm array: fix unreleased btree blocks on closing a faulty array cursor dm array: fix releasing a faulty array block twice in dm_array_cursor_end
2025-01-08bus: mhi: host: pci_generic: Enable MSI-X if the endpoint supportsVivek Pernamitta
Enable MSI-X if the endpoint supports. Signed-off-by: Vivek Pernamitta <quic_vpernami@quicinc.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20250108-msix-v2-1-dc4466922350@quicinc.com [mani: added pci_generic prefix to subject] Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2025-01-08sched_ext: switch class when preempted by higher priority schedulerHonglei Wang
ops.cpu_release() function, if defined, must be invoked when preempted by a higher priority scheduler class task. This scenario was skipped in commit f422316d7466 ("sched_ext: Remove switch_class_scx()"). Let's fix it. Fixes: f422316d7466 ("sched_ext: Remove switch_class_scx()") Signed-off-by: Honglei Wang <jameshongleiwang@126.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-08sched_ext: Replace rq_lock() to raw_spin_rq_lock() in scx_ops_bypass()Changwoo Min
scx_ops_bypass() iterates all CPUs to re-enqueue all the scx tasks. For each CPU, it acquires a lock using rq_lock() regardless of whether a CPU is offline or the CPU is currently running a task in a higher scheduler class (e.g., deadline). The rq_lock() is supposed to be used for online CPUs, and the use of rq_lock() may trigger an unnecessary warning in rq_pin_lock(). Therefore, replace rq_lock() to raw_spin_rq_lock() in scx_ops_bypass(). Without this change, we observe the following warning: ===== START ===== [ 6.615205] rq->balance_callback && rq->balance_callback != &balance_push_callback [ 6.615208] WARNING: CPU: 2 PID: 0 at kernel/sched/sched.h:1730 __schedule+0x1130/0x1c90 ===== END ===== Fixes: 0e7ffff1b811 ("scx: Fix raciness in scx_ops_bypass()") Signed-off-by: Changwoo Min <changwoo@igalia.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-08sched_ext: keep running prev when prev->scx.slice != 0Henry Huang
When %SCX_OPS_ENQ_LAST is set and prev->scx.slice != 0, @prev will be dispacthed into the local DSQ in put_prev_task_scx(). However, pick_task_scx() is executed before put_prev_task_scx(), so it will not pick @prev. Set %SCX_RQ_BAL_KEEP in balance_one() to ensure that pick_task_scx() can pick @prev. Signed-off-by: Henry Huang <henry.hj@antgroup.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-08Bluetooth: btmtk: Fix failed to send func ctrl for MediaTek devices.Chris Lu
Use usb_autopm_get_interface() and usb_autopm_put_interface() in btmtk_usb_shutdown(), it could send func ctrl after enabling autosuspend. Bluetooth: btmtk_usb_hci_wmt_sync() hci0: Execution of wmt command timed out Bluetooth: btmtk_usb_shutdown() hci0: Failed to send wmt func ctrl (-110) Fixes: 5c5e8c52e3ca ("Bluetooth: btmtk: move btusb_mtk_[setup, shutdown] to btmtk.c") Signed-off-by: Chris Lu <chris.lu@mediatek.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-01-08Bluetooth: btnxpuart: Fix driver sending truncated dataNeeraj Sanjay Kale
This fixes the apparent controller hang issue seen during stress test where the host sends a truncated payload, followed by HCI commands. The controller treats these HCI commands as a part of previously truncated payload, leading to command timeouts. Adding a serdev_device_wait_until_sent() call after serdev_device_write_buf() fixed the issue. Fixes: 689ca16e5232 ("Bluetooth: NXP: Add protocol support for NXP Bluetooth chipsets") Signed-off-by: Neeraj Sanjay Kale <neeraj.sanjaykale@nxp.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-01-08Bluetooth: MGMT: Fix Add Device to responding before completingLuiz Augusto von Dentz
Add Device with LE type requires updating resolving/accept list which requires quite a number of commands to complete and each of them may fail, so instead of pretending it would always work this checks the return of hci_update_passive_scan_sync which indicates if everything worked as intended. Fixes: e8907f76544f ("Bluetooth: hci_sync: Make use of hci_cmd_sync_queue set 3") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-01-08Bluetooth: hci_sync: Fix not setting Random Address when requiredLuiz Augusto von Dentz
This fixes errors such as the following when Own address type is set to Random Address but it has not been programmed yet due to either be advertising or connecting: < HCI Command: LE Set Exte.. (0x08|0x0041) plen 13 Own address type: Random (0x03) Filter policy: Ignore not in accept list (0x01) PHYs: 0x05 Entry 0: LE 1M Type: Passive (0x00) Interval: 60.000 msec (0x0060) Window: 30.000 msec (0x0030) Entry 1: LE Coded Type: Passive (0x00) Interval: 180.000 msec (0x0120) Window: 90.000 msec (0x0090) > HCI Event: Command Complete (0x0e) plen 4 LE Set Extended Scan Parameters (0x08|0x0041) ncmd 1 Status: Success (0x00) < HCI Command: LE Set Exten.. (0x08|0x0042) plen 6 Extended scan: Enabled (0x01) Filter duplicates: Enabled (0x01) Duration: 0 msec (0x0000) Period: 0.00 sec (0x0000) > HCI Event: Command Complete (0x0e) plen 4 LE Set Extended Scan Enable (0x08|0x0042) ncmd 1 Status: Invalid HCI Command Parameters (0x12) Fixes: c45074d68a9b ("Bluetooth: Fix not generating RPA when required") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-01-08misc: microchip: pci1xxxx: Resolve return code mismatch during GPIO set configRengarajan S
Driver returns -EOPNOTSUPPORTED on unsupported parameters case in set config. Upper level driver checks for -ENOTSUPP. Because of the return code mismatch, the ioctls from userspace fail. Resolve the issue by passing -ENOTSUPP during unsupported case. Fixes: 7d3e4d807df2 ("misc: microchip: pci1xxxx: load gpio driver for the gpio controller auxiliary device enumerated by the auxiliary bus driver.") Cc: stable <stable@kernel.org> Signed-off-by: Rengarajan S <rengarajan.s@microchip.com> Link: https://lore.kernel.org/r/20241205133626.1483499-3-rengarajan.s@microchip.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08misc: microchip: pci1xxxx: Resolve kernel panic during GPIO IRQ handlingRengarajan S
Resolve kernel panic caused by improper handling of IRQs while accessing GPIO values. This is done by replacing generic_handle_irq with handle_nested_irq. Fixes: 1f4d8ae231f4 ("misc: microchip: pci1xxxx: Add gpio irq handler and irq helper functions irq_ack, irq_mask, irq_unmask and irq_set_type of irq_chip.") Cc: stable <stable@kernel.org> Signed-off-by: Rengarajan S <rengarajan.s@microchip.com> Link: https://lore.kernel.org/r/20241205133626.1483499-2-rengarajan.s@microchip.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08misc: microchip: pci1xxxx: Add push-pull drive support for GPIORengarajan S
Add support to configure GPIO pins for push-pull drive mode. Signed-off-by: Rengarajan S <rengarajan.s@microchip.com> Link: https://lore.kernel.org/r/20241205134956.1493091-1-rengarajan.s@microchip.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08scripts/spdxcheck: Handle license identifiers in Jinja commentsLukas Bulwahn
Commit 4b132aacb076 ("tools: Add xdrgen") adds a tool, which uses Jinja template files, i.e., files with the j2 file extension, for its lightweight code generation. These template files for this tool have proper headers with the SPDX License information, which are included as Jinja comments by enclosing the text with '{#' and '#}'. Sofar, the spdxcheck script does not support to properly parse this license information in Jinja comments and it reports back with 'Invalid token: #}'. Parse Jinja comments properly by stripping the known Jinja comment suffix. Signed-off-by: Lukas Bulwahn <lukas.bulwahn@redhat.com> Link: https://lore.kernel.org/r/20250108125207.57486-1-lukas.bulwahn@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08scripts/spdxcheck: Parse j2 comments correctlyThomas Gleixner
j2 files use '#}' as comment closure, which trips up the SPDX parser: tools/.../definition.j2: 1:36 Invalid token: #} Handle those comments correctly by removing the closure. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/878qt2xr46.ffs@tglx Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08dm thin: make get_first_thin use rcu-safe list first functionKrister Johansen
The documentation in rculist.h explains the absence of list_empty_rcu() and cautions programmers against relying on a list_empty() -> list_first() sequence in RCU safe code. This is because each of these functions performs its own READ_ONCE() of the list head. This can lead to a situation where the list_empty() sees a valid list entry, but the subsequent list_first() sees a different view of list head state after a modification. In the case of dm-thin, this author had a production box crash from a GP fault in the process_deferred_bios path. This function saw a valid list head in get_first_thin() but when it subsequently dereferenced that and turned it into a thin_c, it got the inside of the struct pool, since the list was now empty and referring to itself. The kernel on which this occurred printed both a warning about a refcount_t being saturated, and a UBSAN error for an out-of-bounds cpuid access in the queued spinlock, prior to the fault itself. When the resulting kdump was examined, it was possible to see another thread patiently waiting in thin_dtr's synchronize_rcu. The thin_dtr call managed to pull the thin_c out of the active thins list (and have it be the last entry in the active_thins list) at just the wrong moment which lead to this crash. Fortunately, the fix here is straight forward. Switch get_first_thin() function to use list_first_or_null_rcu() which performs just a single READ_ONCE() and returns NULL if the list is already empty. This was run against the devicemapper test suite's thin-provisioning suites for delete and suspend and no regressions were observed. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Fixes: b10ebd34ccca ("dm thin: fix rcu_read_lock being held in code that can sleep") Cc: stable@vger.kernel.org Acked-by: Ming-Hung Tsai <mtsai@redhat.com> Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
2025-01-08dm-ebs: don't set the flag DM_TARGET_PASSES_INTEGRITYMikulas Patocka
dm-ebs uses dm-bufio to process requests that are not aligned on logical sector size. dm-bufio doesn't support passing integrity data (and it is unclear how should it do it), so we shouldn't set the DM_TARGET_PASSES_INTEGRITY flag. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Cc: stable@vger.kernel.org Fixes: d3c7b35c20d6 ("dm: add emulated block size target")
2025-01-08xfs: don't return an error from xfs_update_last_rtgroup_size for !XFS_RTChristoph Hellwig
Non-rtg file systems have a fake RT group even if they do not have a RT device, and thus an rgcount of 1. Ensure xfs_update_last_rtgroup_size doesn't fail when called for !XFS_RT to handle this case. Fixes: 87fe4c34a383 ("xfs: create incore realtime group structures") Reported-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-01-08ntsync: No longer depend on BROKEN.Elizabeth Figura
f5b335dc025cfee90957efa90dc72fada0d5abb4 ("misc: ntsync: mark driver as "broken" to prevent from building") was committed to avoid the driver being used while only part of its functionality was released. Since the rest of the functionality has now been committed, revert this. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20241213193511.457338-31-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08docs: ntsync: Add documentation for the ntsync uAPI.Elizabeth Figura
Add an overall explanation of the driver architecture, and complete and precise specification for its intended behaviour. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-30-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08maintainers: Add an entry for ntsync.Elizabeth Figura
Add myself as maintainer, supported by CodeWeavers. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20241213193511.457338-29-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08selftests: ntsync: Add a stress test for contended waits.Elizabeth Figura
Test a more realistic usage pattern, and one with heavy contention, in order to actually exercise ntsync's internal synchronization. This test has several threads in a tight loop acquiring a mutex, modifying some shared data, and then releasing the mutex. At the end we check if the data is consistent. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-28-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08selftests: ntsync: Add some tests for wakeup signaling via alerts.Elizabeth Figura
Expand the alert tests to cover alerting a thread mid-wait, to test that the relevant scheduling logic works correctly. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-27-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08selftests: ntsync: Add tests for alertable waits.Elizabeth Figura
Test the "alert" functionality of NTSYNC_IOC_WAIT_ALL and NTSYNC_IOC_WAIT_ANY, when a wait is woken with an alert and when it is woken by an object. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-26-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08selftests: ntsync: Add some tests for wakeup signaling with events.Elizabeth Figura
Expand the contended wait tests, which previously only covered events and semaphores, to cover events as well. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-25-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08selftests: ntsync: Add some tests for auto-reset event state.Elizabeth Figura
Test event-specific ioctls NTSYNC_IOC_EVENT_SET, NTSYNC_IOC_EVENT_RESET, NTSYNC_IOC_EVENT_PULSE, NTSYNC_IOC_EVENT_READ for auto-reset events, and waiting on auto-reset events. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-24-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08selftests: ntsync: Add some tests for manual-reset event state.Elizabeth Figura
Test event-specific ioctls NTSYNC_IOC_EVENT_SET, NTSYNC_IOC_EVENT_RESET, NTSYNC_IOC_EVENT_PULSE, NTSYNC_IOC_EVENT_READ for manual-reset events, and waiting on manual-reset events. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-23-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08selftests: ntsync: Add some tests for wakeup signaling with ↵Elizabeth Figura
WINESYNC_IOC_WAIT_ALL. Test contended "wait-for-all" waits, to make sure that scheduling and wakeup logic works correctly, and that the wait only exits once objects are all simultaneously signaled. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-22-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08selftests: ntsync: Add some tests for wakeup signaling with ↵Elizabeth Figura
WINESYNC_IOC_WAIT_ANY. Test contended "wait-for-any" waits, to make sure that scheduling and wakeup logic works correctly. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-21-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08selftests: ntsync: Add some tests for NTSYNC_IOC_WAIT_ALL.Elizabeth Figura
Test basic synchronous functionality of NTSYNC_IOC_WAIT_ALL, and when objects are considered simultaneously signaled. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-20-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08selftests: ntsync: Add some tests for NTSYNC_IOC_WAIT_ANY.Elizabeth Figura
Test basic synchronous functionality of NTSYNC_IOC_WAIT_ANY, when objects are considered signaled or not signaled, and how they are affected by a successful wait. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-19-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08selftests: ntsync: Add some tests for mutex state.Elizabeth Figura
Test mutex-specific ioctls NTSYNC_IOC_MUTEX_UNLOCK and NTSYNC_IOC_MUTEX_READ, and waiting on mutexes. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-18-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-08selftests: ntsync: Add some tests for semaphore state.Elizabeth Figura
Wine has tests for its synchronization primitives, but these are more accessible to kernel developers, and also allow us to test some edge cases that Wine does not care about. This patch adds tests for semaphore-specific ioctls NTSYNC_IOC_SEM_POST and NTSYNC_IOC_SEM_READ, and waiting on semaphores. Signed-off-by: Elizabeth Figura <zfigura@codeweavers.com> Link: https://lore.kernel.org/r/20241213193511.457338-17-zfigura@codeweavers.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>