summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-03-04fs: support getname_maybe_null() in move_mount()Christian Brauner
Allow move_mount() to work with NULL path arguments. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-8-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-04selftests: create detached mounts from detached mountsChristian Brauner
Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-7-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-04fs: create detached mounts from detached mountsChristian Brauner
Add the ability to create detached mounts from detached mounts. Currently, detached mounts can only be created from attached mounts. This limitaton prevents various use-cases. For example, the ability to mount a subdirectory without ever having to make the whole filesystem visible first. The current permission model for the OPEN_TREE_CLONE flag of the open_tree() system call is: (1) Check that the caller is privileged over the owning user namespace of it's current mount namespace. (2) Check that the caller is located in the mount namespace of the mount it wants to create a detached copy of. While it is not strictly necessary to do it this way it is consistently applied in the new mount api. This model will also be used when allowing the creation of detached mount from another detached mount. The (1) requirement can simply be met by performing the same check as for the non-detached case, i.e., verify that the caller is privileged over its current mount namespace. To meet the (2) requirement it must be possible to infer the origin mount namespace that the anonymous mount namespace of the detached mount was created from. The origin mount namespace of an anonymous mount is the mount namespace that the mounts that were copied into the anonymous mount namespace originate from. The origin mount namespace of the anonymous mount namespace must be the same as the caller's mount namespace. To establish this the sequence number of the caller's mount namespace and the origin sequence number of the anonymous mount namespace are compared. The caller is always located in a non-anonymous mount namespace since anonymous mount namespaces cannot be setns()ed into. The caller's mount namespace will thus always have a valid sequence number. The owning namespace of any mount namespace, anonymous or non-anonymous, can never change. A mount attached to a non-anonymous mount namespace can never change mount namespace. If the sequence number of the non-anonymous mount namespace and the origin sequence number of the anonymous mount namespace match, the owning namespaces must match as well. Hence, the capability check on the owning namespace of the caller's mount namespace ensures that the caller has the ability to copy the mount tree. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-6-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-04fs: add may_copy_tree()Christian Brauner
Add a helper that verifies whether a caller may copy a given mount tree. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-5-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-04fs: add fastpath for dissolve_on_fput()Christian Brauner
Instead of acquiring the namespace semaphore and the mount lock everytime we close a file with FMODE_NEED_UNMOUNT set add a fastpath that checks whether we need to at all. Most of the time the caller will have attached the mount to the filesystem hierarchy and there's nothing to do. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-4-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-04fs: add assert for move_mount()Christian Brauner
After we've attached a detached mount tree the anonymous mount namespace must be empty. Add an assert and make this assumption explicit. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-3-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-04fs: add mnt_ns_empty() helperChristian Brauner
Add a helper that checks whether a give mount namespace is empty instead of open-coding the specific data structure check. This also be will be used in follow-up patches. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-2-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-04fs: record sequence number of origin mount namespaceChristian Brauner
Store the sequence number of the mount namespace the anonymous mount namespace has been created from. This information will be used in follow-up patches. Link: https://lore.kernel.org/r/20250221-brauner-open_tree-v1-1-dbcfcb98c676@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-04slab: Mark large folios for debugging purposesMatthew Wilcox (Oracle)
If a user calls p = kmalloc(1024); kfree(p); kfree(p); and 'p' was the only object in the slab, we may free the slab after the first call to kfree(). If we do, we clear PGTY_slab and the second call to kfree() will call free_large_kmalloc(). That will leave a trace in the logs ("object pointer: 0x%p"), but otherwise proceed to free the memory, which is likely to corrupt the page allocator's metadata. Allocate a new page type for large kmalloc and mark the memory with it while it's allocated. That lets us detect this double-free and return without harming any data structures. Reported-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-03-04kunit, slub: Add test_kfree_rcu_wq_destroy use caseUladzislau Rezki (Sony)
Add a test_kfree_rcu_wq_destroy test to verify a kmem_cache_destroy() from a workqueue context. The problem is that, before destroying any cache the kvfree_rcu_barrier() is invoked to guarantee that in-flight freed objects are flushed. The _barrier() function queues and flushes its own internal workers which might conflict with a workqueue type a kmem-cache gets destroyed from. One example is when a WQ_MEM_RECLAIM workqueue is flushing !WQ_MEM_RECLAIM events which leads to a kernel splat. See the check_flush_dependency() in the workqueue.c file. If this test does not emits any kernel warning, it is passed. Reviewed-by: Keith Busch <kbusch@kernel.org> Co-developed-by: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-03-04mm, slab: cleanup slab_bug() parametersVlastimil Babka
slab_err() has variadic printf arguments but instead of passing them to slab_bug() it does vsnprintf() to a buffer and passes %s, buf. To allow passing them directly, turn slab_bug() to __slab_bug() with a va_list parameter, and slab_bug() a wrapper with fmt, ... parameters. Then slab_err() can call __slab_bug() without the intermediate buffer. Also constify fmt everywhere, which also simplifies object_err()'s call to slab_bug(). Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
2025-03-04mm: slub: call WARN() when detecting a slab corruptionHyesoo Yu
If a slab object is corrupted or an error occurs in its internal validation, continuing after restoration may cause other side effects. At this point, it is difficult to debug because the problem occurred in the past. It is useful to use WARN() to catch errors at the point of issue because WARN() could trigger panic for system debugging when panic_on_warn is enabled. WARN() is added where to detect the error on slab_err and object_err. It makes sense to only do the WARN() after printing the logs. slab_err is splited to __slab_err that calls the WARN() and it is called after printing logs. Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-03-04mm: slub: Print the broken data before restoring themHyesoo Yu
Previously, the restore occurred after printing the object in slub. After commit 47d911b02cbe ("slab: make check_object() more consistent"), the bytes are printed after the restore. This information about the bytes before the restore is highly valuable for debugging purpose. For instance, in a event of cache issue, it displays byte patterns by breaking them down into 64-bytes units. Without this information, we can only speculate on how it was broken. Hence the corrupted regions should be printed prior to the restoration process. However if an object breaks in multiple places, the same log may be output multiple times. Therefore the slub log is reported only once to prevent redundant printing, by sending a parameter indicating whether an error has occurred previously. Signed-off-by: Hyesoo Yu <hyesoo.yu@samsung.com> Reviewed-by: Harry Yoo <harry.yoo@oracle.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-03-04slab: Achieve better kmalloc caches randomization in kvmallocGONG Ruiqi
As revealed by this writeup[1], due to the fact that __kmalloc_node (now renamed to __kmalloc_node_noprof) is an exported symbol and will never get inlined, using it in kvmalloc_node (now is __kvmalloc_node_noprof) would make the RET_IP inside always point to the same address: upper_caller kvmalloc kvmalloc_node kvmalloc_node_noprof __kvmalloc_node_noprof <-- all macros all the way down here __kmalloc_node_noprof __do_kmalloc_node(.., _RET_IP_) ... <-- _RET_IP_ points to That literally means all kmalloc invoked via kvmalloc would use the same seed for cache randomization (CONFIG_RANDOM_KMALLOC_CACHES), which makes this hardening non-functional. The root cause of this problem, IMHO, is that using RET_IP only cannot identify the actual allocation site in case of kmalloc being called inside non-inlined wrappers or helper functions. And I believe there could be similar cases in other functions. Nevertheless, I haven't thought of any good solution for this. So for now let's solve this specific case first. For __kvmalloc_node_noprof, replace __kmalloc_node_noprof and call __do_kmalloc_node directly instead, so that RET_IP can take the return address of kvmalloc and differentiate each kvmalloc invocation: upper_caller kvmalloc kvmalloc_node kvmalloc_node_noprof __kvmalloc_node_noprof <-- all macros all the way down here __do_kmalloc_node(.., _RET_IP_) ... <-- _RET_IP_ points to Thanks to Tamás Koczka for the report and discussion! Link: https://github.com/google/security-research/blob/908d59b573960dc0b90adda6f16f7017aca08609/pocs/linux/kernelctf/CVE-2024-27397_mitigation/docs/exploit.md?plain=1#L259 [1] Reported-by: Tamás Koczka <poprdi@google.com> Signed-off-by: GONG Ruiqi <gongruiqi1@huawei.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-03-04slab: Adjust placement of __kvmalloc_node_noprofGONG Ruiqi
Move __kvmalloc_node_noprof (as well as kvfree*, kvrealloc_noprof and kmalloc_gfp_adjust for consistency) into mm/slub.c so that it can directly invoke __do_kmalloc_node, which is needed for the next patch. No functional changes intended. Signed-off-by: GONG Ruiqi <gongruiqi1@huawei.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-03-04mm/slab: simplify SLAB_* flag handlingKevin Brodsky
SLUB is the only remaining allocator. We can therefore get rid of the logic for allocator-specific flags: * Merge SLAB_CACHE_FLAGS into SLAB_CORE_FLAGS. * Remove CACHE_CREATE_MASK and instead mask out SLAB_DEBUG_FLAGS if !CONFIG_SLUB_DEBUG. SLAB_DEBUG_FLAGS is now defined unconditionally (no impact on existing code, which ignores it if !CONFIG_SLUB_DEBUG). * Define SLAB_FLAGS_PERMITTED in terms of SLAB_CORE_FLAGS and SLAB_DEBUG_FLAGS (no functional change). While at it also remove misleading comments that suggest that multiple allocators are available. Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-03-04mm/slab/kvfree_rcu: Switch to WQ_MEM_RECLAIM wqUladzislau Rezki (Sony)
Currently kvfree_rcu() APIs use a system workqueue which is "system_unbound_wq" to driver RCU machinery to reclaim a memory. Recently, it has been noted that the following kernel warning can be observed: <snip> workqueue: WQ_MEM_RECLAIM nvme-wq:nvme_scan_work is flushing !WQ_MEM_RECLAIM events_unbound:kfree_rcu_work WARNING: CPU: 21 PID: 330 at kernel/workqueue.c:3719 check_flush_dependency+0x112/0x120 Modules linked in: intel_uncore_frequency(E) intel_uncore_frequency_common(E) skx_edac(E) ... CPU: 21 UID: 0 PID: 330 Comm: kworker/u144:6 Tainted: G E 6.13.2-0_g925d379822da #1 Hardware name: Wiwynn Twin Lakes MP/Twin Lakes Passive MP, BIOS YMM20 02/01/2023 Workqueue: nvme-wq nvme_scan_work RIP: 0010:check_flush_dependency+0x112/0x120 Code: 05 9a 40 14 02 01 48 81 c6 c0 00 00 00 48 8b 50 18 48 81 c7 c0 00 00 00 48 89 f9 48 ... RSP: 0018:ffffc90000df7bd8 EFLAGS: 00010082 RAX: 000000000000006a RBX: ffffffff81622390 RCX: 0000000000000027 RDX: 00000000fffeffff RSI: 000000000057ffa8 RDI: ffff88907f960c88 RBP: 0000000000000000 R08: ffffffff83068e50 R09: 000000000002fffd R10: 0000000000000004 R11: 0000000000000000 R12: ffff8881001a4400 R13: 0000000000000000 R14: ffff88907f420fb8 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff88907f940000(0000) knlGS:0000000000000000 CR2: 00007f60c3001000 CR3: 000000107d010005 CR4: 00000000007726f0 PKRU: 55555554 Call Trace: <TASK> ? __warn+0xa4/0x140 ? check_flush_dependency+0x112/0x120 ? report_bug+0xe1/0x140 ? check_flush_dependency+0x112/0x120 ? handle_bug+0x5e/0x90 ? exc_invalid_op+0x16/0x40 ? asm_exc_invalid_op+0x16/0x20 ? timer_recalc_next_expiry+0x190/0x190 ? check_flush_dependency+0x112/0x120 ? check_flush_dependency+0x112/0x120 __flush_work.llvm.1643880146586177030+0x174/0x2c0 flush_rcu_work+0x28/0x30 kvfree_rcu_barrier+0x12f/0x160 kmem_cache_destroy+0x18/0x120 bioset_exit+0x10c/0x150 disk_release.llvm.6740012984264378178+0x61/0xd0 device_release+0x4f/0x90 kobject_put+0x95/0x180 nvme_put_ns+0x23/0xc0 nvme_remove_invalid_namespaces+0xb3/0xd0 nvme_scan_work+0x342/0x490 process_scheduled_works+0x1a2/0x370 worker_thread+0x2ff/0x390 ? pwq_release_workfn+0x1e0/0x1e0 kthread+0xb1/0xe0 ? __kthread_parkme+0x70/0x70 ret_from_fork+0x30/0x40 ? __kthread_parkme+0x70/0x70 ret_from_fork_asm+0x11/0x20 </TASK> ---[ end trace 0000000000000000 ]--- <snip> To address this switch to use of independent WQ_MEM_RECLAIM workqueue, so the rules are not violated from workqueue framework point of view. Apart of that, since kvfree_rcu() does reclaim memory it is worth to go with WQ_MEM_RECLAIM type of wq because it is designed for this purpose. Fixes: 6c6c47b063b5 ("mm, slab: call kvfree_rcu_barrier() from kmem_cache_destroy()"), Reported-by: Keith Busch <kbusch@kernel.org> Closes: https://lore.kernel.org/all/Z7iqJtCjHKfo8Kho@kbusch-mbp/ Cc: stable@vger.kernel.org Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2025-03-04usb: dwc3: Set SUSPENDENABLE soon after phy initThinh Nguyen
After phy initialization, some phy operations can only be executed while in lower P states. Ensure GUSB3PIPECTL.SUSPENDENABLE and GUSB2PHYCFG.SUSPHY are set soon after initialization to avoid blocking phy ops. Previously the SUSPENDENABLE bits are only set after the controller initialization, which may not happen right away if there's no gadget driver or xhci driver bound. Revise this to clear SUSPENDENABLE bits only when there's mode switching (change in GCTL.PRTCAPDIR). Fixes: 6d735722063a ("usb: dwc3: core: Prevent phy suspend during init") Cc: stable <stable@kernel.org> Signed-off-by: Thinh Nguyen <Thinh.Nguyen@synopsys.com> Link: https://lore.kernel.org/r/633aef0afee7d56d2316f7cc3e1b2a6d518a8cc9.1738280911.git.Thinh.Nguyen@synopsys.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-03-03arm64: dts: rockchip: Add avdd HDMI supplies to RockPro64 board dtsiDragan Simic
Add missing "avdd-0v9-supply" and "avdd-1v8-supply" properties to the "hdmi" node in the Pine64 RockPro64 board dtsi file. To achieve this, also add the associated "vcca_0v9" regulator that produces the 0.9 V supply, [1][2] which hasn't been defined previously in the board dtsi file. This also eliminates the following warnings from the kernel log: dwhdmi-rockchip ff940000.hdmi: supply avdd-0v9 not found, using dummy regulator dwhdmi-rockchip ff940000.hdmi: supply avdd-1v8 not found, using dummy regulator There are no functional changes to the way board works with these additions, because the "vcc1v8_dvp" and "vcca_0v9" regulators are always enabled, [1][2] but these additions improve the accuracy of hardware description. These changes apply to the both supported hardware revisions of the Pine64 RockPro64, i.e. to the production-run revisions 2.0 and 2.1. [1][2] [1] https://files.pine64.org/doc/rockpro64/rockpro64_v21-SCH.pdf [2] https://files.pine64.org/doc/rockpro64/rockpro64_v20-SCH.pdf Fixes: e4f3fb490967 ("arm64: dts: rockchip: add initial dts support for Rockpro64") Cc: stable@vger.kernel.org Suggested-by: Diederik de Haas <didi.debian@cknow.org> Signed-off-by: Dragan Simic <dsimic@manjaro.org> Tested-by: Diederik de Haas <didi.debian@cknow.org> Link: https://lore.kernel.org/r/df3d7e8fe74ed5e727e085b18c395260537bb5ac.1740941097.git.dsimic@manjaro.org Signed-off-by: Heiko Stuebner <heiko@sntech.de>
2025-03-03genirq/msi: Expose MSI message data in debugfsHans Zhang
When debugging MSI-related hardware issues (e.g. interrupt delivery failures), developers currently need to either: 1. Recompile the kernel with dynamic debug for tracing msi_desc. 2. Manually read device registers through low-level tools. Both approaches become challenging in production environments where dynamic debugging is often disabled. The interrupt core provides a debugfs interface for inspection of interrupt related data, which contains the per interrupt information in the view of the hierarchical interrupt domains. Though this interface does not expose the MSI address/data pair, which is important information to: - Verify whether the MSI configuration matches the hardware expectations - Diagnose interrupt routing errors (e.g., mismatched destination ID) - Validate remapping behavior in virtualized environments Implement the debug_show() callback for the generic MSI interrupt domains, and use it to expose the MSI address/data pair in the per interrupt diagnostics. Sample output: address_hi: 0x00000000 address_lo: 0xfe670040 msg_data: 0x00000001 [ tglx: Massaged change log. Use irq_data_get_msi_desc() to avoid pointless lookup. ] Signed-off-by: Hans Zhang <18255117159@163.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250303121008.309265-1-18255117159@163.com
2025-03-03Merge tag 'v6.14-rc5' into x86/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-03-03x86/smp/32: Remove safe_smp_processor_id()Brian Gerst
The safe_smp_processor_id() function was originally implemented in: dc2bc768a009 ("stack overflow safe kdump: safe_smp_processor_id()") to mitigate the CPU number corruption on a stack overflow. At the time, x86-32 stored the CPU number in thread_struct, which was located at the bottom of the task stack and thus vulnerable to an overflow. The CPU number is now located in percpu memory, so this workaround is no longer needed. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20250303170115.2176553-1-brgerst@gmail.com
2025-03-03x86/asm: Merge KSTK_ESP() implementationsBrian Gerst
Commit: 263042e4630a ("Save user RSP in pt_regs->sp on SYSCALL64 fastpath") simplified the 64-bit implementation of KSTK_ESP() which is now identical to 32-bit. Merge them into a common definition. No functional change. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250303183111.2245129-1-brgerst@gmail.com
2025-03-03RDMA/bnxt_re: Fix reporting maximum SRQs on P7 chipsPreethi G
Firmware reports support for additional SRQs in the max_srq_ext field. In CREQ_QUERY_FUNC response, if MAX_SRQ_EXTENDED flag is set, driver should derive the total number of max SRQs by the summation of "max_srq" and "max_srq_ext" fields. Fixes: b1b66ae094cd ("bnxt_en: Use FW defined resource limits for RoCE") Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Preethi G <preethi.gurusiddalingeswaraswamy@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1741021178-2569-4-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-03-03RDMA/bnxt_re: Add missing paranthesis in map_qp_id_to_tbl_indxKashyap Desai
The modulo operation returns wrong result without the paranthesis and that resulted in wrong QP table indexing. Fixes: 84cf229f4001 ("RDMA/bnxt_re: Fix the qp table indexing") Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1741021178-2569-3-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-03-03RDMA/bnxt_re: Fix allocation of QP tableKashyap Desai
Driver is creating QP table too early while probing before querying firmware capabilities. Driver currently is using a hard coded values of 64K as size while creating QP table. This resulted in a crash when firmwre supported QP count is more than 64K. To fix the issue, move the QP tabel creation after the firmware capabilities are queried. Use the firmware returned maximum value of QPs while creating the QP table. Fixes: b1b66ae094cd ("bnxt_en: Use FW defined resource limits for RoCE") Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Link: https://patch.msgid.link/1741021178-2569-2-git-send-email-selvin.xavier@broadcom.com Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-03-03RDMA/rxe: Fix the failure of ibv_query_device() and ibv_query_device_ex() testsZhu Yanjun
In rdma-core, the following failures appear. " $ ./build/bin/run_tests.py -k device ssssssss....FF........s ====================================================================== FAIL: test_query_device (tests.test_device.DeviceTest.test_query_device) Test ibv_query_device() ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/ubuntu/rdma-core/tests/test_device.py", line 63, in test_query_device self.verify_device_attr(attr, dev) File "/home/ubuntu/rdma-core/tests/test_device.py", line 200, in verify_device_attr assert attr.sys_image_guid != 0 ^^^^^^^^^^^^^^^^^^^^^^^^ AssertionError ====================================================================== FAIL: test_query_device_ex (tests.test_device.DeviceTest.test_query_device_ex) Test ibv_query_device_ex() ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/ubuntu/rdma-core/tests/test_device.py", line 222, in test_query_device_ex self.verify_device_attr(attr_ex.orig_attr, dev) File "/home/ubuntu/rdma-core/tests/test_device.py", line 200, in verify_device_attr assert attr.sys_image_guid != 0 ^^^^^^^^^^^^^^^^^^^^^^^^ AssertionError " The root cause is: before a net device is set with rxe, this net device is used to generate a sys_image_guid. Fixes: 2ac5415022d1 ("RDMA/rxe: Remove the direct link to net_device") Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Link: https://patch.msgid.link/20250302215444.3742072-1-yanjun.zhu@linux.dev Reviewed-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com> Tested-by: Daisuke Matsuda <matsuda-daisuke@fujitsu.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>
2025-03-03selftests: vDSO: vdso_standalone_test_x86: Switch to nolibcThomas Weißschuh
vdso_standalone_test_x86 provides its own ASM syscall wrappers and _start() implementation. The in-tree nolibc library already provides this functionality for multiple architectures. By making use of nolibc, the standalone testcase can be built from the exact same codebase as the non-standalone version. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-16-28e14e031ed8@linutronix.de
2025-03-03selftests: vDSO: vdso_test_gettimeofday: Make compatible with nolibcThomas Weißschuh
nolibc does not provide sys/time.h and sys/auxv.h, instead their definitions are available unconditionally. Guard the includes so they are not attempted on nolibc. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-15-28e14e031ed8@linutronix.de
2025-03-03selftests: vDSO: vdso_test_gettimeofday: Clean up includesThomas Weißschuh
Some unnecessary headers are included, remove them. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-14-28e14e031ed8@linutronix.de
2025-03-03selftests: vDSO: parse_vdso: Test __SIZEOF_LONG__ instead of ULONG_MAXThomas Weißschuh
According to limits.h(2) ULONG_MAX is only guaranteed to expand to an expression, not a symbolic constant which can be evaluated by the preprocessor. Specifically the definition of ULONG_MAX from nolibc can not be evaluated by the preprocessor. To provide compatibility with nolibc, check with __SIZEOF_LONG__ instead, with is provided directly by the preprocessor and therefore always a symbolic constant. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-13-28e14e031ed8@linutronix.de
2025-03-03selftests: vDSO: parse_vdso: Use UAPI headers instead of libc headersThomas Weißschuh
To allow the usage of parse_vdso.c together with a limited libc like nolibc, use the kernels own elf.h and auxvec.h headers. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-12-28e14e031ed8@linutronix.de
2025-03-03selftests: vDSO: parse_vdso: Drop vdso_init_from_auxv()Thomas Weißschuh
There are no users left. This also removes the usage of ElfXX_auxv_t, which is not formally standardized. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-11-28e14e031ed8@linutronix.de
2025-03-03selftests: vDSO: vdso_standalone_test_x86: Use vdso_init_form_sysinfo_ehdrThomas Weißschuh
vdso_standalone_test_x86 is the only user of vdso_init_from_auxv(). Instead of combining the parsing the aux vector with the parsing of the vDSO, split them apart into getauxval() and the regular vdso_init_from_sysinfo_ehdr(). The implementation of getauxval() is taken from tools/include/nolibc/stdlib.h. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-10-28e14e031ed8@linutronix.de
2025-03-03tools/nolibc: add limits.h shim headerThomas Weißschuh
limits.h is a widely used standard header. Missing it from nolibc requires adoption effort to port applications. Add a shim header which includes the global nolibc.h header. It makes all nolibc symbols available. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Willy Tarreau <w@1wt.eu> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Willy Tarreau <w@1wt.eu> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-9-28e14e031ed8@linutronix.de
2025-03-03selftests: Add headers targetThomas Weißschuh
Some selftests need access to a full UAPI headers tree, for example when building with nolibc which heavily relies on UAPI headers. A reference to such a tree is available in the KHDR_INCLUDES variable, but there is currently no way to populate such a tree automatically. Provide a target that the tests can depend on to get access to usable UAPI headers. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-8-28e14e031ed8@linutronix.de
2025-03-03tools/include: Add uapi/linux/elf.hThomas Weißschuh
It will be used by the vDSO selftests. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-7-28e14e031ed8@linutronix.de
2025-03-03elf, uapi: Add types ElfXX_Verdef and ElfXX_VerauxThomas Weißschuh
The types are used by tools/testing/selftests/vDSO/parse_vdso.c. To be able to build the vDSO selftests without a libc dependency, add the types to the kernels own UAPI headers. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://refspecs.linuxfoundation.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/symversion.html#VERDEFEXTS Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-6-28e14e031ed8@linutronix.de
2025-03-03elf, uapi: Add type ElfXX_VersymThomas Weißschuh
The type is used by tools/testing/selftests/vDSO/parse_vdso.c. To be able to build the vDSO selftests without a libc dependency, add the type to the kernels own UAPI headers. As documented by elf(5). Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-5-28e14e031ed8@linutronix.de
2025-03-03elf, uapi: Add definitions for VER_FLG_BASE and VER_FLG_WEAKThomas Weißschuh
The definitions are used by tools/testing/selftests/vDSO/parse_vdso.c. To be able to build the vDSO selftests without a libc dependency, add the definitions to the kernels own UAPI headers. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://docs.oracle.com/cd/E19683-01/816-1386/chapter6-80869/index.html Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-4-28e14e031ed8@linutronix.de
2025-03-03elf, uapi: Add definition for DT_GNU_HASHThomas Weißschuh
The definition is used by tools/testing/selftests/vDSO/parse_vdso.c. To be able to build the vDSO selftests without a libc dependency, add the define to the kernels own UAPI headers. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://refspecs.linuxbase.org/LSB_5.0.0/LSB-Core-generic/LSB-Core-generic/libc-ddefs.html Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-3-28e14e031ed8@linutronix.de
2025-03-03elf, uapi: Add definition for STN_UNDEFThomas Weißschuh
The definition is used by tools/testing/selftests/vDSO/parse_vdso.c. To be able to build the vDSO selftests without a libc dependency, add the definition to the kernels own UAPI headers. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://refspecs.linuxfoundation.org/elf/gabi4+/ch4.symtab.html Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-2-28e14e031ed8@linutronix.de
2025-03-03MAINTAINERS: Add vDSO selftestsThomas Weißschuh
These currently have no maintainer besides the default kselftest ones. Add the general vDSO maintainers, too. Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Acked-by: Shuah Khan <skhan@linuxfoundation.org> Link: https://lore.kernel.org/all/20250226-parse_vdso-nolibc-v2-1-28e14e031ed8@linutronix.de
2025-03-03sched_ext: Merge branch 'for-6.14-fixes' into for-6.15Tejun Heo
Pull for-6.14-fixes to receive: 9360dfe4cbd6 ("sched_ext: Validate prev_cpu in scx_bpf_select_cpu_dfl()") which conflicts with: 337d1b354a29 ("sched_ext: Move built-in idle CPU selection policy to a separate file") Signed-off-by: Tejun Heo <tj@kernel.org>
2025-03-03sched_ext: Validate prev_cpu in scx_bpf_select_cpu_dfl()Andrea Righi
If a BPF scheduler provides an invalid CPU (outside the nr_cpu_ids range) as prev_cpu to scx_bpf_select_cpu_dfl() it can cause a kernel crash. To prevent this, validate prev_cpu in scx_bpf_select_cpu_dfl() and trigger an scx error if an invalid CPU is specified. Fixes: f0e1a0643a59b ("sched_ext: Implement BPF extensible scheduler class") Cc: stable@vger.kernel.org # v6.12+ Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-03-03Merge tag 'affs-6.14-rc5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull affs fixes from David Sterba: "Two fixes from Simon Tatham. They're real bugfixes for problems with OFS floppy disks created on linux and then read in the emulated Workbench environment" * tag 'affs-6.14-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: affs: don't write overlarge OFS data block size fields affs: generate OFS sequence numbers starting at 1
2025-03-03Merge tag 'xfs-fixes-6.14-rc6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs cleanups from Carlos Maiolino: "Just a few cleanups" * tag 'xfs-fixes-6.14-rc6' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: remove the XBF_STALE check from xfs_buf_rele_cached xfs: remove most in-flight buffer accounting xfs: decouple buffer readahead from the normal buffer read path xfs: reduce context switches for synchronous buffered I/O
2025-03-03loadpin: remove MODULE_COMPRESS_NONE as it is no longer supportedArulpandiyan Vadivel
Updated the MODULE_COMPRESS_NONE with MODULE_COMPRESS as it was no longer available from kernel modules. As MODULE_COMPRESS and MODULE_DECOMPRESS depends on MODULES removing MODULES as well. Fixes: c7ff693fa209 ("module: Split modules_install compression and in-kernel decompression") Signed-off-by: Arulpandiyan Vadivel <arulpandiyan.vadivel@siemens.com> Link: https://lore.kernel.org/r/20250302103831.285381-1-arulpandiyan.vadivel@siemens.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-03-03lib/string_choices: Rearrange functions in sorted orderR Sundar
Rearrange misplaced functions in sorted order. Suggested-by: Andy Shevchenko <andy.shevchenko@gmail.com> Signed-off-by: R Sundar <prosunofficial@gmail.com> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Link: https://lore.kernel.org/r/20241119021719.7659-2-prosunofficial@gmail.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-03-03string.h: Validate memtostr*()/strtomem*() arguments more carefullyKees Cook
Since these functions handle moving between C strings and non-C strings, they should check for the appropriate presence/lack of the nonstring attribute on arguments. Signed-off-by: Kees Cook <kees@kernel.org>