summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-04-17libfs: Fix simple_offset_rename_exchange()Chuck Lever
User space expects the replacement (old) directory entry to have the same directory offset after the rename. Suggested-by: Christian Brauner <brauner@kernel.org> Fixes: a2e459555c5f ("shmem: stable directory offsets") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Link: https://lore.kernel.org/r/20240415152057.4605-2-cel@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-04-17s390/expoline: Make modules use kernel expolinesVasily Gorbik
Currently, kernel modules contain their own set of expoline thunks. In the case of EXPOLINE_EXTERN, this involves postlinking of precompiled expoline.o. expoline.o is also necessary for out-of-source tree module builds. Now that the kernel modules area is less than 4 GB away from kernel expoline thunks, make modules use kernel expolines. Also make EXPOLINE_EXTERN the default if the compiler supports it. This simplifies build and aligns with the approach adopted by other architectures. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/nospec: Correct modules thunk offset calculationVasily Gorbik
Fix offset calculation when branch target is more then 2Gb away. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/boot: Do not rescue .vmlinux.relocs sectionAlexander Gordeev
The .vmlinux.relocs section is moved in front of the compressed kernel. The interim section rescue step is avoided as result. Suggested-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/boot: Rework deployment of the kernel imageAlexander Gordeev
Rework deployment of kernel image for both compressed and uncompressed variants as defined by CONFIG_KERNEL_UNCOMPRESSED kernel configuration variable. In case CONFIG_KERNEL_UNCOMPRESSED is disabled avoid uncompressing the kernel to a temporary buffer and copying it to the target address. Instead, uncompress it directly to the target destination. In case CONFIG_KERNEL_UNCOMPRESSED is enabled avoid moving the kernel to default 0x100000 location when KASLR is disabled or failed. Instead, use the uncompressed kernel image directly. In case KASLR is disabled or failed .amode31 section location in memory is not randomized and precedes the kernel image. In case CONFIG_KERNEL_UNCOMPRESSED is disabled that location overlaps the area used by the decompression algorithm. That is fine, since that area is not used after the decompression finished and the size of .amode31 section is not expected to exceed BOOT_HEAP_SIZE ever. There is no decompression in case CONFIG_KERNEL_UNCOMPRESSED is enabled. Therefore, rename decompress_kernel() to deploy_kernel(), which better describes both uncompressed and compressed cases. Introduce AMODE31_SIZE macro to avoid immediate value of 0x3000 (the size of .amode31 section) in the decompressor linker script. Modify the vmlinux linker script to force the size of .amode31 section to AMODE31_SIZE (the value of (_eamode31 - _samode31) could otherwise differ as result of compiler options used). Introduce __START_KERNEL macro that defines the kernel ELF image entry point and set it to the currrent value of 0x100000. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390: Map kernel at fixed location when KASLR is disabledAlexander Gordeev
Since kernel virtual and physical address spaces are uncoupled the kernel is mapped at the top of the virtual address space in case KASLR is disabled. That does not pose any issue with regard to the kernel booting and operation, but makes it difficult to use a generated vmlinux with some debugging tools (e.g. gdb), because the exact location of the kernel image in virtual memory is unknown. Make that location known and introduce CONFIG_KERNEL_IMAGE_BASE configuration option. A custom CONFIG_KERNEL_IMAGE_BASE value that would break the virtual memory layout leads to a build error. The kernel image size is defined by KERNEL_IMAGE_SIZE macro and set to 512 MB, by analogy with x86. Suggested-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/mm: Uncouple physical vs virtual address spacesAlexander Gordeev
The uncoupling physical vs virtual address spaces brings the following benefits to s390: - virtual memory layout flexibility; - closes the address gap between kernel and modules, it caused s390-only problems in the past (e.g. 'perf' bugs); - allows getting rid of trampolines used for module calls into kernel; - allows simplifying BPF trampoline; - minor performance improvement in branch prediction; - kernel randomization entropy is magnitude bigger, as it is derived from the amount of available virtual, not physical memory; The whole change could be described in two pictures below: before and after the change. Some aspects of the virtual memory layout setup are not clarified (number of page levels, alignment, DMA memory), since these are not a part of this change or secondary with regard to how the uncoupling itself is implemented. The focus of the pictures is to explain why __va() and __pa() macros are implemented the way they are. Memory layout in V==R mode: | Physical | Virtual | +- 0 --------------+- 0 --------------+ identity mapping start | | S390_lowcore | Low-address memory | +- 8 KB -----------+ | | | | | identity | phys == virt | | mapping | virt == phys | | | +- AMODE31_START --+- AMODE31_START --+ .amode31 rand. phys/virt start |.amode31 text/data|.amode31 text/data| +- AMODE31_END ----+- AMODE31_END ----+ .amode31 rand. phys/virt start | | | | | | +- __kaslr_offset, __kaslr_offset_phys| kernel rand. phys/virt start | | | | kernel text/data | kernel text/data | phys == kvirt | | | +------------------+------------------+ kernel phys/virt end | | | | | | | | | | | | +- ident_map_size -+- ident_map_size -+ identity mapping end | | | ... unused gap | | | +---- vmemmap -----+ 'struct page' array start | | | virtually mapped | | memory map | | | +- __abs_lowcore --+ | | | Absolute Lowcore | | | +- __memcpy_real_area | | | Real Memory Copy| | | +- VMALLOC_START --+ vmalloc area start | | | vmalloc area | | | +- MODULES_VADDR --+ modules area start | | | modules area | | | +------------------+ UltraVisor Secure Storage limit | | | ... unused gap | | | +KASAN_SHADOW_START+ KASAN shadow memory start | | | KASAN shadow | | | +------------------+ ASCE limit Memory layout in V!=R mode: | Physical | Virtual | +- 0 --------------+- 0 --------------+ | | S390_lowcore | Low-address memory | +- 8 KB -----------+ | | | | | | | | ... unused gap | | | | +- AMODE31_START --+- AMODE31_START --+ .amode31 rand. phys/virt start |.amode31 text/data|.amode31 text/data| +- AMODE31_END ----+- AMODE31_END ----+ .amode31 rand. phys/virt end (<2GB) | | | | | | +- __kaslr_offset_phys | kernel rand. phys start | | | | kernel text/data | | | | | +------------------+ | kernel phys end | | | | | | | | | | | | +- ident_map_size -+ | | | | ... unused gap | | | +- __identity_base + identity mapping start (>= 2GB) | | | identity | phys == virt - __identity_base | mapping | virt == phys + __identity_base | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | +---- vmemmap -----+ 'struct page' array start | | | virtually mapped | | memory map | | | +- __abs_lowcore --+ | | | Absolute Lowcore | | | +- __memcpy_real_area | | | Real Memory Copy| | | +- VMALLOC_START --+ vmalloc area start | | | vmalloc area | | | +- MODULES_VADDR --+ modules area start | | | modules area | | | +- __kaslr_offset -+ kernel rand. virt start | | | kernel text/data | phys == (kvirt - __kaslr_offset) + | | __kaslr_offset_phys +- kernel .bss end + kernel rand. virt end | | | ... unused gap | | | +------------------+ UltraVisor Secure Storage limit | | | ... unused gap | | | +KASAN_SHADOW_START+ KASAN shadow memory start | | | KASAN shadow | | | +------------------+ ASCE limit Unused gaps in the virtual memory layout could be present or not - depending on how partucular system is configured. No page tables are created for the unused gaps. The relative order of vmalloc, modules and kernel image in virtual memory is defined by following considerations: - start of the modules area and end of the kernel should reside within 4GB to accommodate relative 32-bit jumps. The best way to achieve that is to place kernel next to modules; - vmalloc and module areas should locate next to each other to prevent failures and extra reworks in user level tools (makedumpfile, crash, etc.) which treat vmalloc and module addresses similarily; - kernel needs to be the last area in the virtual memory layout to easily distinguish between kernel and non-kernel virtual addresses. That is needed to (again) simplify handling of addresses in user level tools and make __pa() macro faster (see below); Concluding the above, the relative order of the considered virtual areas in memory is: vmalloc - modules - kernel. Therefore, the only change to the current memory layout is moving kernel to the end of virtual address space. With that approach the implementation of __pa() macro is straightforward - all linear virtual addresses less than kernel base are considered identity mapping: phys == virt - __identity_base All addresses greater than kernel base are kernel ones: phys == (kvirt - __kaslr_offset) + __kaslr_offset_phys By contrast, __va() macro deals only with identity mapping addresses: virt == phys + __identity_base .amode31 section is mapped separately and is not covered by __pa() macro. In fact, it could have been handled easily by checking whether a virtual address is within the section or not, but there is no need for that. Thus, let __pa() code do as little machine cycles as possible. The KASAN shadow memory is located at the very end of the virtual memory layout, at addresses higher than the kernel. However, that is not a linear mapping and no code other than KASAN instrumentation or API is expected to access it. When KASLR mode is enabled the kernel base address randomized within a memory window that spans whole unused virtual address space. The size of that window depends from the amount of physical memory available to the system, the limit imposed by UltraVisor (if present) and the vmalloc area size as provided by vmalloc= kernel command line parameter. In case the virtual memory is exhausted the minimum size of the randomization window is forcefully set to 2GB, which amounts to in 15 bits of entropy if KASAN is enabled or 17 bits of entropy in default configuration. The default kernel offset 0x100000 is used as a magic value both in the decompressor code and vmlinux linker script, but it will be removed with a follow-up change. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/crash: Use old os_info to create PT_LOAD headersAlexander Gordeev
This is a preparatory rework to allow uncoupling virtual and physical addresses spaces. The vmcore ELF program headers describe virtual memory regions of a crashed kernel. User level tools use that information for the kernel text and data analysis (e.g vmcore-dmesg extracts the kernel log). Currently the kernel image is covered by program headers describing the identity mapping regions. But in the future the kernel image will be mapped into separate region outside of the identity mapping. Create the additional ELF program header that covers kernel image only, so that vmcore tools could locate kernel text and data. Further, the identity mapping in crashed and capture kernels will have different base address. Due to that __va() macro can not be used in the capture kernel. Instead, read crashed kernel identity mapping base address from os_info and use it for PT_LOAD type program headers creation. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/vmcoreinfo: Store virtual memory layoutAlexander Gordeev
This is a preparatory rework to allow uncoupling virtual and physical addresses spaces. The virtual memory layout is needed for address translation by crash tool when /proc/kcore device is used as the memory image. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/os_info: Store virtual memory layoutAlexander Gordeev
This is a preparatory rework to allow uncoupling virtual and physical addresses spaces. The virtual memory layout will be read out by makedumpfile, crash and other user tools for virtual address translation. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/os_info: Introduce value entriesAlexander Gordeev
Introduce entries that do not reference any data in memory, but rather provide values. Set the size of such entries to zero and do not compute checksum for them, since there is no data which integrity needs to be checked. The integrity of the value entries itself is still covered by the os_info checksum. Reserve the lowest unused entry index OS_INFO_RESERVED for future use - presumably for the number of entries present. That could later be used by user level tools. The existing tools would not notice any difference. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/boot: Make .amode31 section address range explicitAlexander Gordeev
This is a preparatory rework to allow uncoupling virtual and physical addresses spaces. Introduce .amode31 section address range AMODE31_START and AMODE31_END macros for later use. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/boot: Make identity mapping base address explicitAlexander Gordeev
This is a preparatory rework to allow uncoupling virtual and physical addresses spaces. Currently the identity mapping base address is implicit and is always set to zero. Make it explicit by putting into __identity_base persistent boot variable and use it in proper context - which is the value of PAGE_OFFSET. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/boot: Uncouple virtual and physical kernel offsetsAlexander Gordeev
This is a preparatory rework to allow uncoupling virtual and physical addresses spaces. Currently __kaslr_offset is the kernel offset in both physical memory on boot and in virtual memory after DAT mode is enabled. Uncouple these offsets and rename the physical address space variant to __kaslr_offset_phys while keep the name __kaslr_offset for the offset in virtual address space. Do not use __kaslr_offset_phys after DAT mode is enabled just yet, but still make it a persistent boot variable for later use. Use __kaslr_offset and __kaslr_offset_phys offsets in proper contexts and alter handle_relocs() function to distinguish between the two. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/mm: Create virtual memory layout structureAlexander Gordeev
This is a preparatory rework to allow uncoupling virtual and physical addresses spaces. Put virtual memory layout information into a structure to improve code generation when accessing the structure members, which are currently only ident_map_size and __kaslr_offset. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/mm: Move KASLR related to <asm/page.h>Alexander Gordeev
Move everyting KASLR related to <asm/page.h>, similarly to many other architectures. Acked-by: Heiko Carstens <hca@linux.ibm.com> Suggested-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/boot: Swap vmalloc and Lowcore/Real Memory Copy areasAlexander Gordeev
This is a preparatory rework to allow uncoupling virtual and physical addresses spaces. Currently the order of virtual memory areas is (the lowcore and .amode31 section are skipped, as it is irrelevant): identity mapping (the kernel is contained within) vmemmap vmalloc modules Absolute Lowcore Real Memory Copy In the future the kernel will be mapped separately and placed to the end of the virtual address space, so the layout would turn like this: identity mapping vmemmap vmalloc modules Absolute Lowcore Real Memory Copy kernel However, the distance between kernel and modules needs to be as little as possible, ideally - none. Thus, the Absolute Lowcore and Real Memory Copy areas would stay in the way and therefore need to be moved as well: identity mapping vmemmap Absolute Lowcore Real Memory Copy vmalloc modules kernel To facilitate such layout swap the vmalloc and Absolute Lowcore together with Real Memory Copy areas. As result, the current layout turns into: identity mapping (the kernel is contained within) vmemmap Absolute Lowcore Real Memory Copy vmalloc modules This will allow to locate the kernel directly next to the modules once it gets mapped separately. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/boot: Reduce size of identity mapping on overlapAlexander Gordeev
In case vmemmap array could overlap with vmalloc area on virtual memory layout setup, the size of vmalloc area is decreased. That could result in less memory than user requested with vmalloc= kernel command line parameter. Instead, reduce the size of identity mapping (and the size of vmemmap array as result) to avoid such overlap. Further, currently the virtual memmory allocation "rolls" from top to bottom and it is only VMALLOC_START that could get increased due to the overlap. Change that to decrease- only, which makes the whole allocation algorithm more easy to comprehend. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/boot: Consider DCSS segments on memory layout setupAlexander Gordeev
The maximum mappable physical address (as returned by arch_get_mappable_range() callback) is limited by the value of (1UL << MAX_PHYSMEM_BITS). The maximum physical address available to a DCSS segment is 512GB. In case the available online or offline memory size is less than the DCSS limit arch_get_mappable_range() would include never used [512GB..(1UL << MAX_PHYSMEM_BITS)] range. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17s390/boot: Do not force vmemmap to start at MAX_PHYSMEM_BITSAlexander Gordeev
vmemmap is forcefully set to start at MAX_PHYSMEM_BITS at most. That could be needed in the past to limit ident_map_size to MAX_PHYSMEM_BITS. However since commit 75eba6ec0de1 ("s390: unify identity mapping limits handling") ident_map_size is limited in setup_ident_map_size() function, which is called earlier. Another reason to limit vmemmap start to MAX_PHYSMEM_BITS is because it was returned by arch_get_mappable_range() as the maximum mappable physical address. Since commit f641679dfe55 ("s390/mm: rework arch_get_mappable_range() callback") that is not required anymore. As result, there is no neccessity to limit vmemmap starting address with MAX_PHYSMEM_BITS. Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17KVM: s390: vsie: Use virt_to_phys for facility control blockNina Schoetterl-Glausch
In order for SIE to interpretively execute STFLE, it requires the real or absolute address of a facility-list control block. Before writing the location into the shadow SIE control block, convert it from a virtual address. We currently do not run into this bug because the lower 31 bits are the same for virtual and physical addresses. Signed-off-by: Nina Schoetterl-Glausch <nsg@linux.ibm.com> Link: https://lore.kernel.org/r/20240319164420.4053380-3-nsg@linux.ibm.com Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Message-Id: <20240319164420.4053380-3-nsg@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-04-17sched/vtime: Do not include <asm/vtime.h> headerAlexander Gordeev
There is no architecture-specific code or data left that generic <linux/vtime.h> needs to know about. Thus, avoid the inclusion of <asm/vtime.h> header. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Nicholas Piggin <npiggin@gmail.com> Link: https://lore.kernel.org/r/f7cd245668b9ae61a55184871aec494ec9199c4a.1712760275.git.agordeev@linux.ibm.com
2024-04-17s390/irq,nmi: Include <asm/vtime.h> header directlyAlexander Gordeev
update_timer_sys() and update_timer_mcck() are inlines used for CPU time accounting from the interrupt and machine-check handlers. These routines are specific to s390 architecture, but included via <linux/vtime.h> header implicitly. Avoid the extra loop and include <asm/vtime.h> header directly. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> Link: https://lore.kernel.org/r/3fb696637c0eb7e9d6ffd6cbf9e647d7c5986b3d.1712760275.git.agordeev@linux.ibm.com
2024-04-17s390/vtime: Remove unused __ARCH_HAS_VTIME_TASK_SWITCH leftoverAlexander Gordeev
__ARCH_HAS_VTIME_TASK_SWITCH macro is not used anymore. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Acked-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Nicholas Piggin <npiggin@gmail.com> Link: https://lore.kernel.org/r/b1055852eab0ffea33ad16c92d6a825c83037c3e.1712760275.git.agordeev@linux.ibm.com
2024-04-17sched/vtime: Get rid of generic vtime_task_switch() implementationAlexander Gordeev
The generic vtime_task_switch() implementation gets built only if __ARCH_HAS_VTIME_TASK_SWITCH is not defined, but requires an architecture to implement arch_vtime_task_switch() callback at the same time, which is confusing. Further, arch_vtime_task_switch() is implemented for 32-bit PowerPC architecture only and vtime_task_switch() generic variant is rather superfluous. Simplify the whole vtime_task_switch() wiring by moving the existing generic implementation to PowerPC. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2cb6e3caada93623f6d4f78ad938ac6cd0e2fda8.1712760275.git.agordeev@linux.ibm.com
2024-04-17sched/vtime: Remove confusing arch_vtime_task_switch() declarationAlexander Gordeev
Callback arch_vtime_task_switch() is only defined when CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is selected. Yet, the function prototype forward declaration is present for CONFIG_VIRT_CPU_ACCOUNTING_GEN variant. Remove it. Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Link: https://lore.kernel.org/r/783d7c611864f82b0fb9edf01890b9396f3a549a.1712760275.git.agordeev@linux.ibm.com
2024-04-17serial: stm32: Reset .throttled state in .startup()Uwe Kleine-König
When an UART is opened that still has .throttled set from a previous open, the RX interrupt is enabled but the irq handler doesn't consider it. This easily results in a stuck irq with the effect to occupy the CPU in a tight loop. So reset the throttle state in .startup() to ensure that RX irqs are handled. Fixes: d1ec8a2eabe9 ("serial: stm32: update throttle and unthrottle ops for dma mode") Cc: stable@vger.kernel.org Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/a784f80d3414f7db723b2ec66efc56e1ad666cbf.1713344161.git.u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-17serial: stm32: Return IRQ_NONE in the ISR if no handling happendUwe Kleine-König
If there is a stuck irq that the handler doesn't address, returning IRQ_HANDLED unconditionally makes it impossible for the irq core to detect the problem and disable the irq. So only return IRQ_HANDLED if an event was handled. A stuck irq is still problematic, but with this change at least it only makes the UART nonfunctional instead of occupying the (usually only) CPU by 100% and so stall the whole machine. Fixes: 48a6092fb41f ("serial: stm32-usart: Add STM32 USART Driver") Cc: stable@vger.kernel.org Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Link: https://lore.kernel.org/r/5f92603d0dfd8a5b8014b2b10a902d91e0bb881f.1713344161.git.u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-04-17jffs2: prevent xattr node from overflowing the eraseblockIlya Denisyev
Add a check to make sure that the requested xattr node size is no larger than the eraseblock minus the cleanmarker. Unlike the usual inode nodes, the xattr nodes aren't split into parts and spread across multiple eraseblocks, which means that a xattr node must not occupy more than one eraseblock. If the requested xattr value is too large, the xattr node can spill onto the next eraseblock, overwriting the nodes and causing errors such as: jffs2: argh. node added in wrong place at 0x0000b050(2) jffs2: nextblock 0x0000a000, expected at 0000b00c jffs2: error: (823) do_verify_xattr_datum: node CRC failed at 0x01e050, read=0xfc892c93, calc=0x000000 jffs2: notice: (823) jffs2_get_inode_nodes: Node header CRC failed at 0x01e00c. {848f,2fc4,0fef511f,59a3d171} jffs2: Node at 0x0000000c with length 0x00001044 would run over the end of the erase block jffs2: Perhaps the file system was created with the wrong erase size? jffs2: jffs2_scan_eraseblock(): Magic bitmask 0x1985 not found at 0x00000010: 0x1044 instead This breaks the filesystem and can lead to KASAN crashes such as: BUG: KASAN: slab-out-of-bounds in jffs2_sum_add_kvec+0x125e/0x15d0 Read of size 4 at addr ffff88802c31e914 by task repro/830 CPU: 0 PID: 830 Comm: repro Not tainted 6.9.0-rc3+ #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0xc6/0x120 print_report+0xc4/0x620 ? __virt_addr_valid+0x308/0x5b0 kasan_report+0xc1/0xf0 ? jffs2_sum_add_kvec+0x125e/0x15d0 ? jffs2_sum_add_kvec+0x125e/0x15d0 jffs2_sum_add_kvec+0x125e/0x15d0 jffs2_flash_direct_writev+0xa8/0xd0 jffs2_flash_writev+0x9c9/0xef0 ? __x64_sys_setxattr+0xc4/0x160 ? do_syscall_64+0x69/0x140 ? entry_SYSCALL_64_after_hwframe+0x76/0x7e [...] Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: aa98d7cf59b5 ("[JFFS2][XATTR] XATTR support on JFFS2 (version. 5)") Signed-off-by: Ilya Denisyev <dev@elkcl.ru> Link: https://lore.kernel.org/r/20240412155357.237803-1-dev@elkcl.ru Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-04-17Merge tag 'renesas-pinctrl-fixes-for-v6.9-tag1' of ↵Linus Walleij
git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-drivers into fixes pinctrl: renesas: Fixes for v6.9 - Fix a dtbs_check warning on RZ/G3S, - Fix a lockdep warning on RZ/G2L. Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2024-04-17Merge branch 'mt7530-fixes'David S. Miller
Merge branch 'mr7530-fixes' Arınç ÜNAL says: ==================== Fix port mirroring on MT7530 DSA subdriver This patch series fixes the frames received on the local port (monitor port) not being mirrored, and port mirroring for the MT7988 SoC switch. ==================== Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
2024-04-17net: dsa: mt7530: fix port mirroring for MT7988 SoC switchArınç ÜNAL
The "MT7988A Wi-Fi 7 Generation Router Platform: Datasheet (Open Version) v0.1" document shows bits 16 to 18 as the MIRROR_PORT field of the CPU forward control register. Currently, the MT7530 DSA subdriver configures bits 0 to 2 of the CPU forward control register which breaks the port mirroring feature for the MT7988 SoC switch. Fix this by using the MT7531_MIRROR_PORT_GET() and MT7531_MIRROR_PORT_SET() macros which utilise the correct bits. Fixes: 110c18bfed41 ("net: dsa: mt7530: introduce driver for MT7988 built-in switch") Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com> Acked-by: Daniel Golle <daniel@makrotopia.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-17net: dsa: mt7530: fix mirroring frames received on local portArınç ÜNAL
This switch intellectual property provides a bit on the ARL global control register which controls allowing mirroring frames which are received on the local port (monitor port). This bit is unset after reset. This ability must be enabled to fully support the port mirroring feature on this switch intellectual property. Therefore, this patch fixes the traffic not being reflected on a port, which would be configured like below: tc qdisc add dev swp0 clsact tc filter add dev swp0 ingress matchall skip_sw \ action mirred egress mirror dev swp0 As a side note, this configuration provides the hairpinning feature for a single port. Fixes: 37feab6076aa ("net: dsa: mt7530: add support for port mirroring") Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-17USB: serial: option: add support for Fibocom FM650/FG650Chuanhong Guo
Fibocom FM650/FG650 are 5G modems with ECM/NCM/RNDIS/MBIM modes. This patch adds support to all 4 modes. In all 4 modes, the first serial port is the AT console while the other 3 appear to be diagnostic interfaces for dumping modem logs. usb-devices output for all modes: ECM: T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 5 Spd=5000 MxCh= 0 D: Ver= 3.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=2cb7 ProdID=0a04 Rev=04.04 S: Manufacturer=Fibocom Wireless Inc. S: Product=FG650 Module S: SerialNumber=0123456789ABCDEF C: #Ifs= 5 Cfg#= 1 Atr=c0 MxPwr=504mA I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=06 Prot=00 Driver=cdc_ether E: Ad=82(I) Atr=03(Int.) MxPS= 16 Ivl=32ms I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=cdc_ether E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms NCM: T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 6 Spd=5000 MxCh= 0 D: Ver= 3.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=2cb7 ProdID=0a05 Rev=04.04 S: Manufacturer=Fibocom Wireless Inc. S: Product=FG650 Module S: SerialNumber=0123456789ABCDEF C: #Ifs= 6 Cfg#= 1 Atr=c0 MxPwr=504mA I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0d Prot=00 Driver=cdc_ncm E: Ad=82(I) Atr=03(Int.) MxPS= 16 Ivl=32ms I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=01 Driver=cdc_ncm E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=05(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=86(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms RNDIS: T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 4 Spd=5000 MxCh= 0 D: Ver= 3.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=2cb7 ProdID=0a06 Rev=04.04 S: Manufacturer=Fibocom Wireless Inc. S: Product=FG650 Module S: SerialNumber=0123456789ABCDEF C: #Ifs= 6 Cfg#= 1 Atr=c0 MxPwr=504mA I: If#= 0 Alt= 0 #EPs= 1 Cls=e0(wlcon) Sub=01 Prot=03 Driver=rndis_host E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 2 Cls=0a(data ) Sub=00 Prot=00 Driver=rndis_host E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=05(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=86(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms MBIM: T: Bus=04 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 7 Spd=5000 MxCh= 0 D: Ver= 3.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs= 1 P: Vendor=2cb7 ProdID=0a07 Rev=04.04 S: Manufacturer=Fibocom Wireless Inc. S: Product=FG650 Module S: SerialNumber=0123456789ABCDEF C: #Ifs= 6 Cfg#= 1 Atr=c0 MxPwr=504mA I: If#= 0 Alt= 0 #EPs= 1 Cls=02(commc) Sub=0e Prot=00 Driver=cdc_mbim E: Ad=82(I) Atr=03(Int.) MxPS= 64 Ivl=32ms I: If#= 1 Alt= 1 #EPs= 2 Cls=0a(data ) Sub=00 Prot=02 Driver=cdc_mbim E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 4 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=00 Prot=00 Driver=option E: Ad=05(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=86(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms Signed-off-by: Chuanhong Guo <gch981213@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org>
2024-04-17wifi: iwlwifi: mvm: return uid from iwl_mvm_build_scan_cmdMiri Korenblit
This function is supposed to return a uid on success, and an errno in failure. But it currently returns the return value of the specific cmd version handler, which in turn returns 0 on success and errno otherwise. This means that on success, iwl_mvm_build_scan_cmd will return 0 regardless if the actual uid. Fix this by returning the uid if the handler succeeded. Fixes: 687db6ff5b70 ("iwlwifi: scan: make new scan req versioning flow") Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Reviewed-by: Ilan Peer <ilan.peer@intel.com> Link: https://msgid.link/20240415114847.5e2d602b3190.I4c4931021be74a67a869384c8f8ee7463e0c7857@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-04-17wifi: iwlwifi: mvm: remove old PASN station when adding a new oneAvraham Stern
If a PASN station is added, and an old PASN station already exists for the same mac address, remove the old station before adding the new one. Keeping the old station caueses old security context to be used in measurements. Fixes: 0739a7d70e00 ("iwlwifi: mvm: initiator: add option for adding a PASN responder") Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Miri Korenblit <miriam.rachel.korenblit@intel.com> Link: https://msgid.link/20240415114847.ef3544a416f2.I4e8c7c8ca22737f4f908ae5cd4fc0b920c703dd3@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-04-17wifi: mac80211: split mesh fast tx cache into local/proxied/forwardedFelix Fietkau
Depending on the origin of the packets (and their SA), 802.11 + mesh headers could be filled in differently. In order to properly deal with that, add a new field to the lookup key, indicating the type (local, proxied or forwarded). This can fix spurious packet drop issues that depend on the order in which nodes/hosts communicate with each other. Fixes: d5edb9ae8d56 ("wifi: mac80211: mesh fast xmit support") Signed-off-by: Felix Fietkau <nbd@nbd.name> Link: https://msgid.link/20240415121811.13391-1-nbd@nbd.name [use sizeof_field() for key_len] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-04-16ARM: dts: BCM5301X: remove earlycon on ASUS RT-AC3100 and ASUS RT-AC88UArınç ÜNAL
Remove the earlycon boot argument. As Krzysztof pointed out, earlycon is for debugging, not regular mainline usage. Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com> Reviewed-by: Krzysztof Kozlowski <krzk@kernel.org> Link: https://lore.kernel.org/r/20240414-for-soc-asus-rt-ac3100-improvements-v1-4-0e40caf1a70a@arinc9.com Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
2024-04-16ARM: dts: BCM5301X: remove duplicate compatible on ASUS RT-AC3100 & AC88UArınç ÜNAL
The compatible property on the node with the srab handle is already described with the same value on the included device tree. Remove it. Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com> Link: https://lore.kernel.org/r/20240414-for-soc-asus-rt-ac3100-improvements-v1-3-0e40caf1a70a@arinc9.com Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
2024-04-16ARM: dts: BCM5301X: provide address for SoC MACs on ASUS RT-AC3100 & AC88UArınç ÜNAL
Do not leave the providing of a MAC address for an SoC MAC to a driver. Describe it on the bindings. Provide a distinct MAC address for each SoC MAC. Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com> Link: https://lore.kernel.org/r/20240414-for-soc-asus-rt-ac3100-improvements-v1-2-0e40caf1a70a@arinc9.com Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
2024-04-16ARM: dts: BCM5301X: use color and function on ASUS RT-AC3100 and RT-AC88UArınç ÜNAL
As the label property for LEDs is deprecated, use the color and function properties to describe the LEDs on the device tree file for ASUS RT-AC3100 and ASUS RT-AC88U. Reorder the LED and button nodes in alphabetical order. Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com> Link: https://lore.kernel.org/r/20240414-for-soc-asus-rt-ac3100-improvements-v1-1-0e40caf1a70a@arinc9.com Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
2024-04-16tun: limit printing rate when illegal packet received by tun devLei Chen
vhost_worker will call tun call backs to receive packets. If too many illegal packets arrives, tun_do_read will keep dumping packet contents. When console is enabled, it will costs much more cpu time to dump packet and soft lockup will be detected. net_ratelimit mechanism can be used to limit the dumping rate. PID: 33036 TASK: ffff949da6f20000 CPU: 23 COMMAND: "vhost-32980" #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253 #1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3 #2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e #3 [fffffe00003fced0] do_nmi at ffffffff8922660d #4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663 [exception RIP: io_serial_in+20] RIP: ffffffff89792594 RSP: ffffa655314979e8 RFLAGS: 00000002 RAX: ffffffff89792500 RBX: ffffffff8af428a0 RCX: 0000000000000000 RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff8af428a0 RBP: 0000000000002710 R8: 0000000000000004 R9: 000000000000000f R10: 0000000000000000 R11: ffffffff8acbf64f R12: 0000000000000020 R13: ffffffff8acbf698 R14: 0000000000000058 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #5 [ffffa655314979e8] io_serial_in at ffffffff89792594 #6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470 #7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6 #8 [ffffa65531497a20] uart_console_write at ffffffff8978b605 #9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558 #10 [ffffa65531497ac8] console_unlock at ffffffff89316124 #11 [ffffa65531497b10] vprintk_emit at ffffffff89317c07 #12 [ffffa65531497b68] printk at ffffffff89318306 #13 [ffffa65531497bc8] print_hex_dump at ffffffff89650765 #14 [ffffa65531497ca8] tun_do_read at ffffffffc0b06c27 [tun] #15 [ffffa65531497d38] tun_recvmsg at ffffffffc0b06e34 [tun] #16 [ffffa65531497d68] handle_rx at ffffffffc0c5d682 [vhost_net] #17 [ffffa65531497ed0] vhost_worker at ffffffffc0c644dc [vhost] #18 [ffffa65531497f10] kthread at ffffffff892d2e72 #19 [ffffa65531497f50] ret_from_fork at ffffffff89c0022f Fixes: ef3db4a59542 ("tun: avoid BUG, dump packet on GSO errors") Signed-off-by: Lei Chen <lei.chen@smartx.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/20240415020247.2207781-1-lei.chen@smartx.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-16bcachefs: make sure to release last journal pin in replayKent Overstreet
This fixes a deadlock when journal replay has many keys to insert that were from fsck, not the journal. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-16bcachefs: node scan: ignore multiple nodes with same seq if interiorKent Overstreet
Interior nodes are not really needed, when we have to scan - but if this pops up for leaf nodes we'll need a real heuristic. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-16bcachefs: Fix format specifier in validate_bset_keys()Nathan Chancellor
When building for 32-bit platforms, for which size_t is 'unsigned int', there is a warning from a format string in validate_bset_keys(): fs/bcachefs/btree_io.c: In function 'validate_bset_keys': fs/bcachefs/btree_io.c:891:34: error: format '%lu' expects argument of type 'long unsigned int', but argument 12 has type 'unsigned int' [-Werror=format=] 891 | "bad k->u64s %u (min %u max %lu)", k->u64s, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ fs/bcachefs/btree_io.c:603:32: note: in definition of macro 'btree_err' 603 | msg, ##__VA_ARGS__); \ | ^~~ fs/bcachefs/btree_io.c:887:21: note: in expansion of macro 'btree_err_on' 887 | if (btree_err_on(!bkeyp_u64s_valid(&b->format, k), | ^~~~~~~~~~~~ fs/bcachefs/btree_io.c:891:64: note: format string is defined here 891 | "bad k->u64s %u (min %u max %lu)", k->u64s, | ~~^ | | | long unsigned int | %u cc1: all warnings being treated as errors BKEY_U64s is size_t so the entire expression is promoted to size_t. Use the '%zu' specifier so that there is no warning regardless of the width of size_t. Fixes: 031ad9e7dbd1 ("bcachefs: Check for packed bkeys that are too big") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202404130747.wH6Dd23p-lkp@intel.com/ Closes: https://lore.kernel.org/oe-kbuild-all/202404131536.HdAMBOVc-lkp@intel.com/ Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-16bcachefs: Fix null ptr deref in twf from BCH_IOCTL_FSCK_OFFLINEKent Overstreet
We need to initialize the stdio redirects before they're used. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-16nilfs2: fix OOB in nilfs_set_de_typeJeongjun Park
The size of the nilfs_type_by_mode array in the fs/nilfs2/dir.c file is defined as "S_IFMT >> S_SHIFT", but the nilfs_set_de_type() function, which uses this array, specifies the index to read from the array in the same way as "(mode & S_IFMT) >> S_SHIFT". static void nilfs_set_de_type(struct nilfs_dir_entry *de, struct inode *inode) { umode_t mode = inode->i_mode; de->file_type = nilfs_type_by_mode[(mode & S_IFMT)>>S_SHIFT]; // oob } However, when the index is determined this way, an out-of-bounds (OOB) error occurs by referring to an index that is 1 larger than the array size when the condition "mode & S_IFMT == S_IFMT" is satisfied. Therefore, a patch to resize the nilfs_type_by_mode array should be applied to prevent OOB errors. Link: https://lkml.kernel.org/r/20240415182048.7144-1-konishi.ryusuke@gmail.com Reported-by: syzbot+2e22057de05b9f3b30d8@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=2e22057de05b9f3b30d8 Fixes: 2ba466d74ed7 ("nilfs2: directory entry operations") Signed-off-by: Jeongjun Park <aha310510@gmail.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-16MAINTAINERS: update Naoya Horiguchi's email addressNaoya Horiguchi
My old NEC address has been removed, so update MAINTAINERS and .mailmap to map it to my gmail address. Link: https://lkml.kernel.org/r/20240412181720.18452-1-nao.horiguchi@gmail.com Signed-off-by: Naoya Horiguchi <nao.horiguchi@gmail.com> Acked-by: Miaohe Lin <linmiaohe@huawei.com> Cc: Oscar Salvador <osalvador@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-16fork: defer linking file vma until vma is fully initializedMiaohe Lin
Thorvald reported a WARNING [1]. And the root cause is below race: CPU 1 CPU 2 fork hugetlbfs_fallocate dup_mmap hugetlbfs_punch_hole i_mmap_lock_write(mapping); vma_interval_tree_insert_after -- Child vma is visible through i_mmap tree. i_mmap_unlock_write(mapping); hugetlb_dup_vma_private -- Clear vma_lock outside i_mmap_rwsem! i_mmap_lock_write(mapping); hugetlb_vmdelete_list vma_interval_tree_foreach hugetlb_vma_trylock_write -- Vma_lock is cleared. tmp->vm_ops->open -- Alloc new vma_lock outside i_mmap_rwsem! hugetlb_vma_unlock_write -- Vma_lock is assigned!!! i_mmap_unlock_write(mapping); hugetlb_dup_vma_private() and hugetlb_vm_op_open() are called outside i_mmap_rwsem lock while vma lock can be used in the same time. Fix this by deferring linking file vma until vma is fully initialized. Those vmas should be initialized first before they can be used. Link: https://lkml.kernel.org/r/20240410091441.3539905-1-linmiaohe@huawei.com Fixes: 8d9bfb260814 ("hugetlb: add vma based lock for pmd sharing") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reported-by: Thorvald Natvig <thorvald@google.com> Closes: https://lore.kernel.org/linux-mm/20240129161735.6gmjsswx62o4pbja@revolver/T/ [1] Reviewed-by: Jane Chu <jane.chu@oracle.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peng Zhang <zhangpeng.00@bytedance.com> Cc: Tycho Andersen <tandersen@netflix.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-04-16mm/shmem: inline shmem_is_huge() for disabled transparent hugepagesSumanth Korikkar
In order to minimize code size (CONFIG_CC_OPTIMIZE_FOR_SIZE=y), compiler might choose to make a regular function call (out-of-line) for shmem_is_huge() instead of inlining it. When transparent hugepages are disabled (CONFIG_TRANSPARENT_HUGEPAGE=n), it can cause compilation error. mm/shmem.c: In function `shmem_getattr': ./include/linux/huge_mm.h:383:27: note: in expansion of macro `BUILD_BUG' 383 | #define HPAGE_PMD_SIZE ({ BUILD_BUG(); 0; }) | ^~~~~~~~~ mm/shmem.c:1148:33: note: in expansion of macro `HPAGE_PMD_SIZE' 1148 | stat->blksize = HPAGE_PMD_SIZE; To prevent the possible error, always inline shmem_is_huge() when transparent hugepages are disabled. Link: https://lkml.kernel.org/r/20240409155407.2322714-1-sumanthk@linux.ibm.com Signed-off-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>