summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-09-01nilfs2: fix state management in error path of log writing functionRyusuke Konishi
After commit a694291a6211 ("nilfs2: separate wait function from nilfs_segctor_write") was applied, the log writing function nilfs_segctor_do_construct() was able to issue I/O requests continuously even if user data blocks were split into multiple logs across segments, but two potential flaws were introduced in its error handling. First, if nilfs_segctor_begin_construction() fails while creating the second or subsequent logs, the log writing function returns without calling nilfs_segctor_abort_construction(), so the writeback flag set on pages/folios will remain uncleared. This causes page cache operations to hang waiting for the writeback flag. For example, truncate_inode_pages_final(), which is called via nilfs_evict_inode() when an inode is evicted from memory, will hang. Second, the NILFS_I_COLLECTED flag set on normal inodes remain uncleared. As a result, if the next log write involves checkpoint creation, that's fine, but if a partial log write is performed that does not, inodes with NILFS_I_COLLECTED set are erroneously removed from the "sc_dirty_files" list, and their data and b-tree blocks may not be written to the device, corrupting the block mapping. Fix these issues by uniformly calling nilfs_segctor_abort_construction() on failure of each step in the loop in nilfs_segctor_do_construct(), having it clean up logs and segment usages according to progress, and correcting the conditions for calling nilfs_redirty_inodes() to ensure that the NILFS_I_COLLECTED flag is cleared. Link: https://lkml.kernel.org/r/20240814101119.4070-1-konishi.ryusuke@gmail.com Fixes: a694291a6211 ("nilfs2: separate wait function from nilfs_segctor_write") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01nilfs2: fix missing cleanup on rollforward recovery errorRyusuke Konishi
In an error injection test of a routine for mount-time recovery, KASAN found a use-after-free bug. It turned out that if data recovery was performed using partial logs created by dsync writes, but an error occurred before starting the log writer to create a recovered checkpoint, the inodes whose data had been recovered were left in the ns_dirty_files list of the nilfs object and were not freed. Fix this issue by cleaning up inodes that have read the recovery data if the recovery routine fails midway before the log writer starts. Link: https://lkml.kernel.org/r/20240810065242.3701-1-konishi.ryusuke@gmail.com Fixes: 0f3e1c7f23f8 ("nilfs2: recovery functions") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01nilfs2: protect references to superblock parameters exposed in sysfsRyusuke Konishi
The superblock buffers of nilfs2 can not only be overwritten at runtime for modifications/repairs, but they are also regularly swapped, replaced during resizing, and even abandoned when degrading to one side due to backing device issues. So, accessing them requires mutual exclusion using the reader/writer semaphore "nilfs->ns_sem". Some sysfs attribute show methods read this superblock buffer without the necessary mutual exclusion, which can cause problems with pointer dereferencing and memory access, so fix it. Link: https://lkml.kernel.org/r/20240811100320.9913-1-konishi.ryusuke@gmail.com Fixes: da7141fb78db ("nilfs2: add /sys/fs/nilfs2/<device> group") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01userfaultfd: don't BUG_ON() if khugepaged yanks our page tableJann Horn
Since khugepaged was changed to allow retracting page tables in file mappings without holding the mmap lock, these BUG_ON()s are wrong - get rid of them. We could also remove the preceding "if (unlikely(...))" block, but then we could reach pte_offset_map_lock() with transhuge pages not just for file mappings but also for anonymous mappings - which would probably be fine but I think is not necessarily expected. Link: https://lkml.kernel.org/r/20240813-uffd-thp-flip-fix-v2-2-5efa61078a41@google.com Fixes: 1d65b771bc08 ("mm/khugepaged: retract_page_tables() without mmap or vma lock") Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Qi Zheng <zhengqi.arch@bytedance.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01userfaultfd: fix checks for huge PMDsJann Horn
Patch series "userfaultfd: fix races around pmd_trans_huge() check", v2. The pmd_trans_huge() code in mfill_atomic() is wrong in three different ways depending on kernel version: 1. The pmd_trans_huge() check is racy and can lead to a BUG_ON() (if you hit the right two race windows) - I've tested this in a kernel build with some extra mdelay() calls. See the commit message for a description of the race scenario. On older kernels (before 6.5), I think the same bug can even theoretically lead to accessing transhuge page contents as a page table if you hit the right 5 narrow race windows (I haven't tested this case). 2. As pointed out by Qi Zheng, pmd_trans_huge() is not sufficient for detecting PMDs that don't point to page tables. On older kernels (before 6.5), you'd just have to win a single fairly wide race to hit this. I've tested this on 6.1 stable by racing migration (with a mdelay() patched into try_to_migrate()) against UFFDIO_ZEROPAGE - on my x86 VM, that causes a kernel oops in ptlock_ptr(). 3. On newer kernels (>=6.5), for shmem mappings, khugepaged is allowed to yank page tables out from under us (though I haven't tested that), so I think the BUG_ON() checks in mfill_atomic() are just wrong. I decided to write two separate fixes for these (one fix for bugs 1+2, one fix for bug 3), so that the first fix can be backported to kernels affected by bugs 1+2. This patch (of 2): This fixes two issues. I discovered that the following race can occur: mfill_atomic other thread ============ ============ <zap PMD> pmdp_get_lockless() [reads none pmd] <bail if trans_huge> <if none:> <pagefault creates transhuge zeropage> __pte_alloc [no-op] <zap PMD> <bail if pmd_trans_huge(*dst_pmd)> BUG_ON(pmd_none(*dst_pmd)) I have experimentally verified this in a kernel with extra mdelay() calls; the BUG_ON(pmd_none(*dst_pmd)) triggers. On kernels newer than commit 0d940a9b270b ("mm/pgtable: allow pte_offset_map[_lock]() to fail"), this can't lead to anything worse than a BUG_ON(), since the page table access helpers are actually designed to deal with page tables concurrently disappearing; but on older kernels (<=6.4), I think we could probably theoretically race past the two BUG_ON() checks and end up treating a hugepage as a page table. The second issue is that, as Qi Zheng pointed out, there are other types of huge PMDs that pmd_trans_huge() can't catch: devmap PMDs and swap PMDs (in particular, migration PMDs). On <=6.4, this is worse than the first issue: If mfill_atomic() runs on a PMD that contains a migration entry (which just requires winning a single, fairly wide race), it will pass the PMD to pte_offset_map_lock(), which assumes that the PMD points to a page table. Breakage follows: First, the kernel tries to take the PTE lock (which will crash or maybe worse if there is no "struct page" for the address bits in the migration entry PMD - I think at least on X86 there usually is no corresponding "struct page" thanks to the PTE inversion mitigation, amd64 looks different). If that didn't crash, the kernel would next try to write a PTE into what it wrongly thinks is a page table. As part of fixing these issues, get rid of the check for pmd_trans_huge() before __pte_alloc() - that's redundant, we're going to have to check for that after the __pte_alloc() anyway. Backport note: pmdp_get_lockless() is pmd_read_atomic() in older kernels. Link: https://lkml.kernel.org/r/20240813-uffd-thp-flip-fix-v2-0-5efa61078a41@google.com Link: https://lkml.kernel.org/r/20240813-uffd-thp-flip-fix-v2-1-5efa61078a41@google.com Fixes: c1a4de99fada ("userfaultfd: mcopy_atomic|mfill_zeropage: UFFDIO_COPY|UFFDIO_ZEROPAGE preparation") Signed-off-by: Jann Horn <jannh@google.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01mm: vmalloc: ensure vmap_block is initialised before adding to queueWill Deacon
Commit 8c61291fd850 ("mm: fix incorrect vbq reference in purge_fragmented_block") extended the 'vmap_block' structure to contain a 'cpu' field which is set at allocation time to the id of the initialising CPU. When a new 'vmap_block' is being instantiated by new_vmap_block(), the partially initialised structure is added to the local 'vmap_block_queue' xarray before the 'cpu' field has been initialised. If another CPU is concurrently walking the xarray (e.g. via vm_unmap_aliases()), then it may perform an out-of-bounds access to the remote queue thanks to an uninitialised index. This has been observed as UBSAN errors in Android: | Internal error: UBSAN: array index out of bounds: 00000000f2005512 [#1] PREEMPT SMP | | Call trace: | purge_fragmented_block+0x204/0x21c | _vm_unmap_aliases+0x170/0x378 | vm_unmap_aliases+0x1c/0x28 | change_memory_common+0x1dc/0x26c | set_memory_ro+0x18/0x24 | module_enable_ro+0x98/0x238 | do_init_module+0x1b0/0x310 Move the initialisation of 'vb->cpu' in new_vmap_block() ahead of the addition to the xarray. Link: https://lkml.kernel.org/r/20240812171606.17486-1-will@kernel.org Fixes: 8c61291fd850 ("mm: fix incorrect vbq reference in purge_fragmented_block") Signed-off-by: Will Deacon <will@kernel.org> Reviewed-by: Baoquan He <bhe@redhat.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Zhaoyang Huang <zhaoyang.huang@unisoc.com> Cc: Hailong.Liu <hailong.liu@oppo.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Lorenzo Stoakes <lstoakes@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01selftests: mm: fix build errors on armhfMuhammad Usama Anjum
The __NR_mmap isn't found on armhf. The mmap() is commonly available system call and its wrapper is present on all architectures. So it should be used directly. It solves problem for armhf and doesn't create problem for other architectures. Remove sys_mmap() functions as they aren't doing anything else other than calling mmap(). There is no need to set errno = 0 manually as glibc always resets it. For reference errors are as following: CC seal_elf seal_elf.c: In function 'sys_mmap': seal_elf.c:39:33: error: '__NR_mmap' undeclared (first use in this function) 39 | sret = (void *) syscall(__NR_mmap, addr, len, prot, | ^~~~~~~~~ mseal_test.c: In function 'sys_mmap': mseal_test.c:90:33: error: '__NR_mmap' undeclared (first use in this function) 90 | sret = (void *) syscall(__NR_mmap, addr, len, prot, | ^~~~~~~~~ Link: https://lkml.kernel.org/r/20240809082511.497266-1-usama.anjum@collabora.com Fixes: 4926c7a52de7 ("selftest mm/mseal memory sealing") Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: Jeff Xu <jeffxu@chromium.org> Cc: Kees Cook <kees@kernel.org> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01Merge tag 'x86-urgent-2024-09-01' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: - x2apic_disable() clears x2apic_state and x2apic_mode unconditionally, even when the state is X2APIC_ON_LOCKED, which prevents the kernel to disable it thereby creating inconsistent state. Reorder the logic so it actually works correctly - The XSTATE logic for handling LBR is incorrect as it assumes that XSAVES supports LBR when the CPU supports LBR. In fact both conditions need to be true. Otherwise the enablement of LBR in the IA32_XSS MSR fails and subsequently the machine crashes on the next XRSTORS operation because IA32_XSS is not initialized. Cache the XSTATE support bit during init and make the related functions use this cached information and the LBR CPU feature bit to cure this. - Cure a long standing bug in KASLR KASLR uses the full address space between PAGE_OFFSET and vaddr_end to randomize the starting points of the direct map, vmalloc and vmemmap regions. It thereby limits the size of the direct map by using the installed memory size plus an extra configurable margin for hot-plug memory. This limitation is done to gain more randomization space because otherwise only the holes between the direct map, vmalloc, vmemmap and vaddr_end would be usable for randomizing. The limited direct map size is not exposed to the rest of the kernel, so the memory hot-plug and resource management related code paths still operate under the assumption that the available address space can be determined with MAX_PHYSMEM_BITS. request_free_mem_region() allocates from (1 << MAX_PHYSMEM_BITS) - 1 downwards. That means the first allocation happens past the end of the direct map and if unlucky this address is in the vmalloc space, which causes high_memory to become greater than VMALLOC_START and consequently causes iounmap() to fail for valid ioremap addresses. Cure this by exposing the end of the direct map via PHYSMEM_END and use that for the memory hot-plug and resource management related places instead of relying on MAX_PHYSMEM_BITS. In the KASLR case PHYSMEM_END maps to a variable which is initialized by the KASLR initialization and otherwise it is based on MAX_PHYSMEM_BITS as before. - Prevent a data leak in mmio_read(). The TDVMCALL exposes the value of an initialized variabled on the stack to the VMM. The variable is only required as output value, so it does not have to exposed to the VMM in the first place. - Prevent an array overrun in the resource control code on systems with Sub-NUMA Clustering enabled because the code failed to adjust the index by the number of SNC nodes per L3 cache. * tag 'x86-urgent-2024-09-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/resctrl: Fix arch_mbm_* array overrun on SNC x86/tdx: Fix data leak in mmio_read() x86/kaslr: Expose and use the end of the physical memory address space x86/fpu: Avoid writing LBR bit to IA32_XSS unless supported x86/apic: Make x2apic_disable() work correctly
2024-09-01Merge tag 'perf-urgent-2024-09-01' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fix from Thomas Gleixner: "A single fix for x86 performance monitoring. Haswell PMUs suffer from several errata and require a limit the minimal period for counter events, otherwise they suffer from endless loops in the PMU interrupt" * tag 'perf-urgent-2024-09-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Limit the period on Haswell
2024-09-01Merge tag 'locking-urgent-2024-08-25' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking fix from Thomas Gleixner: "A single fix for rt_mutex. The deadlock detection code drops into an infinite scheduling loop while still holding rt_mutex::wait_lock, which rightfully triggers a 'scheduling in atomic' warning. Unlock it before that" * tag 'locking-urgent-2024-08-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rtmutex: Drop rt_mutex::wait_lock before scheduling
2024-09-01Merge tag 'irq-urgent-2024-08-25' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "A set of fixes for interrupt chip drivers: - Unbreak the PLIC driver for Allwinner D1 systems The recent conversion of the PLIC driver to a platform driver broke Allwinnder D1 systems due to the deferred probing of platform drivers. Due to that the only timer available on D1 systems cannot get an interrupt, which causes the system to hang at boot. Other RISCV platforms are not affected because they provide the architected SBI timer which uses the built in core interrupt controller. Cure this by probing PLIC early on D1 systems - Cure a regression in ARM/GIC-V3 on 32-bit ARM systems caused by the recent addition of a initialization function, which accesses system registers before they are enabled. On 64-bit ARM they are enabled prior to that by sheer luck. Ensure they are enabled. - Cure a use before check problem in the MSI library. The existing NULL pointer check is too late. - Cure a lock order inversion in the ARM/GIC-V4 driver - Fix a IS_ERR() vs. NULL pointer check issue in the RISCV APLIC driver - Plug a reference count leak in the ARM/GIC-V2 driver" * tag 'irq-urgent-2024-08-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/irq-msi-lib: Check for NULL ops in msi_lib_irq_domain_select() irqchip/gic-v3: Init SRE before poking sysregs irqchip/gic-v2m: Fix refcount leak in gicv2m_of_init() irqchip/riscv-aplic: Fix an IS_ERR() vs NULL bug in probe() irqchip/gic-v4: Fix ordering between vmapp and vpe locks irqchip/sifive-plic: Probe plic driver early for Allwinner D1 platform
2024-09-01bcachefs: fix rebalance accountingKent Overstreet
Fixes: 49aa7830396b ("bcachefs: Fix rebalance_work accounting") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-01Merge branch 'mctp-serial-tx-escapes'David S. Miller
Matt Johnston says: ==================== net: mctp-serial: Fix for missing tx escapes The mctp-serial code to add escape characters was incorrect due to an off-by-one error. This series adds a test for the chunking which splits by escape characters, and fixes the bug. v2: Fix kunit param const pointer ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-01net: mctp-serial: Fix missing escapes on transmitMatt Johnston
0x7d and 0x7e bytes are meant to be escaped in the data portion of frames, but this didn't occur since next_chunk_len() had an off-by-one error. That also resulted in the final byte of a payload being written as a separate tty write op. The chunk prior to an escaped byte would be one byte short, and the next call would never test the txpos+1 case, which is where the escaped byte was located. That meant it never hit the escaping case in mctp_serial_tx_work(). Example Input: 01 00 08 c8 7e 80 02 Previous incorrect chunks from next_chunk_len(): 01 00 08 c8 7e 80 02 With this fix: 01 00 08 c8 7e 80 02 Cc: stable@vger.kernel.org Fixes: a0c2ccd9b5ad ("mctp: Add MCTP-over-serial transport binding") Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-01net: mctp-serial: Add kunit test for next_chunk_len()Matt Johnston
Test various edge cases of inputs that contain characters that need escaping. This adds a new kunit suite for mctp-serial. Signed-off-by: Matt Johnston <matt@codeconstruct.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-01hwmon: ltc2991: fix register bits definesPawel Dembicki
In the LTC2991, V5 and V6 channels use the low nibble of the "V5, V6, V7, and V8 Control Register" for configuration, but currently, the high nibble is defined. This patch changes the defines to use the low nibble. Fixes: 2b9ea4262ae9 ("hwmon: Add driver for ltc2991") Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com> Message-ID: <20240830111349.30531-1-paweldembicki@gmail.com> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2024-09-01fscache: delete fscache_cookie_lru_timer when fscache exits to avoid UAFBaokun Li
The fscache_cookie_lru_timer is initialized when the fscache module is inserted, but is not deleted when the fscache module is removed. If timer_reduce() is called before removing the fscache module, the fscache_cookie_lru_timer will be added to the timer list of the current cpu. Afterwards, a use-after-free will be triggered in the softIRQ after removing the fscache module, as follows: ================================================================== BUG: unable to handle page fault for address: fffffbfff803c9e9 PF: supervisor read access in kernel mode PF: error_code(0x0000) - not-present page PGD 21ffea067 P4D 21ffea067 PUD 21ffe6067 PMD 110a7c067 PTE 0 Oops: Oops: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Tainted: G W 6.11.0-rc3 #855 Tainted: [W]=WARN RIP: 0010:__run_timer_base.part.0+0x254/0x8a0 Call Trace: <IRQ> tmigr_handle_remote_up+0x627/0x810 __walk_groups.isra.0+0x47/0x140 tmigr_handle_remote+0x1fa/0x2f0 handle_softirqs+0x180/0x590 irq_exit_rcu+0x84/0xb0 sysvec_apic_timer_interrupt+0x6e/0x90 </IRQ> <TASK> asm_sysvec_apic_timer_interrupt+0x1a/0x20 RIP: 0010:default_idle+0xf/0x20 default_idle_call+0x38/0x60 do_idle+0x2b5/0x300 cpu_startup_entry+0x54/0x60 start_secondary+0x20d/0x280 common_startup_64+0x13e/0x148 </TASK> Modules linked in: [last unloaded: netfs] ================================================================== Therefore delete fscache_cookie_lru_timer when removing the fscahe module. Fixes: 12bb21a29c19 ("fscache: Implement cookie user counting and resource pinning") Cc: stable@kernel.org Signed-off-by: Baokun Li <libaokun1@huawei.com> Link: https://lore.kernel.org/r/20240826112056.2458299-1-libaokun@huaweicloud.com Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-09-01Linux 6.11-rc6v6.11-rc6Linus Torvalds
2024-09-01Merge tag 'v6.11-rc5-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: - copy_file_range fix - two read fixes including read past end of file rc fix and read retry crediting fix - falloc zero range fix * tag 'v6.11-rc5-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: Fix FALLOC_FL_ZERO_RANGE to preflush buffered part of target region cifs: Fix copy offload to flush destination region netfs, cifs: Fix handling of short DIO read cifs: Fix lack of credit renegotiation on read retry
2024-09-01Merge tag 'bcachefs-2024-08-21' of https://github.com/koverstreet/bcachefsLinus Torvalds
Push bcachefs fixes from Kent Overstreet: "The data corruption in the buffered write path is troubling; inode lock should not have been able to cause that... - Fix a rare data corruption in the rebalance path, caught as a nonce inconsistency on encrypted filesystems - Revert lockless buffered write path - Mark more errors as autofix" * tag 'bcachefs-2024-08-21' of https://github.com/koverstreet/bcachefs: bcachefs: Mark more errors as autofix bcachefs: Revert lockless buffered IO path bcachefs: Fix bch2_extents_match() false positive bcachefs: Fix failure to return error in data_update_index_update()
2024-08-31bcachefs: Mark more errors as autofixKent Overstreet
errors that are known to always be safe to fix should be autofix: this should be most errors even at this point, but that will need some thorough review. note that errors are still logged in the superblock, so we'll still know that they happened. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-31bcachefs: Revert lockless buffered IO pathKent Overstreet
We had a report of data corruption on nixos when building installer images. https://github.com/NixOS/nixpkgs/pull/321055#issuecomment-2184131334 It seems that writes are being dropped, but only when issued by QEMU, and possibly only in snapshot mode. It's undetermined if it's write calls are being dropped or dirty folios. Further testing, via minimizing the original patch to just the change that skips the inode lock on non appends/truncates, reveals that it really is just not taking the inode lock that causes the corruption: it has nothing to do with the other logic changes for preserving write atomicity in corner cases. It's also kernel config dependent: it doesn't reproduce with the minimal kernel config that ktest uses, but it does reproduce with nixos's distro config. Bisection the kernel config initially pointer the finger at page migration or compaction, but it appears that was erroneous; we haven't yet determined what kernel config option actually triggers it. Sadly it appears this will have to be reverted since we're getting too close to release and my plate is full, but we'd _really_ like to fully debug it. My suspicion is that this patch is exposing a preexisting bug - the inode lock actually covers very little in IO paths, and we have a different lock (the pagecache add lock) that guards against races with truncate here. Fixes: 7e64c86cdc6c ("bcachefs: Buffered write path now can avoid the inode lock") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-01Merge branch 'fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull misc fixes from Guenter Roeck. These are fixes for regressions that Guenther has been reporting, and the maintainers haven't picked up and sent in. With rc6 fairly imminent, I'm taking them directly from Guenter. * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: apparmor: fix policy_unpack_test on big endian systems Revert "MIPS: csrc-r4k: Apply verification clocksource flags" microblaze: don't treat zero reserved memory regions as error
2024-09-01Merge tag 'pwrseq-fixes-for-v6.11-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull power sequencing fix from Bartosz Golaszewski: "A follow-up fix for the power sequencing subsystem. It turned out the previous fix for this driver was incomplete and broke the WLAN support on some platforms. This addresses the issue. - set the direction of the wlan-enable GPIO to output after requesting it as-is" * tag 'pwrseq-fixes-for-v6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: power: sequencing: qcom-wcn: set the wlan-enable GPIO to output
2024-08-31power: sequencing: qcom-wcn: set the wlan-enable GPIO to outputBartosz Golaszewski
Commit a9aaf1ff88a8 ("power: sequencing: request the WLAN enable GPIO as-is") broke WLAN on boards on which the wlan-enable GPIO enabling the wifi module isn't in output mode by default. We need to set direction to output while retaining the value that was already set to keep the ath module on if it's already started. Fixes: a9aaf1ff88a8 ("power: sequencing: request the WLAN enable GPIO as-is") Link: https://lore.kernel.org/r/20240823115500.37280-1-brgl@bgdev.pl Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
2024-09-01Merge tag 'usb-6.11-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB fixes for 6.11-rc6. Included in here are: - dwc3 driver fixes for reported issues - MAINTAINER file update, marking a driver as unsupported :( - cdnsp driver fixes - USB gadget driver fix - USB sysfs fix - other tiny fixes - new device ids for usb serial driver All of these have been in linux-next this week with no reported issues" * tag 'usb-6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: serial: option: add MeiG Smart SRM825L usb: cdnsp: fix for Link TRB with TC usb: dwc3: st: add missing depopulate in probe error path usb: dwc3: st: fix probed platform device ref count on probe error path usb: dwc3: ep0: Don't reset resource alloc flag (including ep0) usb: core: sysfs: Unmerge @usb3_hardware_lpm_attr_group in remove_power_attributes() usb: typec: fsa4480: Relax CHIP_ID check usb: dwc3: xilinx: add missing depopulate in probe error path usb: dwc3: omap: add missing depopulate in probe error path dt-bindings: usb: microchip,usb2514: Fix reference USB device schema usb: gadget: uvc: queue pump work in uvcg_video_enable() cdc-acm: Add DISABLE_ECHO quirk for GE HealthCare UI Controller usb: cdnsp: fix incorrect index in cdnsp_get_hw_deq function usb: dwc3: core: Prevent USB core invalid event buffer address access MAINTAINERS: Mark UVC gadget driver as orphan
2024-09-01Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Minor fixes only. The sd.c one ignores a sync cache request if format is in progress which can happen if formatting a drive across suspend/resume" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: sd: Ignore command SYNCHRONIZE CACHE error if format in progress scsi: aacraid: Fix double-free on probe failure scsi: lpfc: Fix overflow build issue
2024-09-01Merge tag 'nfsd-6.11-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fix from Chuck Lever: - One more write delegation fix * tag 'nfsd-6.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: fix nfsd4_deleg_getattr_conflict in presence of third party lease
2024-09-01Merge tag 'xfs-6.11-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fixes from Chandan Babu: - Do not call out v1 inodes with non-zero di_nlink field as being corrupt - Change xfs_finobt_count_blocks() to count "free inode btree" blocks rather than "inode btree" blocks - Don't report the number of trimmed bytes via FITRIM because the underlying storage isn't required to do anything and failed discard IOs aren't reported to the caller anyway - Fix incorrect setting of rm_owner field in an rmap query - Report missing disk offset range in an fsmap query - Obtain m_growlock when extending realtime section of the filesystem - Reset rootdir extent size hint after extending realtime section of the filesystem * tag 'xfs-6.11-fixes-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: reset rootdir extent size hint after growfsrt xfs: take m_growlock when running growfsrt xfs: Fix missing interval for missing_owner in xfs fsmap xfs: use XFS_BUF_DADDR_NULL for daddrs in getfsmap code xfs: Fix the owner setting issue for rmap query in xfs fsmap xfs: don't bother reporting blocks trimmed via FITRIM xfs: xfs_finobt_count_blocks() walks the wrong btree xfs: fix folio dirtying for XFILE_ALLOC callers xfs: fix di_onlink checking for V1/V2 inodes
2024-09-01Merge tag 'arm-fixes-6.11-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC fixes from Arnd Bergmann: "There is a fairly large number of bug fixes for Qualcomm platforms, most of them addressing issues with the devicetree files for the newly added Snapdragon X1 based laptops to make them more reliable. The Qualcomm driver changes address a few build-time issues as well as runtime problems in the tzmem and scm firmware, the USB Type-C driver, and the cmd-db and pmic_glink soc drivers. The NXP i.MX usually gets a bunch of devicetree fixes that is proportional to the number of supported machines. This includes both warning fixes and correctness for the 64-bit i.MX9, i.MX8 and layerscape platforms, as well as a single fix for a 32-bit i.MX6 based board. The other changes are the usual minor changes, including an update to the MAINTAINERS file, an omap3 dts file and a SoC driver for mpfs (risc-v)" * tag 'arm-fixes-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (50 commits) firmware: microchip: fix incorrect error report of programming:timeout on success soc: qcom: pd-mapper: Fix singleton refcount firmware: qcom: tzmem: disable sdm670 platform soc: qcom: pmic_glink: Actually communicate when remote goes down usb: typec: ucsi: Move unregister out of atomic section soc: qcom: pmic_glink: Fix race during initialization firmware: qcom: qseecom: remove unused functions firmware: qcom: tzmem: fix virtual-to-physical address conversion firmware: qcom: scm: Mark get_wq_ctx() as atomic call arm64: dts: qcom: x1e80100: Fix Adreno SMMU global interrupt arm64: dts: qcom: disable GPU on x1e80100 by default arm64: dts: imx8mm-phygate: fix typo pinctrcl-0 arm64: dts: imx95: correct L3Cache cache-sets arm64: dts: imx95: correct a55 power-domains arm64: dts: freescale: imx93-tqma9352-mba93xxla: fix typo arm64: dts: freescale: imx93-tqma9352: fix CMA alloc-ranges ARM: dts: imx6dl-yapp43: Increase LED current to match the yapp4 HW design arm64: dts: imx93: update default value for snps,clk-csr arm64: dts: freescale: tqma9352: Fix watchdog reset arm64: dts: imx8mp-beacon-kit: Fix Stereo Audio on WM8962 ...
2024-08-31Merge tag 'input-for-v6.11-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fix from Dmitry Torokhov: - a fix for Cypress PS/2 touchpad for regression introduced in 6.11 merge window where a timeout condition is incorrectly reported for all extended Cypress commands * tag 'input-for-v6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: cypress_ps2 - fix waiting for command response
2024-08-31Merge tag 'pci-v6.11-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Add Manivannan Sadhasivam as PCI native host bridge and endpoint driver reviewer (Manivannan Sadhasivam) - Disable MHI RAM data parity error interrupt for qcom SA8775P SoC to work around hardware erratum that causes a constant stream of interrupts (Manivannan Sadhasivam) - Don't try to fall back to qcom Operating Performance Points (OPP) support unless the platform actually supports OPP (Manivannan Sadhasivam) - Add imx@lists.linux.dev mailing list to MAINTAINERS for NXP layerscape and imx6 PCI controller drivers (Frank Li) * tag 'pci-v6.11-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: MAINTAINERS: PCI: Add NXP PCI controller mailing list imx@lists.linux.dev PCI: qcom: Use OPP only if the platform supports it PCI: qcom-ep: Disable MHI RAM data parity error interrupt for SA8775P SoC MAINTAINERS: Add Manivannan Sadhasivam as Reviewer for PCI native host bridge and endpoint drivers
2024-08-31Merge tag 'block-6.11-20240830' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fix from Jens Axboe: "Fix for a single regression for WRITE_SAME introduced in the 6.11 merge window" * tag 'block-6.11-20240830' of git://git.kernel.dk/linux: block: fix detection of unsupported WRITE SAME in blkdev_issue_write_zeroes
2024-08-31Merge tag 'io_uring-6.11-20240830' of git://git.kernel.dk/linuxLinus Torvalds
Pull io_uring fixes from Jens Axboe: - A fix for a regression that happened in 6.11 merge window, where the copying of iovecs for compat mode applications got broken for certain cases. - Fix for a bug introduced in 6.10, where if using recv/send bundles with classic provided buffers, the recv/send would fail to set the right iovec count. This caused 0 byte send/recv results. Found via code coverage testing and writing a test case to exercise it. * tag 'io_uring-6.11-20240830' of git://git.kernel.dk/linux: io_uring/kbuf: return correct iovec count from classic buffer peek io_uring/rsrc: ensure compat iovecs are copied correctly
2024-08-30Bluetooth: MGMT: Ignore keys being loaded with invalid typeLuiz Augusto von Dentz
Due to 59b047bc98084f8af2c41483e4d68a5adf2fa7f7 there could be keys stored with the wrong address type so this attempt to detect it and ignore them instead of just failing to load all keys. Cc: stable@vger.kernel.org Link: https://github.com/bluez/bluez/issues/875 Fixes: 59b047bc9808 ("Bluetooth: MGMT/SMP: Fix address type when using SMP over BREDR/LE") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-08-30Revert "Bluetooth: MGMT/SMP: Fix address type when using SMP over BREDR/LE"Luiz Augusto von Dentz
This reverts commit 59b047bc98084f8af2c41483e4d68a5adf2fa7f7 which breaks compatibility with commands like: bluetoothd[46328]: @ MGMT Command: Load.. (0x0013) plen 74 {0x0001} [hci0] Keys: 2 BR/EDR Address: C0:DC:DA:A5:E5:47 (Samsung Electronics Co.,Ltd) Key type: Authenticated key from P-256 (0x03) Central: 0x00 Encryption size: 16 Diversifier[2]: 0000 Randomizer[8]: 0000000000000000 Key[16]: 6ed96089bd9765be2f2c971b0b95f624 LE Address: D7:2A:DE:1E:73:A2 (Static) Key type: Unauthenticated key from P-256 (0x02) Central: 0x00 Encryption size: 16 Diversifier[2]: 0000 Randomizer[8]: 0000000000000000 Key[16]: 87dd2546ededda380ffcdc0a8faa4597 @ MGMT Event: Command Status (0x0002) plen 3 {0x0001} [hci0] Load Long Term Keys (0x0013) Status: Invalid Parameters (0x0d) Cc: stable@vger.kernel.org Link: https://github.com/bluez/bluez/issues/875 Fixes: 59b047bc9808 ("Bluetooth: MGMT/SMP: Fix address type when using SMP over BREDR/LE") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-08-30Bluetooth: MGMT: Fix not generating command complete for MGMT_OP_DISCONNECTLuiz Augusto von Dentz
MGMT_OP_DISCONNECT can be called while mgmt_device_connected has not been called yet, which will cause the connection procedure to be aborted, so mgmt_device_disconnected shall still respond with command complete to MGMT_OP_DISCONNECT and just not emit MGMT_EV_DEVICE_DISCONNECTED since MGMT_EV_DEVICE_CONNECTED was never sent. To fix this MGMT_OP_DISCONNECT is changed to work similarly to other command which do use hci_cmd_sync_queue and then use hci_conn_abort to disconnect and returns the result, in order for hci_conn_abort to be used from hci_cmd_sync context it now uses hci_cmd_sync_run_once. Link: https://github.com/bluez/bluez/issues/932 Fixes: 12d4a3b2ccb3 ("Bluetooth: Move check for MGMT_CONNECTED flag into mgmt.c") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-08-30Bluetooth: hci_sync: Introduce hci_cmd_sync_run/hci_cmd_sync_run_onceLuiz Augusto von Dentz
This introduces hci_cmd_sync_run/hci_cmd_sync_run_once which acts like hci_cmd_sync_queue/hci_cmd_sync_queue_once but runs immediately when already on hdev->cmd_sync_work context. Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-08-30Bluetooth: qca: If memdump doesn't work, re-enable IBSDouglas Anderson
On systems in the field, we are seeing this sometimes in the kernel logs: Bluetooth: qca_controller_memdump() hci0: hci_devcd_init Return:-95 This means that _something_ decided that it wanted to get a memdump but then hci_devcd_init() returned -EOPNOTSUPP (AKA -95). The cleanup code in qca_controller_memdump() when we get back an error from hci_devcd_init() undoes most things but forgets to clear QCA_IBS_DISABLED. One side effect of this is that, during the next suspend, qca_suspend() will always get a timeout. Let's fix it so that we clear the bit. Fixes: 06d3fdfcdf5c ("Bluetooth: hci_qca: Add qcom devcoredump support") Reviewed-by: Guenter Roeck <groeck@chromium.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-08-30can: kvaser_pciefd: Use a single write when releasing RX buffersMartin Jocic
Kvaser's PCIe cards uses the KCAN FPGA IP block which has dual 4K buffers for incoming messages shared by all (currently up to eight) channels. While the driver processes messages in one buffer, new incoming messages are stored in the other and so on. The design of KCAN is such that a buffer must be fully read and then released. Releasing a buffer will make the FPGA switch buffers. If the other buffer contains at least one incoming message the FPGA will also instantly issue a new interrupt, if not the interrupt will be issued after receiving the first new message. With IRQx interrupts, it takes a little time for the interrupt to happen, enough for any previous ISR call to do it's business and return, but MSI interrupts are way faster so this time is reduced to almost nothing. So with MSI, releasing the buffer HAS to be the very last action of the ISR before returning, otherwise the new interrupt might be "masked" by the kernel because the previous ISR call hasn't returned. And the interrupts are edge-triggered so we cannot loose one, or the ping-pong reading process will stop. This is why this patch modifies the driver to use a single write to the SRB_CMD register before returning. Signed-off-by: Martin Jocic <martin.jocic@kvaser.com> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://patch.msgid.link/20240830153113.2081440-1-martin.jocic@kvaser.com Fixes: 26ad340e582d ("can: kvaser_pciefd: Add driver for Kvaser PCIEcan devices") Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-08-30Merge tag 'at91-fixes-6.11' of ↵Arnd Bergmann
https://git.kernel.org/pub/scm/linux/kernel/git/at91/linux into arm/fixes Microchip AT91 fixes for v6.11 It contains: - DTS directory update to match all entries not only those starting with at91 or sama
2024-08-31Merge tag 'lsm-pr-20240830' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm fix from Paul Moore: "One small patch to correct a NFS permissions problem with SELinux and Smack" * tag 'lsm-pr-20240830' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: selinux,smack: don't bypass permissions check in inode_setsecctx hook
2024-08-31Merge tag 'pm-6.11-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fixes from Rafael Wysocki: "These fix three issues in the amd-pstate cpufreq driver. Specifics: - Remove checks for highest performance match on preferred cores when updating preferred core ranking in amd-pstate (Mario Limonciello) - Make amd-pstate call topology_logical_package_id() instead of logical_die_id() to get a socked ID for a CPU (Gautham Shenoy) - Fix uninitialized variable in amd_pstate_cpu_boost_update() (Dan Carpenter)" * tag 'pm-6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq/amd-pstate-ut: Don't check for highest perf matching on prefcore cpufreq/amd-pstate: Use topology_logical_package_id() instead of logical_die_id() cpufreq: amd-pstate: Fix uninitialized variable in amd_pstate_cpu_boost_update()
2024-08-31Merge tag 'dmaengine-fix-6.11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine Pull dmaengine fixes from Vinod Koul: - A bunch of dw driver changes to fix the src/dst addr width config - Omap driver fix for sglen initialization - stm32-dma3 driver lli_size init fix - dw edma driver fixes for watermark interrupts and unmasking STOP and ABORT interrupts * tag 'dmaengine-fix-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: dmaengine: dw-edma: Do not enable watermark interrupts for HDMA dmaengine: dw-edma: Fix unmasking STOP and ABORT interrupts for HDMA dmaengine: stm32-dma3: Set lli_size after allocation dmaengine: ti: omap-dma: Initialize sglen after allocation dmaengine: dw: Unify ret-val local variables naming dmaengine: dw: Simplify max-burst calculation procedure dmaengine: dw: Define encode_maxburst() above prepare_ctllo() callbacks dmaengine: dw: Simplify prepare CTL_LO methods dmaengine: dw: Add memory bus width verification dmaengine: dw: Add peripheral bus width verification
2024-08-31Merge tag 'phy-fixes-6.11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy fixes from Vinod Koul: - Qualcomm QMP X1E80100 PCIe Gen4 PHY initialisation fix - Freescale imx8mq tuning parameter name fix - Samsung exynos5 fir for error code in probe() - Xilinx Zynqmp SGMII linkup failure fix * tag 'phy-fixes-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: phy: xilinx: phy-zynqmp: Fix SGMII linkup failure on resume phy: exynos5-usbdrd: fix error code in probe() phy: fsl-imx8mq-usb: fix tuning parameter name phy: qcom: qmp-pcie: Fix X1E80100 PCIe Gen4 PHY initialisation
2024-08-31Merge tag 'soundwire-6.11-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire Pull soundwire fix from Vinod Koul: - Single fix for non-continous port map programming * tag 'soundwire-6.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: soundwire: stream: fix programming slave ports for non-continous port maps
2024-08-31Merge tag 'iommu-fixes-v6.11-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu fixes from Joerg Roedel: - Fix a device-stall problem in bad io-page-fault setups (faults received from devices with no supporting domain attached). - Context flush fix for Intel VT-d. - Do not allow non-read+non-write mapping through iommufd as most implementations can not handle that. - Fix a possible infinite-loop issue in map_pages() path. - Add Jean-Philippe as reviewer for SMMUv3 SVA support * tag 'iommu-fixes-v6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: MAINTAINERS: Add Jean-Philippe as SMMUv3 SVA reviewer iommu: Do not return 0 from map_pages if it doesn't do anything iommufd: Do not allow creating areas without READ or WRITE iommu/vt-d: Fix incorrect domain ID in context flush helper iommu: Handle iommu faults for a bad iopf setup
2024-08-30tcp_bpf: fix return value of tcp_bpf_sendmsg()Cong Wang
When we cork messages in psock->cork, the last message triggers the flushing will result in sending a sk_msg larger than the current message size. In this case, in tcp_bpf_send_verdict(), 'copied' becomes negative at least in the following case: 468 case __SK_DROP: 469 default: 470 sk_msg_free_partial(sk, msg, tosend); 471 sk_msg_apply_bytes(psock, tosend); 472 *copied -= (tosend + delta); // <==== HERE 473 return -EACCES; Therefore, it could lead to the following BUG with a proper value of 'copied' (thanks to syzbot). We should not use negative 'copied' as a return value here. ------------[ cut here ]------------ kernel BUG at net/socket.c:733! Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP Modules linked in: CPU: 0 UID: 0 PID: 3265 Comm: syz-executor510 Not tainted 6.11.0-rc3-syzkaller-00060-gd07b43284ab3 #0 Hardware name: linux,dummy-virt (DT) pstate: 61400009 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) pc : sock_sendmsg_nosec net/socket.c:733 [inline] pc : sock_sendmsg_nosec net/socket.c:728 [inline] pc : __sock_sendmsg+0x5c/0x60 net/socket.c:745 lr : sock_sendmsg_nosec net/socket.c:730 [inline] lr : __sock_sendmsg+0x54/0x60 net/socket.c:745 sp : ffff800088ea3b30 x29: ffff800088ea3b30 x28: fbf00000062bc900 x27: 0000000000000000 x26: ffff800088ea3bc0 x25: ffff800088ea3bc0 x24: 0000000000000000 x23: f9f00000048dc000 x22: 0000000000000000 x21: ffff800088ea3d90 x20: f9f00000048dc000 x19: ffff800088ea3d90 x18: 0000000000000001 x17: 0000000000000000 x16: 0000000000000000 x15: 000000002002ffaf x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: ffff8000815849c0 x9 : ffff8000815b49c0 x8 : 0000000000000000 x7 : 000000000000003f x6 : 0000000000000000 x5 : 00000000000007e0 x4 : fff07ffffd239000 x3 : fbf00000062bc900 x2 : 0000000000000000 x1 : 0000000000000000 x0 : 00000000fffffdef Call trace: sock_sendmsg_nosec net/socket.c:733 [inline] __sock_sendmsg+0x5c/0x60 net/socket.c:745 ____sys_sendmsg+0x274/0x2ac net/socket.c:2597 ___sys_sendmsg+0xac/0x100 net/socket.c:2651 __sys_sendmsg+0x84/0xe0 net/socket.c:2680 __do_sys_sendmsg net/socket.c:2689 [inline] __se_sys_sendmsg net/socket.c:2687 [inline] __arm64_sys_sendmsg+0x24/0x30 net/socket.c:2687 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall+0x48/0x110 arch/arm64/kernel/syscall.c:49 el0_svc_common.constprop.0+0x40/0xe0 arch/arm64/kernel/syscall.c:132 do_el0_svc+0x1c/0x28 arch/arm64/kernel/syscall.c:151 el0_svc+0x34/0xec arch/arm64/kernel/entry-common.c:712 el0t_64_sync_handler+0x100/0x12c arch/arm64/kernel/entry-common.c:730 el0t_64_sync+0x19c/0x1a0 arch/arm64/kernel/entry.S:598 Code: f9404463 d63f0060 3108441f 54fffe81 (d4210000) ---[ end trace 0000000000000000 ]--- Fixes: 4f738adba30a ("bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data") Reported-by: syzbot+58c03971700330ce14d8@syzkaller.appspotmail.com Cc: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Reviewed-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20240821030744.320934-1-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-30MAINTAINERS: PCI: Add NXP PCI controller mailing list imx@lists.linux.devFrank Li
Add imx mailing list imx@lists.linux.dev for PCI controller of NXP chips (Layerscape and iMX). Link: https://lore.kernel.org/r/20240826202740.970015-1-Frank.Li@nxp.com Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Richard Zhu <hongxing.zhu@nxp.com>
2024-08-30io_uring/kbuf: return correct iovec count from classic buffer peekJens Axboe
io_provided_buffers_select() returns 0 to indicate success, but it should be returning 1 to indicate that 1 vec was mapped. This causes peeking to fail with classic provided buffers, and while that's not a use case that anyone should use, it should still work correctly. The end result is that no buffer will be selected, and hence a completion with '0' as the result will be posted, without a buffer attached. Fixes: 35c8711c8fc4 ("io_uring/kbuf: add helpers for getting/peeking multiple buffers") Signed-off-by: Jens Axboe <axboe@kernel.dk>