summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-12-18x86/fpu: Move CPUID leaf definitions to common codeDave Hansen
Move the XSAVE-related CPUID leaf definitions to common code. Then, use the new definition to remove the last magic number from the CPUID level dependency table. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/all/20241213205037.43C57CDE%40davehans-spike.ostc.intel.com
2024-12-18x86/tsc: Remove CPUID "frequency" leaf magic numbers.Dave Hansen
All the code that reads the CPUID frequency information leaf hard-codes a magic number. Give it a symbolic name and use it. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/all/20241213205036.4397658F%40davehans-spike.ostc.intel.com
2024-12-18x86/tsc: Move away from TSC leaf magic numbersDave Hansen
The TSC code has a bunch of hard-coded references to leaf 0x15. Change them over to the symbolic name. Also zap the 'ART_CPUID_LEAF' definition. It was a duplicate of 'CPUID_TSC_LEAF'. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20241213205034.B79D6224%40davehans-spike.ostc.intel.com
2024-12-18x86/cpu: Move TSC CPUID leaf definitionDave Hansen
Prepare to use the TSC CPUID leaf definition more widely by moving it to the common header. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/all/20241213205033.68799E53%40davehans-spike.ostc.intel.com
2024-12-18x86/cpu: Refresh DCA leaf reading codeDave Hansen
The DCA leaf number is also hard-coded in the CPUID level dependency table. Move its definition to common code and use it. While at it, fix up the naming and types in the probe code. All CPUID data is provided in 32-bit registers, not 'unsigned long'. Also stop referring to "level_9". Move away from test_bit() because the type is no longer an 'unsigned long'. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/all/20241213205032.476A30FE%40davehans-spike.ostc.intel.com
2024-12-18ASoC: rt722: add delay time to wait for the calibration procedureShuming Fan
The calibration procedure needs some time to finish. This patch adds the delay time to ensure the calibration procedure is completed correctly. Signed-off-by: Shuming Fan <shumingf@realtek.com> Link: https://patch.msgid.link/20241218091307.96656-1-shumingf@realtek.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-12-18x86/cpu: Remove unnecessary MwAIT leaf checksDave Hansen
The CPUID leaf dependency checker will remove X86_FEATURE_MWAIT if the CPUID level is below the required level (CPUID_MWAIT_LEAF). Thus, if you check X86_FEATURE_MWAIT you do not need to also check the CPUID level. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20241213205030.9B42B458%40davehans-spike.ostc.intel.com
2024-12-18x86/cpu: Use MWAIT leaf definitionDave Hansen
The leaf-to-feature dependency array uses hard-coded leaf numbers. Use the new common header definition for the MWAIT leaf. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/all/20241213205029.5B055D6E%40davehans-spike.ostc.intel.com
2024-12-18x86/cpu: Move MWAIT leaf definition to common headerDave Hansen
Begin constructing a common place to keep all CPUID leaf definitions. Move CPUID_MWAIT_LEAF to the CPUID header and include it where needed. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Link: https://lore.kernel.org/all/20241213205028.EE94D02A%40davehans-spike.ostc.intel.com
2024-12-18x86/cpu: Remove 'x86_cpu_desc' infrastructureDave Hansen
All the users of 'x86_cpu_desc' are gone. Zap it from the tree. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20241213185133.AF0BF2BC%40davehans-spike.ostc.intel.com
2024-12-18x86/cpu: Move AMD erratum 1386 table over to 'x86_cpu_id'Dave Hansen
The AMD erratum 1386 detection code uses and old style 'x86_cpu_desc' table. Replace it with 'x86_cpu_id' so the old style can be removed. I did not create a new helper macro here. The new table is certainly more noisy than the old and it can be improved on. But I was hesitant to create a new macro just for a single site that is only two ugly lines in the end. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20241213185132.07555E1D%40davehans-spike.ostc.intel.com
2024-12-18thermal/thresholds: Fix boundaries and detection routineDaniel Lezcano
The current implementation does not work if the thermal zone is interrupt driven only. The boundaries are not correctly checked and computed as it happens only when the temperature is increasing or decreasing. The problem arises because the routine to detect when we cross a threshold is correlated with the computation of the boundaries. We assume we have to recompute the boundaries when a threshold is crossed but actually we should do that even if the it is not the case. Mixing the boundaries computation and the threshold detection for the sake of optimizing the routine is much more complex as it appears intuitively and prone to errors. This fix separates the boundaries computation and the threshold crossing detection into different routines. The result is a code much more simple to understand, thus easier to maintain. The drawback is we browse the thresholds list several time but we can consider that as neglictible because that happens when the temperature is updated. There are certainly some aeras to improve in the temperature update routine but it would be not adequate as this change aims to fix the thresholds for v6.13. Fixes: 445936f9e258 ("thermal: core: Add user thresholds support") Tested-by: Daniel Lezcano <daniel.lezcano@linaro.org> # rock5b, Lenovo x13s Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://patch.msgid.link/20241216212644.1145122-1-daniel.lezcano@linaro.org Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-12-18pwm: stm32: Fix complementary output in round_waveform_tohw()Fabrice Gasnier
When the timer supports complementary output, the CCxNE bit must be set additionally to the CCxE bit. So to not overwrite the latter use |= instead of = to set the former. Fixes: deaba9cff809 ("pwm: stm32: Implementation of the waveform callbacks") Signed-off-by: Fabrice Gasnier <fabrice.gasnier@foss.st.com> Link: https://lore.kernel.org/r/20241217150021.2030213-1-fabrice.gasnier@foss.st.com [ukleinek: Slightly improve commit log] Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
2024-12-18powerpc/pseries/vas: Add close() callback in vas_vm_ops structHaren Myneni
The mapping VMA address is saved in VAS window struct when the paste address is mapped. This VMA address is used during migration to unmap the paste address if the window is active. The paste address mapping will be removed when the window is closed or with the munmap(). But the VMA address in the VAS window is not updated with munmap() which is causing invalid access during migration. The KASAN report shows: [16386.254991] BUG: KASAN: slab-use-after-free in reconfig_close_windows+0x1a0/0x4e8 [16386.255043] Read of size 8 at addr c00000014a819670 by task drmgr/696928 [16386.255096] CPU: 29 UID: 0 PID: 696928 Comm: drmgr Kdump: loaded Tainted: G B 6.11.0-rc5-nxgzip #2 [16386.255128] Tainted: [B]=BAD_PAGE [16386.255148] Hardware name: IBM,9080-HEX Power11 (architected) 0x820200 0xf000007 of:IBM,FW1110.00 (NH1110_016) hv:phyp pSeries [16386.255181] Call Trace: [16386.255202] [c00000016b297660] [c0000000018ad0ac] dump_stack_lvl+0x84/0xe8 (unreliable) [16386.255246] [c00000016b297690] [c0000000006e8a90] print_report+0x19c/0x764 [16386.255285] [c00000016b297760] [c0000000006e9490] kasan_report+0x128/0x1f8 [16386.255309] [c00000016b297880] [c0000000006eb5c8] __asan_load8+0xac/0xe0 [16386.255326] [c00000016b2978a0] [c00000000013f898] reconfig_close_windows+0x1a0/0x4e8 [16386.255343] [c00000016b297990] [c000000000140e58] vas_migration_handler+0x3a4/0x3fc [16386.255368] [c00000016b297a90] [c000000000128848] pseries_migrate_partition+0x4c/0x4c4 ... [16386.256136] Allocated by task 696554 on cpu 31 at 16377.277618s: [16386.256149] kasan_save_stack+0x34/0x68 [16386.256163] kasan_save_track+0x34/0x80 [16386.256175] kasan_save_alloc_info+0x58/0x74 [16386.256196] __kasan_slab_alloc+0xb8/0xdc [16386.256209] kmem_cache_alloc_noprof+0x200/0x3d0 [16386.256225] vm_area_alloc+0x44/0x150 [16386.256245] mmap_region+0x214/0x10c4 [16386.256265] do_mmap+0x5fc/0x750 [16386.256277] vm_mmap_pgoff+0x14c/0x24c [16386.256292] ksys_mmap_pgoff+0x20c/0x348 [16386.256303] sys_mmap+0xd0/0x160 ... [16386.256350] Freed by task 0 on cpu 31 at 16386.204848s: [16386.256363] kasan_save_stack+0x34/0x68 [16386.256374] kasan_save_track+0x34/0x80 [16386.256384] kasan_save_free_info+0x64/0x10c [16386.256396] __kasan_slab_free+0x120/0x204 [16386.256415] kmem_cache_free+0x128/0x450 [16386.256428] vm_area_free_rcu_cb+0xa8/0xd8 [16386.256441] rcu_do_batch+0x2c8/0xcf0 [16386.256458] rcu_core+0x378/0x3c4 [16386.256473] handle_softirqs+0x20c/0x60c [16386.256495] do_softirq_own_stack+0x6c/0x88 [16386.256509] do_softirq_own_stack+0x58/0x88 [16386.256521] __irq_exit_rcu+0x1a4/0x20c [16386.256533] irq_exit+0x20/0x38 [16386.256544] interrupt_async_exit_prepare.constprop.0+0x18/0x2c ... [16386.256717] Last potentially related work creation: [16386.256729] kasan_save_stack+0x34/0x68 [16386.256741] __kasan_record_aux_stack+0xcc/0x12c [16386.256753] __call_rcu_common.constprop.0+0x94/0xd04 [16386.256766] vm_area_free+0x28/0x3c [16386.256778] remove_vma+0xf4/0x114 [16386.256797] do_vmi_align_munmap.constprop.0+0x684/0x870 [16386.256811] __vm_munmap+0xe0/0x1f8 [16386.256821] sys_munmap+0x54/0x6c [16386.256830] system_call_exception+0x1a0/0x4a0 [16386.256841] system_call_vectored_common+0x15c/0x2ec [16386.256868] The buggy address belongs to the object at c00000014a819670 which belongs to the cache vm_area_struct of size 168 [16386.256887] The buggy address is located 0 bytes inside of freed 168-byte region [c00000014a819670, c00000014a819718) [16386.256915] The buggy address belongs to the physical page: [16386.256928] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x14a81 [16386.256950] memcg:c0000000ba430001 [16386.256961] anon flags: 0x43ffff800000000(node=4|zone=0|lastcpupid=0x7ffff) [16386.256975] page_type: 0xfdffffff(slab) [16386.256990] raw: 043ffff800000000 c00000000501c080 0000000000000000 5deadbee00000001 [16386.257003] raw: 0000000000000000 00000000011a011a 00000001fdffffff c0000000ba430001 [16386.257018] page dumped because: kasan: bad access detected This patch adds close() callback in vas_vm_ops vm_operations_struct which will be executed during munmap() before freeing VMA. The VMA address in the VAS window is set to NULL after holding the window mmap_mutex. Fixes: 37e6764895ef ("powerpc/pseries/vas: Add VAS migration handler") Signed-off-by: Haren Myneni <haren@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20241214051758.997759-1-haren@linux.ibm.com
2024-12-18Merge patch series "can: m_can: set init flag earlier in probe"Marc Kleine-Budde
This series fixes problems in the m_can_pci driver found on the Intel Elkhart Lake processor. Link: https://patch.msgid.link/e247f331cb72829fcbdfda74f31a59cbad1a6006.1728288535.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-12-18can: m_can: fix missed interrupts with m_can_pciMatthias Schiffer
The interrupt line of PCI devices is interpreted as edge-triggered, however the interrupt signal of the m_can controller integrated in Intel Elkhart Lake CPUs appears to be generated level-triggered. Consider the following sequence of events: - IR register is read, interrupt X is set - A new interrupt Y is triggered in the m_can controller - IR register is written to acknowledge interrupt X. Y remains set in IR As at no point in this sequence no interrupt flag is set in IR, the m_can interrupt line will never become deasserted, and no edge will ever be observed to trigger another run of the ISR. This was observed to result in the TX queue of the EHL m_can to get stuck under high load, because frames were queued to the hardware in m_can_start_xmit(), but m_can_finish_tx() was never run to account for their successful transmission. On an Elkhart Lake based board with the two CAN interfaces connected to each other, the following script can reproduce the issue: ip link set can0 up type can bitrate 1000000 ip link set can1 up type can bitrate 1000000 cangen can0 -g 2 -I 000 -L 8 & cangen can0 -g 2 -I 001 -L 8 & cangen can0 -g 2 -I 002 -L 8 & cangen can0 -g 2 -I 003 -L 8 & cangen can0 -g 2 -I 004 -L 8 & cangen can0 -g 2 -I 005 -L 8 & cangen can0 -g 2 -I 006 -L 8 & cangen can0 -g 2 -I 007 -L 8 & cangen can1 -g 2 -I 100 -L 8 & cangen can1 -g 2 -I 101 -L 8 & cangen can1 -g 2 -I 102 -L 8 & cangen can1 -g 2 -I 103 -L 8 & cangen can1 -g 2 -I 104 -L 8 & cangen can1 -g 2 -I 105 -L 8 & cangen can1 -g 2 -I 106 -L 8 & cangen can1 -g 2 -I 107 -L 8 & stress-ng --matrix 0 & To fix the issue, repeatedly read and acknowledge interrupts at the start of the ISR until no interrupt flags are set, so the next incoming interrupt will also result in an edge on the interrupt line. While we have received a report that even with this patch, the TX queue can become stuck under certain (currently unknown) circumstances on the Elkhart Lake, this patch completely fixes the issue with the above reproducer, and it is unclear whether the remaining issue has a similar cause at all. Fixes: cab7ffc0324f ("can: m_can: add PCI glue driver for Intel Elkhart Lake") Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com> Link: https://patch.msgid.link/fdf0439c51bcb3a46c21e9fb21c7f1d06363be84.1728288535.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-12-18can: m_can: set init flag earlier in probeMatthias Schiffer
While an m_can controller usually already has the init flag from a hardware reset, no such reset happens on the integrated m_can_pci of the Intel Elkhart Lake. If the CAN controller is found in an active state, m_can_dev_setup() would fail because m_can_niso_supported() calls m_can_cccr_update_bits(), which refuses to modify any other configuration bits when CCCR_INIT is not set. To avoid this issue, set CCCR_INIT before attempting to modify any other configuration flags. Fixes: cd5a46ce6fa6 ("can: m_can: don't enable transceiver when probing") Signed-off-by: Matthias Schiffer <matthias.schiffer@ew.tq-group.com> Reviewed-by: Markus Schneider-Pargmann <msp@baylibre.com> Link: https://patch.msgid.link/e247f331cb72829fcbdfda74f31a59cbad1a6006.1728288535.git.matthias.schiffer@ew.tq-group.com Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-12-18powerpc/book3s64/hugetlb: Fix disabling hugetlb when fadump is activeSourabh Jain
Commit 8597538712eb ("powerpc/fadump: Do not use hugepages when fadump is active") disabled hugetlb support when fadump is active by returning early from hugetlbpage_init():arch/powerpc/mm/hugetlbpage.c and not populating hpage_shift/HPAGE_SHIFT. Later, commit 2354ad252b66 ("powerpc/mm: Update default hugetlb size early") moved the allocation of hpage_shift/HPAGE_SHIFT to early boot, which inadvertently re-enabled hugetlb support when fadump is active. Fix this by implementing hugepages_supported() on powerpc. This ensures that disabling hugetlb for the fadump kernel is independent of hpage_shift/HPAGE_SHIFT. Fixes: 2354ad252b66 ("powerpc/mm: Update default hugetlb size early") Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20241217074640.1064510-1-sourabhjain@linux.ibm.com
2024-12-18powerpc/vdso: Mark the vDSO code read-only after initChristophe Leroy
VDSO text is fixed-up during init so it can't be const, but it can be read-only after init. Do the same as x86 in commit 018ef8dcf3de ("x86/vdso: Mark the vDSO code read-only after init") and arm in commit 11bf9b865898 ("ARM/vdso: Mark the vDSO code read-only after init"), move it into ro_after_init section. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/e9892d288b646cbdfeef0b2b73edbaf6d3c6cabe.1734174500.git.christophe.leroy@csgroup.eu
2024-12-18powerpc/64: Use get_user() in start_thread()Michael Ellerman
For ELFv1 binaries (big endian), the ELF entry point isn't the address of the first instruction, instead it points to the function descriptor for the entry point. The address of the first instruction is in the function descriptor. That means the kernel has to fetch the address of the first instruction from user memory. Because start_thread() uses __get_user(), which has no access_ok() checks, it looks like a malicious ELF binary could be crafted to point the entry point address at kernel memory. The kernel would load 8 bytes from kernel memory into the NIP and then start the process, it would typically crash, but a debugger could observe the NIP value which would be the result of reading from kernel memory. However that's NOT possible, because there is a check in load_elf_binary() that ensures the ELF entry point is < TASK_SIZE (look for BAD_ADDR(elf_entry)). However it's fragile for start_thread() to rely on a check elsewhere, even if the ELF parser is unlikely to ever drop the check that elf_entry is a user address. Make it more robust by using get_user(), which checks that the address points at userspace before doing the load. If the address doesn't point at userspace it will just set the result to zero, and the userspace program will crash at zero (which is fine because it's self-inflicted). Note that it's also possible for a malicious binary to have a valid ELF entry address, but with the first instruction address pointing into the kernel. However that's OK, because it is blocked by the MMU, just like any other attempt to jump into the kernel from userspace. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20241216121706.26790-1-mpe@ellerman.id.au
2024-12-18macintosh: declare ctl_table as constLuis Felipe Hernandez
Since commit 7abc9b53bd51 ("sysctl: allow registration of const struct ctl_table"), the sysctl registration API allows struct ctl_table variables to be placed into read-only memory. mac_hid_files is registered as a sysctl table and should be treated as read-only. By declaring the mac_hid_files structure as const, we ensure that it cannot be accidentally modified. This change improves safety. Suggested-by: Thomas Weißschuh <linux@weissschuh.net> Suggested-by: Ricardo B. Marliere <rbm@suse.com> Reviewed-by: Ricardo B. Marliere <rbm@suse.com> Signed-off-by: Luis Felipe Hernandez <luis.hernandez093@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20241217-constify_ctl_table-v1-1-402ebceaeb8e@gmail.com
2024-12-18selftest/powerpc/ptrace: Cleanup duplicate macro definitionsMadhavan Srinivasan
Both core-pkey.c and ptrace-pkey.c tests have similar macro definitions, move them to "pkeys.h" and remove the macro definitions from the C file. Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20241216160257.87252-3-maddy@linux.ibm.com
2024-12-18selftest/powerpc/ptrace/ptrace-pkey: Remove duplicate macrosMadhavan Srinivasan
./powerpc/ptrace/Makefile includes flags.mk. In flags.mk, -I$(selfdir)/powerpc/include is always included as part of CFLAGS. So it will pick up the "pkeys.h" defined in powerpc/include. ptrace-pkey.c test has macros defined which are part of "pkeys.h" header file. Remove those duplicates and include "pkeys.h" Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20241216160257.87252-2-maddy@linux.ibm.com
2024-12-18selftest/powerpc/ptrace/core-pkey: Remove duplicate macrosMadhavan Srinivasan
./powerpc/ptrace/Makefile includes flags.mk. In flags.mk, -I$(selfdir)/powerpc/include is always included as part of CFLAGS. So it will pick up the "pkeys.h" defined in powerpc/include. core-pkey.c test has couple of macros defined which are part of "pkeys.h" header file. Remove those duplicates and include "pkeys.h" Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20241216160257.87252-1-maddy@linux.ibm.com
2024-12-17rtnetlink: Try the outer netns attribute in rtnl_get_peer_net().Kuniyuki Iwashima
Xiao Liang reported that the cited commit changed netns handling in newlink() of netkit, veth, and vxcan. Before the patch, if we don't find a netns attribute in the peer device attributes, we tried to find another netns attribute in the outer netlink attributes by passing it to rtnl_link_get_net(). Let's restore the original behaviour. Fixes: 48327566769a ("rtnetlink: fix double call of rtnl_link_get_net_ifla()") Reported-by: Xiao Liang <shaw.leon@gmail.com> Closes: https://lore.kernel.org/netdev/CABAhCORBVVU8P6AHcEkENMj+gD2d3ce9t=A_o48E0yOQp8_wUQ@mail.gmail.com/#t Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Tested-by: Xiao Liang <shaw.leon@gmail.com> Link: https://patch.msgid.link/20241216110432.51488-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17net: netdevsim: fix nsim_pp_hold_write()Eric Dumazet
nsim_pp_hold_write() has two problems: 1) It may return with rtnl held, as found by syzbot. 2) Its return value does not propagate an error if any. Fixes: 1580cbcbfe77 ("net: netdevsim: add some fake page pool use") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241216083703.1859921-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-17x86/cpu: Replace PEBS use of 'x86_cpu_desc' use with 'x86_cpu_id'Dave Hansen
The 'x86_cpu_desc' and 'x86_cpu_id' structures are very similar. Reduce duplicate infrastructure by moving the few users of 'x86_cpu_desc' to the much more common variant. The existing X86_MATCH_VFM_STEPS() helper matches ranges of steppings. Instead of introducing a single-stepping match function which could get confusing when paired with the range, just use the stepping min/max match helper and use min==max. Note that this makes the table more vertically compact because multiple entries like this: INTEL_CPU_DESC(INTEL_SKYLAKE_X, 4, 0x00000000), INTEL_CPU_DESC(INTEL_SKYLAKE_X, 5, 0x00000000), INTEL_CPU_DESC(INTEL_SKYLAKE_X, 6, 0x00000000), INTEL_CPU_DESC(INTEL_SKYLAKE_X, 7, 0x00000000), can be consolidated down to a single stepping range. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20241213185131.8B610039%40davehans-spike.ostc.intel.com
2024-12-17x86/cpu: Expose only stepping min/max interfaceDave Hansen
The x86_match_cpu() infrastructure can match CPU steppings. Since there are only 16 possible steppings, the matching infrastructure goes all out and stores the stepping match as a bitmap. That means it can match any possible steppings in a single list entry. Fun. But it exposes this bitmap to each of the X86_MATCH_*() helpers when none of them really need a bitmap. It makes up for this by exporting a helper (X86_STEPPINGS()) which converts a contiguous stepping range into the bitmap which every single user leverages. Instead of a bitmap, have the main helper for this sort of thing (X86_MATCH_VFM_STEPS()) just take a stepping range. This ends up actually being even more compact than before. Leave the helper in place (renamed to __X86_STEPPINGS()) to make it more clear what is going on instead of just having a random GENMASK() in the middle of an already complicated macro. One oddity that I hit was this macro: X86_MATCH_VFM_STEPS(vfm, X86_STEPPING_MIN, max_stepping, issues) It *could* have been converted over to take a min/max stepping value for each entry. But that would have been a bit too verbose and would prevent the one oddball in the list (INTEL_COMETLAKE_L stepping 0) from sticking out. Instead, just have it take a *maximum* stepping and imply that the match is from 0=>max_stepping. This is functional for all the cases now and also retains the nice property of having INTEL_COMETLAKE_L stepping 0 stick out like a sore thumb. skx_cpuids[] is goofy. It uses the stepping match but encodes all possible steppings. Just use a normal, non-stepping match helper. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20241213185129.65527B2A%40davehans-spike.ostc.intel.com
2024-12-17x86/cpu: Introduce new microcode matching helperDave Hansen
The 'x86_cpu_id' and 'x86_cpu_desc' structures are very similar and need to be consolidated. There is a microcode version matching function for 'x86_cpu_desc' but not 'x86_cpu_id'. Create one for 'x86_cpu_id'. This essentially just leverages the x86_cpu_id->driver_data field to replace the less generic x86_cpu_desc->x86_microcode_rev field. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/all/20241213185128.8F24EEFC%40davehans-spike.ostc.intel.com
2024-12-17bpf: Fix bpf_get_smp_processor_id() on !CONFIG_SMPAndrea Righi
On x86-64 calling bpf_get_smp_processor_id() in a kernel with CONFIG_SMP disabled can trigger the following bug, as pcpu_hot is unavailable: [ 8.471774] BUG: unable to handle page fault for address: 00000000936a290c [ 8.471849] #PF: supervisor read access in kernel mode [ 8.471881] #PF: error_code(0x0000) - not-present page Fix by inlining a return 0 in the !CONFIG_SMP case. Fixes: 1ae6921009e5 ("bpf: inline bpf_get_smp_processor_id() helper") Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241217195813.622568-1-arighi@nvidia.com
2024-12-17riscv: selftests: Fix warnings pointer masking testCharlie Jenkins
When compiling the pointer masking tests with -Wall this warning is present: pointer_masking.c: In function ‘test_tagged_addr_abi_sysctl’: pointer_masking.c:203:9: warning: ignoring return value of ‘pwrite’ declared with attribute ‘warn_unused_result’ [-Wunused-result] 203 | pwrite(fd, &value, 1, 0); | ^~~~~~~~~~~~~~~~~~~~~~~~ pointer_masking.c:208:9: warning: ignoring return value of ‘pwrite’ declared with attribute ‘warn_unused_result’ [-Wunused-result] 208 | pwrite(fd, &value, 1, 0); I came across this on riscv64-linux-gnu-gcc (Ubuntu 11.4.0-1ubuntu1~22.04). Fix this by checking that the number of bytes written equal the expected number of bytes written. Fixes: 7470b5afd150 ("riscv: selftests: Add a pointer masking test") Signed-off-by: Charlie Jenkins <charlie@rivosinc.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20241211-fix_warnings_pointer_masking_tests-v6-1-c7ae708fbd2f@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-12-17hexagon: Disable constant extender optimization for LLVM prior to 19.1.0Nathan Chancellor
The Hexagon-specific constant extender optimization in LLVM may crash on Linux kernel code [1], such as fs/bcache/btree_io.c after commit 32ed4a620c54 ("bcachefs: Btree path tracepoints") in 6.12: clang: llvm/lib/Target/Hexagon/HexagonConstExtenders.cpp:745: bool (anonymous namespace)::HexagonConstExtenders::ExtRoot::operator<(const HCE::ExtRoot &) const: Assertion `ThisB->getParent() == OtherB->getParent()' failed. Stack dump: 0. Program arguments: clang --target=hexagon-linux-musl ... fs/bcachefs/btree_io.c 1. <eof> parser at end of file 2. Code generation 3. Running pass 'Function Pass Manager' on module 'fs/bcachefs/btree_io.c'. 4. Running pass 'Hexagon constant-extender optimization' on function '@__btree_node_lock_nopath' Without assertions enabled, there is just a hang during compilation. This has been resolved in LLVM main (20.0.0) [2] and backported to LLVM 19.1.0 but the kernel supports LLVM 13.0.1 and newer, so disable the constant expander optimization using the '-mllvm' option when using a toolchain that is not fixed. Cc: stable@vger.kernel.org Link: https://github.com/llvm/llvm-project/issues/99714 [1] Link: https://github.com/llvm/llvm-project/commit/68df06a0b2998765cb0a41353fcf0919bbf57ddb [2] Link: https://github.com/llvm/llvm-project/commit/2ab8d93061581edad3501561722ebd5632d73892 [3] Reviewed-by: Brian Cain <bcain@quicinc.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-17idpf: trigger SW interrupt when exiting wb_on_itr modeJoshua Hay
There is a race condition between exiting wb_on_itr and completion write backs. For example, we are in wb_on_itr mode and a Tx completion is generated by HW, ready to be written back, as we are re-enabling interrupts: HW SW | | | | idpf_tx_splitq_clean_all | | napi_complete_done | | | tx_completion_wb | idpf_vport_intr_update_itr_ena_irq That tx_completion_wb happens before the vector is fully re-enabled. Continuing with this example, it is a UDP stream and the tx_completion_wb is the last one in the flow (there are no rx packets). Because the HW generated the completion before the interrupt is fully enabled, the HW will not fire the interrupt once the timer expires and the write back will not happen. NAPI poll won't be called. We have indicated we're back in interrupt mode but nothing else will trigger the interrupt. Therefore, the completion goes unprocessed, triggering a Tx timeout. To mitigate this, fire a SW triggered interrupt upon exiting wb_on_itr. This interrupt will catch the rogue completion and avoid the timeout. Add logic to set the appropriate bits in the vector's dyn_ctl register. Fixes: 9c4a27da0ecc ("idpf: enable WB_ON_ITR") Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-17NFSD: fix management of pending async copiesOlga Kornievskaia
Currently the pending_async_copies count is decremented just before a struct nfsd4_copy is destroyed. After commit aa0ebd21df9c ("NFSD: Add nfsd4_copy time-to-live") nfsd4_copy structures sticks around for 10 lease periods after the COPY itself has completed, the pending_async_copies count stays high for a long time. This causes NFSD to avoid the use of background copy even though the actual background copy workload might no longer be running. In this patch, decrement pending_async_copies once async copy thread is done processing the copy work. Fixes: aa0ebd21df9c ("NFSD: Add nfsd4_copy time-to-live") Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-12-17idpf: add support for SW triggered interruptsJoshua Hay
SW triggered interrupts are guaranteed to fire after their timer expires, unlike Tx and Rx interrupts which will only fire after the timer expires _and_ a descriptor write back is available to be processed by the driver. Add the necessary fields, defines, and initializations to enable a SW triggered interrupt in the vector's dyn_ctl register. Reviewed-by: Madhu Chittim <madhu.chittim@intel.com> Signed-off-by: Joshua Hay <joshua.a.hay@intel.com> Tested-by: Krishneil Singh <krishneil.k.singh@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-12-17clk: thead: Fix TH1520 emmc and shdci clock rateMaksim Kiselev
In accordance with LicheePi 4A BSP the clock that comes to emmc/sdhci is 198Mhz which is got through frequency division of source clock VIDEO PLL by 4 [1]. But now the AP_SUBSYS driver sets the CLK EMMC SDIO to the same frequency as the VIDEO PLL, equal to 792 MHz. This causes emmc/sdhci to work 4 times slower. Let's fix this issue by adding fixed factor clock that divides VIDEO PLL by 4 for emmc/sdhci. Link: https://github.com/revyos/thead-kernel/blob/7563179071a314f41cdcdbfd8cf6e101e73707f3/drivers/clk/thead/clk-light-fm.c#L454 Fixes: ae81b69fd2b1 ("clk: thead: Add support for T-Head TH1520 AP_SUBSYS clocks") Signed-off-by: Maksim Kiselev <bigunclemax@gmail.com> Link: https://lore.kernel.org/r/20241210083029.92620-1-bigunclemax@gmail.com Tested-by: Xi Ruoyao <xry111@xry111.site> Reviewed-by: Drew Fustini <dfustini@tenstorrent.com> Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2024-12-17arm64: dts: broadcom: Fix L2 linesize for Raspberry Pi 5Willow Cunningham
Set the cache-line-size parameter of the L2 cache for each core to the correct value of 64 bytes. Previously, the L2 cache line size was incorrectly set to 128 bytes for the Broadcom BCM2712. This causes validation tests for the Performance Application Programming Interface (PAPI) tool to fail as they depend on sysfs accurately reporting cache line sizes. The correct value of 64 bytes is stated in the official documentation of the ARM Cortex A-72, which is linked in the comments of arm64/boot/dts/broadcom/bcm2712.dtsi as the source for cache-line-size. Fixes: faa3381267d0 ("arm64: dts: broadcom: Add minimal support for Raspberry Pi 5") Signed-off-by: Willow Cunningham <willow.e.cunningham@maine.edu> Link: https://lore.kernel.org/r/20241007212954.214724-1-willow.e.cunningham@maine.edu Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
2024-12-17btrfs: tree-checker: reject inline extent items with 0 ref countQu Wenruo
[BUG] There is a bug report in the mailing list where btrfs_run_delayed_refs() failed to drop the ref count for logical 25870311358464 num_bytes 2113536. The involved leaf dump looks like this: item 166 key (25870311358464 168 2113536) itemoff 10091 itemsize 50 extent refs 1 gen 84178 flags 1 ref#0: shared data backref parent 32399126528000 count 0 <<< ref#1: shared data backref parent 31808973717504 count 1 Notice the count number is 0. [CAUSE] There is no concrete evidence yet, but considering 0 -> 1 is also a single bit flipped, it's possible that hardware memory bitflip is involved, causing the on-disk extent tree to be corrupted. [FIX] To prevent us reading such corrupted extent item, or writing such damaged extent item back to disk, enhance the handling of BTRFS_EXTENT_DATA_REF_KEY and BTRFS_SHARED_DATA_REF_KEY keys for both inlined and key items, to detect such 0 ref count and reject them. CC: stable@vger.kernel.org # 5.4+ Link: https://lore.kernel.org/linux-btrfs/7c69dd49-c346-4806-86e7-e6f863a66f48@app.fastmail.com/ Reported-by: Frankie Fisher <frankie@terrorise.me.uk> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-12-17btrfs: split bios to the fs sector size boundaryChristoph Hellwig
Btrfs like other file systems can't really deal with I/O not aligned to it's internal block size (which strangely is called sector size in btrfs, for historical reasons), but the block layer split helper doesn't even know about that. Round down the split boundary so that all I/Os are aligned. Fixes: d5e4377d5051 ("btrfs: split zone append bios in btrfs_submit_bio") CC: stable@vger.kernel.org # 6.12 Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-12-17btrfs: use bio_is_zone_append() in the completion handlerChristoph Hellwig
Otherwise it won't catch bios turned into regular writes by the block level zone write plugging. The additional test it adds is for emulated zone append. Fixes: 9b1ce7f0c6f8 ("block: Implement zone append emulation") CC: stable@vger.kernel.org # 6.12 Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-12-17btrfs: fix improper generation check in snapshot deleteJosef Bacik
We have been using the following check if (generation <= root->root_key.offset) to make decisions about whether or not to visit a node during snapshot delete. This is because for normal subvolumes this is set to 0, and for snapshots it's set to the creation generation. The idea being that if the generation of the node is less than or equal to our creation generation then we don't need to visit that node, because it doesn't belong to us, we can simply drop our reference and move on. However reloc roots don't have their generation stored in root->root_key.offset, instead that is the objectid of their corresponding fs root. This means we can incorrectly not walk into nodes that need to be dropped when deleting a reloc root. There are a variety of consequences to making the wrong choice in two distinct areas. visit_node_for_delete() 1. False positive. We think we are newer than the block when we really aren't. We don't visit the node and drop our reference to the node and carry on. This would result in leaked space. 2. False negative. We do decide to walk down into a block that we should have just dropped our reference to. However this means that the child node will have refs > 1, so we will switch to UPDATE_BACKREF, and then the subsequent walk_down_proc() will notice that btrfs_header_owner(node) != root->root_key.objectid and it'll break out of the loop, and then walk_up_proc() will drop our reference, so this appears to be ok. do_walk_down() 1. False positive. We are in UPDATE_BACKREF and incorrectly decide that we are done and don't need to update the backref for our lower nodes. This is another case that simply won't happen with relocation, as we only have to do UPDATE_BACKREF if the node below us was shared and didn't have FULL_BACKREF set, and since we don't own that node because we're a reloc root we actually won't end up in this case. 2. False negative. Again this is tricky because as described above, we simply wouldn't be here from relocation, because we don't own any of the nodes because we never set btrfs_header_owner() to the reloc root objectid, and we always use FULL_BACKREF, we never actually need to set FULL_BACKREF on any children. Having spent a lot of time stressing relocation/snapshot delete recently I've not seen this pop in practice. But this is objectively incorrect, so fix this to get the correct starting generation based on the root we're dropping to keep me from thinking there's a problem here. CC: stable@vger.kernel.org Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: David Sterba <dsterba@suse.com>
2024-12-17drm: rework FB_CORE dependencyArnd Bergmann
The 'select FB_CORE' statement moved from CONFIG_DRM to DRM_CLIENT_LIB, but there are now configurations that have code calling into fb_core as built-in even though the client_lib itself is a loadable module: x86_64-linux-ld: drivers/gpu/drm/drm_fb_helper.o: in function `drm_fb_helper_set_suspend': drm_fb_helper.c:(.text+0x2c6): undefined reference to `fb_set_suspend' x86_64-linux-ld: drivers/gpu/drm/drm_fb_helper.o: in function `drm_fb_helper_resume_worker': drm_fb_helper.c:(.text+0x2e1): undefined reference to `fb_set_suspend' In addition to DRM_CLIENT_LIB, the 'select' needs to be at least in DRM_KMS_HELPER and DRM_GEM_SHMEM_HELPER, so add it here. This patch is the KMS_HELPER part of [1]. Fixes: dadd28d4142f ("drm/client: Add client-lib module") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/series/141411/ # 1 Link: https://patchwork.freedesktop.org/patch/msgid/20241216074450.8590-4-tzimmermann@suse.de Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2024-12-17Merge tag 'ftrace-v6.13-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ftrace fixes from Steven Rostedt: - Always try to initialize the idle functions when graph tracer starts A bug was found that when a CPU is offline when graph tracing starts and then comes online, that CPU is not traced. The fix to that was to move the initialization of the idle shadow stack over to the hot plug online logic, which also handle onlined CPUs. The issue was that it removed the initialization of the shadow stack when graph tracing starts, but the callbacks to the hot plug logic do nothing if graph tracing isn't currently running. Although that fix fixed the onlining of a CPU during tracing, it broke the CPUs that were already online. - Have microblaze not try to get the "true parent" in function tracing If function tracing and graph tracing are both enabled at the same time the parent of the functions traced by the function tracer may sometimes be the graph tracing trampoline. The graph tracing hijacks the return pointer of the function to trace it, but that can interfere with the function tracing parent output. This was fixed by using the ftrace_graph_ret_addr() function passing in the kernel stack pointer using the ftrace_regs_get_stack_pointer() function. But Al Viro reported that Microblaze does not implement the kernel_stack_pointer(regs) helper function that ftrace_regs_get_stack_pointer() uses and fails to compile when function graph tracing is enabled. It was first thought that this was a microblaze issue, but the real cause is that this only works when an architecture implements HAVE_DYNAMIC_FTRACE_WITH_ARGS, as a requirement for that config is to have ftrace always pass a valid ftrace_regs to the callbacks. That also means that the architecture supports ftrace_regs_get_stack_pointer() Microblaze does not set HAVE_DYNAMIC_FTRACE_WITH_ARGS nor does it implement ftrace_regs_get_stack_pointer() which caused it to fail to build. Only implement the "true parent" logic if an architecture has that config set" * tag 'ftrace-v6.13-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: ftrace: Do not find "true_parent" if HAVE_DYNAMIC_FTRACE_WITH_ARGS is not set fgraph: Still initialize idle shadow stacks when starting
2024-12-17Merge tag 's390-6.13-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Alexander Gordeev: - Fix DirectMap accounting in /proc/meminfo file - Fix strscpy() return code handling that led to "unsigned 'len' is never less than zero" warning - Fix the calculation determining whether to use three- or four-level paging: account KMSAN modules metadata * tag 's390-6.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/mm: Consider KMSAN modules metadata for paging levels s390/ipl: Fix never less than zero warning s390/mm: Fix DirectMap accounting
2024-12-17drm/fbdev: Select FB_CORE dependency for fbdev on DMA and TTMThomas Zimmermann
Select FB_CORE if GEM's DMA and TTM implementations support fbdev emulation. Fixes linker errors about missing symbols from the fbdev subsystem. Also see [1] for a related SHMEM fix. Fixes: dadd28d4142f ("drm/client: Add client-lib module") Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/series/141411/ # 1 Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241216074450.8590-3-tzimmermann@suse.de
2024-12-17fbdev: Fix recursive dependencies wrt BACKLIGHT_CLASS_DEVICEThomas Zimmermann
Do not select BACKLIGHT_CLASS_DEVICE from FB_BACKLIGHT. The latter only controls backlight support within fbdev core code and data structures. Make fbdev drivers depend on BACKLIGHT_CLASS_DEVICE and let users select it explicitly. Fixes warnings about recursive dependencies, such as error: recursive dependency detected! symbol BACKLIGHT_CLASS_DEVICE is selected by FB_BACKLIGHT symbol FB_BACKLIGHT is selected by FB_SH_MOBILE_LCDC symbol FB_SH_MOBILE_LCDC depends on FB_DEVICE symbol FB_DEVICE depends on FB_CORE symbol FB_CORE is selected by DRM_GEM_DMA_HELPER symbol DRM_GEM_DMA_HELPER is selected by DRM_PANEL_ILITEK_ILI9341 symbol DRM_PANEL_ILITEK_ILI9341 depends on BACKLIGHT_CLASS_DEVICE BACKLIGHT_CLASS_DEVICE is user-selectable, so making drivers adapt to it is the correct approach in any case. For most drivers, backlight support is also configurable separately. v3: - Select BACKLIGHT_CLASS_DEVICE in PowerMac defconfigs (Christophe) - Fix PMAC_BACKLIGHT module dependency corner cases (Christophe) v2: - s/BACKLIGHT_DEVICE_CLASS/BACKLIGHT_CLASS_DEVICE (Helge) - Fix fbdev driver-dependency corner case (Arnd) Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241216074450.8590-2-tzimmermann@suse.de
2024-12-17Merge tag 'erofs-for-6.13-rc4-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fixes from Gao Xiang: "The first one fixes a syzbot UAF report caused by a commit introduced in this cycle, but it also addresses a longstanding memory leak. The second one resolves a PSI memstall mis-accounting issue. The remaining patches switch file-backed mounts to use buffered I/Os by default instead of direct I/Os, since the page cache of underlay files is typically valid and maybe even dirty. This change also aligns with the default policy of loopback devices. A mount option has been added to try to use direct I/Os explicitly. Summary: - Fix (pcluster) memory leak and (sbi) UAF after umounting - Fix a case of PSI memstall mis-accounting - Use buffered I/Os by default for file-backed mounts" * tag 'erofs-for-6.13-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: use buffered I/O for file-backed mounts by default erofs: reference `struct erofs_device_info` for erofs_map_dev erofs: use `struct erofs_device_info` for the primary device erofs: add erofs_sb_free() helper MAINTAINERS: erofs: update Yue Hu's email address erofs: fix PSI memstall accounting erofs: fix rare pcluster memory leak after unmounting
2024-12-17locking/rtmutex: Make sure we wake anything on the wake_q when we release ↵John Stultz
the lock->wait_lock Bert reported seeing occasional boot hangs when running with PREEPT_RT and bisected it down to commit 894d1b3db41c ("locking/mutex: Remove wakeups from under mutex::wait_lock"). It looks like I missed a few spots where we drop the wait_lock and potentially call into schedule without waking up the tasks on the wake_q structure. Since the tasks being woken are ww_mutex tasks they need to be able to run to release the mutex and unblock the task that currently is planning to wake them. Thus we can deadlock. So make sure we wake the wake_q tasks when we unlock the wait_lock. Closes: https://lore.kernel.org/lkml/20241211182502.2915-1-spasswolf@web.de Fixes: 894d1b3db41c ("locking/mutex: Remove wakeups from under mutex::wait_lock") Reported-by: Bert Karwatzki <spasswolf@web.de> Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20241212222138.2400498-1-jstultz@google.com
2024-12-17perf/x86/intel/ds: Add PEBS format 6Kan Liang
The only difference between 5 and 6 is the new counters snapshotting group, without the following counters snapshotting enabling patches, it's impossible to utilize the feature in a PEBS record. It's safe to share the same code path with format 5. Add format 6, so the end user can at least utilize the legacy PEBS features. Fixes: a932aa0e868f ("perf/x86: Add Lunar Lake and Arrow Lake support") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20241216204505.748363-1-kan.liang@linux.intel.com
2024-12-17perf/x86/intel/uncore: Add Clearwater Forest supportKan Liang
From the perspective of the uncore PMU, the Clearwater Forest is the same as the previous Sierra Forest. The only difference is the event list, which will be supported in the perf tool later. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20241211161146.235253-1-kan.liang@linux.intel.com