linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-06-25	drm/amd/swsmu: add MALL init support workaround for smu_v14_0_1	Li Ma
	[Why] SMU firmware has not supported MALL PG. [How] Disable MALL PG and make it always on until SMU firmware is ready. Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-25	RISC-V: fix vector insn load/store width mask	Jesse Taube
	RVFDQ_FL_FS_WIDTH_MASK should be 3 bits [14-12], shifted down by 12 bits. Replace GENMASK(3, 0) with GENMASK(2, 0). Fixes: cd054837243b ("riscv: Allocate user's vector context in the first-use trap") Signed-off-by: Jesse Taube <jesse@rivosinc.com> Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> Link: https://lore.kernel.org/r/20240606182800.415831-1-jesse@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-25	selftests: net: remove unneeded IP_GRE config	Yujie Liu
	It seems that there is no definition for config IP_GRE, and it is not a dependency of other configs, so remove it. linux$ find -name Kconfig \| xargs grep "IP_GRE" <-- nothing There is a IPV6_GRE config defined in net/ipv6/Kconfig. It only depends on NET_IPGRE_DEMUX but not IP_GRE. Signed-off-by: Yujie Liu <yujie.liu@intel.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20240624055539.2092322-1-yujie.liu@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-25	l2tp: remove incorrect __rcu attribute	James Chapman
	This fixes a sparse warning. Fixes: d18d3f0a24fc ("l2tp: replace hlist with simple list for per-tunnel session list") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202406220754.evK8Hrjw-lkp@intel.com/ Signed-off-by: James Chapman <jchapman@katalix.com> Link: https://patch.msgid.link/20240624082945.1925009-1-jchapman@katalix.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-26	kbuild: Use $(obj)/%.cc to fix host C++ module builds	Nicolas Schier
	Use $(obj)/ instead of $(src)/ prefix when building C++ modules for host, as explained in commit b1992c3772e6 ("kbuild: use $(src) instead of $(srctree)/$(src) for source directory"). This fixes build failures of 'xconfig': $ make O=build/ xconfig make[1]: Entering directory '/data/linux/kbuild-review/build' GEN Makefile make[3]: *** No rule to make target '../scripts/kconfig/qconf-moc.cc', needed by 'scripts/kconfig/qconf-moc.o'. Stop. Fixes: b1992c3772e6 ("kbuild: use $(src) instead of $(srctree)/$(src) for source directory") Reported-by: Rolf Eike Beer <eb@emlix.com> Signed-off-by: Nicolas Schier <n.schier@avm.de> Tested-by: Rolf Eike Beer <eb@emlix.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-06-26	kbuild: rpm-pkg: fix build error with CONFIG_MODULES=n	Masahiro Yamada
	When CONFIG_MODULES is disabled, 'make (bin)rpm-pkg' fails: $ make allnoconfig binrpm-pkg [ snip ] error: File not found: .../linux/rpmbuild/BUILDROOT/kernel-6.10.0_rc3-1.i386/lib/modules/6.10.0-rc3/kernel error: File not found: .../linux/rpmbuild/BUILDROOT/kernel-6.10.0_rc3-1.i386/lib/modules/6.10.0-rc3/modules.order To make it work irrespective of CONFIG_MODULES, this commit specifies the directory path, /lib/modules/%{KERNELRELEASE}, instead of individual files. However, doing so would cause new warnings: warning: File listed twice: /lib/modules/6.10.0-rc3-dirty/modules.alias warning: File listed twice: /lib/modules/6.10.0-rc3-dirty/modules.alias.bin warning: File listed twice: /lib/modules/6.10.0-rc3-dirty/modules.builtin.alias.bin warning: File listed twice: /lib/modules/6.10.0-rc3-dirty/modules.builtin.bin warning: File listed twice: /lib/modules/6.10.0-rc3-dirty/modules.dep warning: File listed twice: /lib/modules/6.10.0-rc3-dirty/modules.dep.bin warning: File listed twice: /lib/modules/6.10.0-rc3-dirty/modules.devname warning: File listed twice: /lib/modules/6.10.0-rc3-dirty/modules.softdep warning: File listed twice: /lib/modules/6.10.0-rc3-dirty/modules.symbols warning: File listed twice: /lib/modules/6.10.0-rc3-dirty/modules.symbols.bin These files exist in /lib/modules/%{KERNELRELEASE} and are also explicitly marked as %ghost. Suppress depmod because depmod-generated files are not packaged. Fixes: 615b3a3d2d41 ("kbuild: rpm-pkg: do not include depmod-generated files") Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Nathan Chancellor <nathan@kernel.org>
2024-06-26	kbuild: Fix build target deb-pkg: ln: failed to create hard link	Thayne Harbaugh
	The make deb-pkg target calls debian-orig which attempts to either hard link the source .tar to the build-output location or copy the source .tar to the build-output location. The test to determine whether to ln or cp is incorrectly expanded by Make and consequently always attempts to ln the source .tar. This fix corrects the escaping of '$' so that the test is expanded by the shell rather than by Make and appropriately selects between ln and cp. Fixes: b44aa8c96e9e ("kbuild: deb-pkg: make .orig tarball a hard link if possible") Signed-off-by: Thayne Harbaugh <thayne@mastodonlabs.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-06-26	kbuild: doc: Update default INSTALL_MOD_DIR from extra to updates	Mark-PK Tsai
	The default INSTALL_MOD_DIR was changed from 'extra' to 'updates' in commit b74d7bb7ca24 ("kbuild: Modify default INSTALL_MOD_DIR from extra to updates"). This commit updates the documentation to align with the latest kernel. Fixes: b74d7bb7ca24 ("kbuild: Modify default INSTALL_MOD_DIR from extra to updates") Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-06-26	kbuild: Install dtb files as 0644 in Makefile.dtbinst	Dragan Simic
	The compiled dtb files aren't executable, so install them with 0644 as their permission mode, instead of defaulting to 0755 for the permission mode and installing them with the executable bits set. Some Linux distributions, including Debian, [1][2][3] already include fixes in their kernel package build recipes to change the dtb file permissions to 0644 in their kernel packages. These changes, when additionally propagated into the long-term kernel versions, will allow such distributions to remove their downstream fixes. [1] https://salsa.debian.org/kernel-team/linux/-/merge_requests/642 [2] https://salsa.debian.org/kernel-team/linux/-/merge_requests/749 [3] https://salsa.debian.org/kernel-team/linux/-/blob/debian/6.8.12-1/debian/rules.real#L193 Cc: Diederik de Haas <didi.debian@cknow.org> Cc: <stable@vger.kernel.org> Fixes: aefd80307a05 ("kbuild: refactor Makefile.dtbinst more") Signed-off-by: Dragan Simic <dsimic@manjaro.org> Reviewed-by: Nicolas Schier <nicolas@fjasle.eu> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-06-25	hrtimer: Prevent queuing of hrtimer without a function callback	Phil Chang
	The hrtimer function callback must not be NULL. It has to be specified by the call side but it is not validated by the hrtimer code. When a hrtimer is queued without a function callback, the kernel crashes with a null pointer dereference when trying to execute the callback in __run_hrtimer(). Introduce a validation before queuing the hrtimer in hrtimer_start_range_ns(). [anna-maria: Rephrase commit message] Signed-off-by: Phil Chang <phil.chang@mediatek.com> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
2024-06-25	Revert "nfsd: fix oops when reading pool_stats before server is started"	NeilBrown
	This reverts commit 8e948c365d9c10b685d1deb946bd833d6a9b43e0. The reverted commit moves a test on a field protected by a mutex outside of the protection of that mutex, and so is obviously racey. Depending on how the race goes, si->serv might be NULL when dereferenced in svc_pool_stats_start(), or svc_pool_stats_stop() might unlock a mutex that hadn't been locked. This bug that the commit tried to fix has been addressed by initialising ->mutex earlier. Fixes: 8e948c365d9c ("nfsd: fix oops when reading pool_stats before server is started") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-06-25	nfsd: initialise nfsd_info.mutex early.	NeilBrown
	nfsd_info.mutex can be dereferenced by svc_pool_stats_start() immediately after the new netns is created. Currently this can trigger an oops. Move the initialisation earlier before it can possibly be dereferenced. Fixes: 7b207ccd9833 ("svc: don't hold reference for poolstats, only mutex.") Reported-by: Sourabh Jain <sourabhjain@linux.ibm.com> Closes: https://lore.kernel.org/all/c2e9f6de-1ec4-4d3a-b18d-d5a6ec0814a0@linux.ibm.com/ Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2024-06-25	linux/syscalls.h: add missing __user annotations	Arnd Bergmann
	A couple of declarations in linux/syscalls.h are missing __user annotations on their pointers, which can lead to warnings from sparse because these don't match the implementation that have the correct address space annotations. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25	syscalls: mmap(): use unsigned offset type consistently	Arnd Bergmann
	Most architectures that implement the old-style mmap() with byte offset use 'unsigned long' as the type for that offset, but microblaze and riscv have the off_t type that is shared with userspace, matching the prototype in include/asm-generic/syscalls.h. Make this consistent by using an unsigned argument everywhere. This changes the behavior slightly, as the argument is shifted to a page number, and an user input with the top bit set would result in a negative page offset rather than a large one as we use elsewhere. For riscv, the 32-bit sys_mmap2() definition actually used a custom type that is different from the global declaration, but this was missed due to an incorrect type check. Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25	s390: remove native mmap2() syscall	Arnd Bergmann
	The mmap2() syscall has never been used on 64-bit s390x and should have been removed as part of 5a79859ae0f3 ("s390: remove 31 bit support"). Remove it now. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25	hexagon: fix fadvise64_64 calling conventions	Arnd Bergmann
	fadvise64_64() has two 64-bit arguments at the wrong alignment for hexagon, which turns them into a 7-argument syscall that is not supported by Linux. The downstream musl port for hexagon actually asks for a 6-argument version the same way we do it on arm, csky, powerpc, so make the kernel do it the same way to avoid having to change both. Link: https://github.com/quic/musl/blob/hexagon/arch/hexagon/syscall_arch.h#L78 Cc: stable@vger.kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25	csky, hexagon: fix broken sys_sync_file_range	Arnd Bergmann
	Both of these architectures require u64 function arguments to be passed in even/odd pairs of registers or stack slots, which in case of sync_file_range would result in a seven-argument system call that is not currently possible. The system call is therefore incompatible with all existing binaries. While it would be possible to implement support for seven arguments like on mips, it seems better to use a six-argument version, either with the normal argument order but misaligned as on most architectures or with the reordered sync_file_range2() calling conventions as on arm and powerpc. Cc: stable@vger.kernel.org Acked-by: Guo Ren <guoren@kernel.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25	sh: rework sync_file_range ABI	Arnd Bergmann
	The unusual function calling conventions on SuperH ended up causing sync_file_range to have the wrong argument order, with the 'flags' argument getting sorted before 'nbytes' by the compiler. In userspace, I found that musl, glibc, uclibc and strace all expect the normal calling conventions with 'nbytes' last, so changing the kernel to match them should make all of those work. In order to be able to also fix libc implementations to work with existing kernels, they need to be able to tell which ABI is used. An easy way to do this is to add yet another system call using the sync_file_range2 ABI that works the same on all architectures. Old user binaries can now work on new kernels, and new binaries can try the new sync_file_range2() to work with new kernels or fall back to the old sync_file_range() version if that doesn't exist. Cc: stable@vger.kernel.org Fixes: 75c92acdd5b1 ("sh: Wire up new syscalls.") Acked-by: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25	powerpc: restore some missing spu syscalls	Arnd Bergmann
	A couple of system calls were inadventently removed from the table during a bugfix for 32-bit powerpc entry. Restore the original behavior. Fixes: e23750623835 ("powerpc/32: fix syscall wrappers with 64-bit arguments of unaligned register-pairs") Acked-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25	parisc: use generic sys_fanotify_mark implementation	Arnd Bergmann
	The sys_fanotify_mark() syscall on parisc uses the reverse word order for the two halves of the 64-bit argument compared to all syscalls on all 32-bit architectures. As far as I can tell, the problem is that the function arguments on parisc are sorted backwards (26, 25, 24, 23, ...) compared to everyone else, so the calling conventions of using an even/odd register pair in native word order result in the lower word coming first in function arguments, matching the expected behavior on little-endian architectures. The system call conventions however ended up matching what the other 32-bit architectures do. A glibc cleanup in 2020 changed the userspace behavior in a way that handles all architectures consistently, but this inadvertently broke parisc32 by changing to the same method as everyone else. The change made it into glibc-2.35 and subsequently into debian 12 (bookworm), which is the latest stable release. This means we need to choose between reverting the glibc change or changing the kernel to match it again, but either hange will leave some systems broken. Pick the option that is more likely to help current and future users and change the kernel to match current glibc. This also means the behavior is now consistent across architectures, but it breaks running new kernels with old glibc builds before 2.35. Link: https://sourceware.org/git/?p=glibc.git;a=commitdiff;h=d150181d73d9 Link: https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/arch/parisc/kernel/sys_parisc.c?h=57b1dfbd5b4a39d Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org> Tested-by: Helge Deller <deller@gmx.de> Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de> --- I found this through code inspection, please double-check to make sure I got the bug and the fix right. The alternative is to fix this by reverting glibc back to the unusual behavior.
2024-06-25	parisc: use correct compat recv/recvfrom syscalls	Arnd Bergmann
	Johannes missed parisc back when he introduced the compat version of these syscalls, so receiving cmsg messages that require a compat conversion is still broken. Use the correct calls like the other architectures do. Fixes: 1dacc76d0014 ("net/compat/wext: send different messages to compat tasks") Acked-by: Helge Deller <deller@gmx.de> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25	sparc: fix compat recv/recvfrom syscalls	Arnd Bergmann
	sparc has the wrong compat version of recv() and recvfrom() for both the direct syscalls and socketcall(). The direct syscalls just need to use the compat version. For socketcall, the same thing could be done, but it seems better to completely remove the custom assembler code for it and just use the same implementation that everyone else has. Fixes: 1dacc76d0014 ("net/compat/wext: send different messages to compat tasks") Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25	sparc: fix old compat_sys_select()	Arnd Bergmann
	sparc has two identical select syscalls at numbers 93 and 230, respectively. During the conversion to the modern syscall.tbl format, the older one of the two broke in compat mode, and now refers to the native 64-bit syscall. Restore the correct behavior. This has very little effect, as glibc has been using the newer number anyway. Fixes: 6ff645dd683a ("sparc: add system call table generation support") Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25	syscalls: fix compat_sys_io_pgetevents_time64 usage	Arnd Bergmann
	Using sys_io_pgetevents() as the entry point for compat mode tasks works almost correctly, but misses the sign extension for the min_nr and nr arguments. This was addressed on parisc by switching to compat_sys_io_pgetevents_time64() in commit 6431e92fc827 ("parisc: io_pgetevents_time64() needs compat syscall in 32-bit compat mode"), as well as by using more sophisticated system call wrappers on x86 and s390. However, arm64, mips, powerpc, sparc and riscv still have the same bug. Change all of them over to use compat_sys_io_pgetevents_time64() like parisc already does. This was clearly the intention when the function was originally added, but it got hooked up incorrectly in the tables. Cc: stable@vger.kernel.org Fixes: 48166e6ea47d ("y2038: add 64-bit time_t syscalls to all 32-bit architectures") Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25	net: ethernet: mtk_eth_soc: ppe: prevent ppe update for non-mtk devices	Elad Yifee
	Introduce an additional validation to ensure that the PPE index is modified exclusively for mtk_eth ingress devices. This primarily addresses the issue related to WED operation with multiple PPEs. Fixes: dee4dd10c79a ("net: ethernet: mtk_eth_soc: ppe: add support for multiple PPEs") Signed-off-by: Elad Yifee <eladwf@gmail.com> Link: https://lore.kernel.org/r/20240623175113.24437-1-eladwf@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	s390/boot: Do not adjust GOT entries for undef weak sym	Jens Remus
	Since commit 778666df60f0 ("s390: compile relocatable kernel without -fPIE") and commit 00cda11d3b2e ("s390: Compile kernel with -fPIC and link with -no-pie") the kernel on s390x may have a Global Offset Table (GOT) whose entries are adjusted for KASLR in kaslr_adjust_got(). The GOT may contain entries for undefined weak symbols that resolved to zero. That is the resulting GOT entry value is zero. Adjusting those entries unconditionally in kaslr_adjust_got() is wrong. Otherwise the following sample code would erroneously assume foo to be defined, due to the adjustment changing the zero-value to a non-zero one: extern int foo __attribute__((weak)); if (foo) / foo is defined [or undefined and erroneously adjusted] / The vmlinux build at commit 00cda11d3b2e ("s390: Compile kernel with -fPIC and link with -no-pie") with defconfig actually had two GOT entries for the undefined weak symbols __start_BTF and __stop_BTF: $ objdump -tw vmlinux \| grep -F "UND" 0000000000000000 w UND* 0000000000000000 __stop_BTF 0000000000000000 w UND 0000000000000000 __start_BTF $ readelf -rw vmlinux \| grep -E "R_390_GOTENT +0{16}" 000000345760 2776a0000001a R_390_GOTENT 0000000000000000 __stop_BTF + 2 000000345766 2d5480000001a R_390_GOTENT 0000000000000000 __start_BTF + 2 The s390-specific vmlinux linker script sets the section start to __START_KERNEL, which is currently defined as 0x100000 on s390x. Access to lowcore is performed via a pointer of 0 and not a symbol in a section starting at 0. The first 64K are reserved for the loader on s390x. Thus it is safe to assume that __START_KERNEL will never be 0. As a result there cannot be any defined symbols resolving to zero in the kernel. Note that the first three GOT entries are reserved for the dynamic loader on s390x. [1] In the kernel they are zero. Therefore no extra handling is required to skip these. Skip adjusting GOT entries with a value of zero in kaslr_adjust_got(). While at it update the comment when a GOT exists on s390x. Since commit 00cda11d3b2e ("s390: Compile kernel with -fPIC and link with -no-pie") it no longer only exists when compiling with Clang, but also with GCC. [1]: s390x ELF ABI, section "Global Offset Table", https://github.com/IBM/s390x-abi/releases Fixes: 778666df60f0 ("s390: compile relocatable kernel without -fPIE") Reviewed-by: Ilya Leoshkevich <iii@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Jens Remus <jremus@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2024-06-25	thermal: gov_step_wise: Go straight to instance->lower when mitigation is over	Rafael J. Wysocki
	Commit b6846826982b ("thermal: gov_step_wise: Restore passive polling management") attempted to fix a Step-Wise thermal governor issue introduced by commit 042a3d80f118 ("thermal: core: Move passive polling management to the core"), which caused the governor to leave cooling devices in high states, by partially reverting that commit. However, this turns out to be insufficient on some systems due to interactions between the governor code restored by commit b6846826982b and the passive polling management in the thermal core. For this reason, revert commit b6846826982b and make the governor set the target cooling device state to the "lower" one as soon as the zone temperature falls below the threshold of the trip point corresponding to the given thermal instance, which means that thermal mitigation is not necessary any more. Before this change the "lower" cooling device state would be reached in steps through the passive polling mechanism which was questionable for three reasons: (1) cooling device were kept in high states when that was not necessary (and it could adversely impact performance), (2) it only worked for thermal zones with nonzero passive_delay_jiffies value, and (3) passive polling belongs to the core and should not be hijacked by governors for their internal purposes. Fixes: b6846826982b ("thermal: gov_step_wise: Restore passive polling management") Closes: https://lore.kernel.org/linux-pm/6759ce9f-281d-4fcd-bb4c-b784a1cc5f6e@oldschoolsolutions.biz Reported-by: Jens Glathe <jens.glathe@oldschoolsolutions.biz> Tested-by: Jens Glathe <jens.glathe@oldschoolsolutions.biz> Link: https://patch.msgid.link/12464461.O9o76ZdvQC@rjwysocki.net Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Steev Klimaszewski <steev@kali.org> Tested-by: Johan Hovold <johan+linaro@kernel.org>
2024-06-25	net: dsa: microchip: fix wrong register write when masking interrupt	Tristram Ha
	The switch global port interrupt mask, REG_SW_PORT_INT_MASK__4, is defined as 0x001C in ksz9477_reg.h. The designers used 32-bit value in anticipation for increase of port count in future product but currently the maximum port count is 7 and the effective value is 0x7F in register 0x001F. Each port has its own interrupt mask and is defined as 0x#01F. It uses only 4 bits for different interrupts. The developer who implemented the current interrupt mechanism in the switch driver noticed there are similarities between the mechanism to mask port interrupts in global interrupt and individual interrupts in each port and so used the same code to handle these interrupts. He updated the code to use the new macro REG_SW_PORT_INT_MASK__1 which is defined as 0x1F in ksz_common.h but he forgot to update the 32-bit write to 8-bit as now the mask registers are 0x1F and 0x#01F. In addition all KSZ switches other than the KSZ9897/KSZ9893 and LAN937X families use only 8-bit access and so this common code will eventually be changed to accommodate them. Fixes: e1add7dd6183 ("net: dsa: microchip: use common irq routines for girq and pirq") Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Link: https://lore.kernel.org/r/1719009262-2948-1-git-send-email-Tristram.Ha@microchip.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	ALSA: dmaengine_pcm: terminate dmaengine before synchronize	Shengjiu Wang
	When dmaengine supports pause function, in suspend state, dmaengine_pause() is called instead of dmaengine_terminate_async(), In end of playback stream, the runtime->state will go to SNDRV_PCM_STATE_DRAINING, if system suspend & resume happen at this time, application will not resume playback stream, the stream will be closed directly, the dmaengine_terminate_async() will not be called before the dmaengine_synchronize(), which violates the call sequence for dmaengine_synchronize(). This behavior also happens for capture streams, but there is no SNDRV_PCM_STATE_DRAINING state for capture. So use dmaengine_tx_status() to check the DMA status if the status is DMA_PAUSED, then call dmaengine_terminate_async() to terminate dmaengine before dmaengine_synchronize(). Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Link: https://patch.msgid.link/1718851218-27803-1-git-send-email-shengjiu.wang@nxp.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-06-25	ALSA: hda/relatek: Enable Mute LED on HP Laptop 15-gw0xxx	Aivaz Latypov
	This HP Laptop uses ALC236 codec with COEF 0x07 controlling the mute LED. Enable existing quirk for this device. Signed-off-by: Aivaz Latypov <reichaivaz@gmail.com> Link: https://patch.msgid.link/20240625081217.1049-1-reichaivaz@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-06-25	ALSA: PCM: Allow resume only for suspended streams	Takashi Iwai
	snd_pcm_resume() should bail out if the stream isn't in a suspended state. Otherwise it'd allow doubly resume. Link: https://patch.msgid.link/20240624125443.27808-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-06-25	Merge branch 'net-macb-wol-enhancements'	Paolo Abeni
	Vineeth Karumanchi says: ==================== net: macb: WOL enhancements - Add provisioning for queue tie-off and queue disable during suspend. - Add support for ARP packet types to WoL. - Advertise WoL attributes by default. - Extend MACB supported WoL modes to the PHY supported WoL modes. - Deprecate magic-packet property. v6: https://lore.kernel.org/netdev/20240617070413.2291511-1-vineeth.karumanchi@amd.com/ v5: https://lore.kernel.org/netdev/20240611162827.887162-1-vineeth.karumanchi@amd.com/ v4: https://lore.kernel.org/lkml/20240610053936.622237-1-vineeth.karumanchi@amd.com/ v3: https://lore.kernel.org/netdev/20240605102457.4050539-1-vineeth.karumanchi@amd.com/ v2: https://lore.kernel.org/netdev/20240222153848.2374782-1-vineeth.karumanchi@amd.com/ v1: https://lore.kernel.org/lkml/20240130104845.3995341-1-vineeth.karumanchi@amd.com/#t ==================== Link: https://lore.kernel.org/r/20240621045735.3031357-1-vineeth.karumanchi@amd.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	dt-bindings: net: cdns,macb: Deprecate magic-packet property	Vineeth Karumanchi
	WOL modes such as magic-packet should be an OS policy. By default, advertise supported modes and use ethtool to activate the required mode. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Vineeth Karumanchi <vineeth.karumanchi@amd.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	net: macb: Add ARP support to WOL	Vineeth Karumanchi
	Extend wake-on LAN support with an ARP packet. Currently, if PHY supports WOL, ethtool ignores the modes supported by MACB. This change extends the WOL modes with MACB supported modes. Advertise wake-on LAN supported modes by default without relying on dt node. By default, wake-on LAN will be in disabled state. Using ethtool, users can enable/disable or choose packet types. For wake-on LAN via ARP, ensure the IP address is assigned and report an error otherwise. Co-developed-by: Harini Katakam <harini.katakam@amd.com> Signed-off-by: Harini Katakam <harini.katakam@amd.com> Signed-off-by: Vineeth Karumanchi <vineeth.karumanchi@amd.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Claudiu Beznea <claudiu.beznea@tuxon.dev> Tested-by: Claudiu Beznea <claudiu.beznea@tuxon.dev> # on SAMA7G5 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	net: macb: Enable queue disable	Vineeth Karumanchi
	Enable queue disable for Versal devices. Signed-off-by: Vineeth Karumanchi <vineeth.karumanchi@amd.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Claudiu Beznea <claudiu.beznea@tuxon.dev> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	net: macb: queue tie-off or disable during WOL suspend	Vineeth Karumanchi
	When GEM is used as a wake device, it is not mandatory for the RX DMA to be active. The RX engine in IP only needs to receive and identify a wake packet through an interrupt. The wake packet is of no further significance; hence, it is not required to be copied into memory. By disabling RX DMA during suspend, we can avoid unnecessary DMA processing of any incoming traffic. During suspend, perform either of the below operations: - tie-off/dummy descriptor: Disable unused queues by connecting them to a looped descriptor chain without free slots. - queue disable: The newer IP version allows disabling individual queues. Co-developed-by: Harini Katakam <harini.katakam@amd.com> Signed-off-by: Harini Katakam <harini.katakam@amd.com> Signed-off-by: Vineeth Karumanchi <vineeth.karumanchi@amd.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Claudiu Beznea <claudiu.beznea@tuxon.dev> Tested-by: Claudiu Beznea <claudiu.beznea@tuxon.dev> # on SAMA7G5 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	ALSA: seq: Fix missing channel at encoding RPN/NRPN MIDI2 messages	Takashi Iwai
	The conversion from the legacy event to MIDI2 UMP for RPN and NRPN missed the setup of the channel number, resulting in always the channel 0. Fix it. Fixes: e9e02819a98a ("ALSA: seq: Automatic conversion of UMP events") Link: https://patch.msgid.link/20240625095200.25745-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-06-25	Fix race for duplicate reqsk on identical SYN	luoxuanqiang
	When bonding is configured in BOND_MODE_BROADCAST mode, if two identical SYN packets are received at the same time and processed on different CPUs, it can potentially create the same sk (sock) but two different reqsk (request_sock) in tcp_conn_request(). These two different reqsk will respond with two SYNACK packets, and since the generation of the seq (ISN) incorporates a timestamp, the final two SYNACK packets will have different seq values. The consequence is that when the Client receives and replies with an ACK to the earlier SYNACK packet, we will reset(RST) it. ======================================================================== This behavior is consistently reproducible in my local setup, which comprises: \| NETA1 ------ NETB1 \| PC_A --- bond --- \| \| --- bond --- PC_B \| NETA2 ------ NETB2 \| - PC_A is the Server and has two network cards, NETA1 and NETA2. I have bonded these two cards using BOND_MODE_BROADCAST mode and configured them to be handled by different CPU. - PC_B is the Client, also equipped with two network cards, NETB1 and NETB2, which are also bonded and configured in BOND_MODE_BROADCAST mode. If the client attempts a TCP connection to the server, it might encounter a failure. Capturing packets from the server side reveals: 10.10.10.10.45182 > localhost: Flags [S], seq 320236027, 10.10.10.10.45182 > localhost: Flags [S], seq 320236027, localhost > 10.10.10.10.45182: Flags [S.], seq 2967855116, localhost > 10.10.10.10.45182: Flags [S.], seq 2967855123, <== 10.10.10.10.45182 > localhost: Flags [.], ack 4294967290, 10.10.10.10.45182 > localhost: Flags [.], ack 4294967290, localhost > 10.10.10.10.45182: Flags [R], seq 2967855117, <== localhost > 10.10.10.10.45182: Flags [R], seq 2967855117, Two SYNACKs with different seq numbers are sent by localhost, resulting in an anomaly. ======================================================================== The attempted solution is as follows: Add a return value to inet_csk_reqsk_queue_hash_add() to confirm if the ehash insertion is successful (Up to now, the reason for unsuccessful insertion is that a reqsk for the same connection has already been inserted). If the insertion fails, release the reqsk. Due to the refcnt, Kuniyuki suggests also adding a return value check for the DCCP module; if ehash insertion fails, indicating a successful insertion of the same connection, simply release the reqsk as well. Simultaneously, In the reqsk_queue_hash_req(), the start of the req->rsk_timer is adjusted to be after successful insertion. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: luoxuanqiang <luoxuanqiang@kylinos.cn> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20240621013929.1386815-1-luoxuanqiang@kylinos.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	Merge branch 'af_unix-remove-spin_lock_nested-and-convert-to-lock_cmp_fn'	Paolo Abeni
	Kuniyuki Iwashima says: ==================== af_unix: Remove spin_lock_nested() and convert to lock_cmp_fn. This series removes spin_lock_nested() in AF_UNIX and instead defines the locking orders as functions tied to each lock by lockdep_set_lock_cmp_fn(). When the defined function returns a negative value, lockdep considers it will not cause deadlock. (See ->cmp_fn() in check_deadlock() and check_prev_add().) When we cannot define the total ordering, we return -1 for the allowed ordering and otherwise 0 as undefined. [0] [0]: https://lore.kernel.org/netdev/thzkgbuwuo3knevpipu4rzsh5qgmwhklihypdgziiruabvh46f@uwdkpcfxgloo/ Changes: v4: * Patch 4 * Make unix_state_lock_cmp_fn() symmetric. v3: https://lore.kernel.org/netdev/20240614200715.93150-1-kuniyu@amazon.com/ * Patch 3 * Cache sk->sk_state * s/unix_state_lock()/unix_state_unlock()/ * Patch 8 * Add embryo -> listener locking order v2: https://lore.kernel.org/netdev/20240611222905.34695-1-kuniyu@amazon.com/ * Patch 1 & 2 * Use (((l) > (r)) - ((l) < (r))) for comparison v1: https://lore.kernel.org/netdev/20240610223501.73191-1-kuniyu@amazon.com/ ==================== Link: https://lore.kernel.org/r/20240620205623.60139-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	af_unix: Don't use spin_lock_nested() in copy_peercred().	Kuniyuki Iwashima
	When (AF_UNIX, SOCK_STREAM) socket connect()s to a listening socket, the listener's sk_peer_pid/sk_peer_cred are copied to the client in copy_peercred(). Then, two sk_peer_locks are held there; one is client's and another is listener's. However, the latter is not needed because we hold the listner's unix_state_lock() there and unix_listen() cannot update the cred concurrently. Let's drop the unnecessary spin_lock() and use the bare spin_lock() for the client to protect concurrent read by getsockopt(SO_PEERCRED). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	af_unix: Remove put_pid()/put_cred() in copy_peercred().	Kuniyuki Iwashima
	When (AF_UNIX, SOCK_STREAM) socket connect()s to a listening socket, the listener's sk_peer_pid/sk_peer_cred are copied to the client in copy_peercred(). Then, the client's sk_peer_pid and sk_peer_cred are always NULL, so we need not call put_pid() and put_cred() there. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	af_unix: Set sk_peer_pid/sk_peer_cred locklessly for new socket.	Kuniyuki Iwashima
	init_peercred() is called in 3 places: 1. socketpair() : both sockets 2. connect() : child socket 3. listen() : listening socket The first two need not hold sk_peer_lock because no one can touch the socket. Let's set cred/pid without holding lock for the two cases and rename the old init_peercred() to update_peercred() to properly reflect the use case. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	af_unix: Define locking order for U_RECVQ_LOCK_EMBRYO in unix_collect_skb().	Kuniyuki Iwashima
	While GC is cleaning up cyclic references by SCM_RIGHTS, unix_collect_skb() collects skb in the socket's recvq. If the socket is TCP_LISTEN, we need to collect skb in the embryo's queue. Then, both the listener's recvq lock and the embroy's one are held. The locking is always done in the listener -> embryo order. Let's define it as unix_recvq_lock_cmp_fn() instead of using spin_lock_nested(). Note that the reverse order is defined for consistency. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	af_unix: Remove U_LOCK_GC_LISTENER.	Kuniyuki Iwashima
	Commit 1971d13ffa84 ("af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc().") added U_LOCK_GC_LISTENER for the old GC, but it's no longer needed for the new GC. Let's remove U_LOCK_GC_LISTENER and unix_state_lock_nested() as there's no user. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	af_unix: Remove U_LOCK_DIAG.	Kuniyuki Iwashima
	sk_diag_dump_icons() acquires embryo's lock by unix_state_lock_nested() to fetch its peer. The embryo's ->peer is set to NULL only when its parent listener is close()d. Then, unix_release_sock() is called for each embryo after unlinking skb by skb_dequeue(). In sk_diag_dump_icons(), we hold the parent's recvq lock, so we need not acquire unix_state_lock_nested(), and peer is always non-NULL. Let's remove unnecessary unix_state_lock_nested() and non-NULL test for peer. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	af_unix: Don't acquire unix_state_lock() for sock_i_ino().	Kuniyuki Iwashima
	sk_diag_dump_peer() and sk_diag_dump() call unix_state_lock() for sock_i_ino() which reads SOCK_INODE(sk->sk_socket)->i_ino, but it's protected by sk->sk_callback_lock. Let's remove unnecessary unix_state_lock(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	af_unix: Define locking order for U_LOCK_SECOND in unix_stream_connect().	Kuniyuki Iwashima
	While a SOCK_(STREAM\|SEQPACKET) socket connect()s to another, we hold two locks of them by unix_state_lock() and unix_state_lock_nested() in unix_stream_connect(). Before unix_state_lock_nested(), the following is guaranteed by checking sk->sk_state: 1. The first socket is TCP_LISTEN 2. The second socket is not the first one 3. Simultaneous connect() must fail So, the client state can be TCP_CLOSE or TCP_LISTEN or TCP_ESTABLISHED. Let's define the expected states as unix_state_lock_cmp_fn() instead of using unix_state_lock_nested(). Note that 2. is detected by debug_spin_lock_before() and 3. cannot be expressed as lock_cmp_fn. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	af_unix: Don't retry after unix_state_lock_nested() in unix_stream_connect().	Kuniyuki Iwashima
	When a SOCK_(STREAM\|SEQPACKET) socket connect()s to another one, we need to lock the two sockets to check their states in unix_stream_connect(). We use unix_state_lock() for the server and unix_state_lock_nested() for client with tricky sk->sk_state check to avoid deadlock. The possible deadlock scenario are the following: 1) Self connect() 2) Simultaneous connect() The former is simple, attempt to grab the same lock, and the latter is AB-BA deadlock. After the server's unix_state_lock(), we check the server socket's state, and if it's not TCP_LISTEN, connect() fails with -EINVAL. Then, we avoid the former deadlock by checking the client's state before unix_state_lock_nested(). If its state is not TCP_LISTEN, we can make sure that the client and the server are not identical based on the state. Also, the latter deadlock can be avoided in the same way. Due to the server sk->sk_state requirement, AB-BA deadlock could happen only with TCP_LISTEN sockets. So, if the client's state is TCP_LISTEN, we can give up the second lock to avoid the deadlock. CPU 1 CPU 2 CPU 3 connect(A -> B) connect(B -> A) listen(A) --- --- --- unix_state_lock(B) B->sk_state == TCP_LISTEN READ_ONCE(A->sk_state) == TCP_CLOSE ^^^^^^^^^ ok, will lock A unix_state_lock(A) .--------------' WRITE_ONCE(A->sk_state, TCP_LISTEN) \| unix_state_unlock(A) \| \| unix_state_lock(A) \| A->sk_sk_state == TCP_LISTEN \| READ_ONCE(B->sk_state) == TCP_LISTEN v ^^^^^^^^^^ unix_state_lock_nested(A) Don't lock B !! Currently, while checking the client's state, we also check if it's TCP_ESTABLISHED, but this is unlikely and can be checked after we know the state is not TCP_CLOSE. Moreover, if it happens after the second lock, we now jump to the restart label, but it's unlikely that the server is not found during the retry, so the jump is mostly to revist the client state check. Let's remove the retry logic and check the state against TCP_CLOSE first. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	af_unix: Define locking order for U_LOCK_SECOND in unix_state_double_lock().	Kuniyuki Iwashima
	unix_dgram_connect() and unix_dgram_{send,recv}msg() lock the socket and peer in ascending order of the socket address. Let's define the order as unix_state_lock_cmp_fn() instead of using unix_state_lock_nested(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-06-25	af_unix: Define locking order for unix_table_double_lock().	Kuniyuki Iwashima
	When created, AF_UNIX socket is put into net->unx.table.buckets[], and the hash is stored in sk->sk_hash. * unbound socket : 0 <= sk_hash <= UNIX_HASH_MOD When bind() is called, the socket could be moved to another bucket. * pathname socket : 0 <= sk_hash <= UNIX_HASH_MOD * abstract socket : UNIX_HASH_MOD + 1 <= sk_hash <= UNIX_HASH_MOD * 2 + 1 Then, we call unix_table_double_lock() which locks a single bucket or two. Let's define the order as unix_table_lock_cmp_fn() instead of using spin_lock_nested(). The locking is always done in ascending order of sk->sk_hash, which is the index of buckets/locks array allocated by kvmalloc_array(). sk_hash_A < sk_hash_B <=> &locks[sk_hash_A].dep_map < &locks[sk_hash_B].dep_map So, the relation of two sk->sk_hash can be derived from the addresses of dep_map in the array of locks. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Paolo Abeni <pabeni@redhat.com>