linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2025-05-08	firmware: SDEI: Allow sdei initialization without ACPI_APEI_GHES	Huang Yiwei
	SDEI usually initialize with the ACPI table, but on platforms where ACPI is not used, the SDEI feature can still be used to handle specific firmware calls or other customized purposes. Therefore, it is not necessary for ARM_SDE_INTERFACE to depend on ACPI_APEI_GHES. In commit dc4e8c07e9e2 ("ACPI: APEI: explicit init of HEST and GHES in acpi_init()"), to make APEI ready earlier, sdei_init was moved into acpi_ghes_init instead of being a standalone initcall, adding ACPI_APEI_GHES dependency to ARM_SDE_INTERFACE. This restricts the flexibility and usability of SDEI. This patch corrects the dependency in Kconfig and splits sdei_init() into two separate functions: sdei_init() and acpi_sdei_init(). sdei_init() will be called by arch_initcall and will only initialize the platform driver, while acpi_sdei_init() will initialize the device from acpi_ghes_init() when ACPI is ready. This allows the initialization of SDEI without ACPI_APEI_GHES enabled. Fixes: dc4e8c07e9e2 ("ACPI: APEI: explicit init of HEST and GHES in apci_init()") Cc: Shuai Xue <xueshuai@linux.alibaba.com> Signed-off-by: Huang Yiwei <quic_hyiwei@quicinc.com> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20250507045757.2658795-1-quic_hyiwei@quicinc.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-08	riscv: Disallow PR_GET_TAGGED_ADDR_CTRL without Supm	Samuel Holland
	When the prctl() interface for pointer masking was added, it did not check that the pointer masking ISA extension was supported, only the individual submodes. Userspace could still attempt to disable pointer masking and query the pointer masking state. commit 81de1afb2dd1 ("riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL") disallowed the former, as the senvcfg write could crash on older systems. PR_GET_TAGGED_ADDR_CTRL state does not crash, because it reads only kernel-internal state and not senvcfg, but it should still be disallowed for consistency. Fixes: 09d6775f503b ("riscv: Add support for userspace pointer masking") Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Nam Cao <namcao@linutronix.de> Link: https://lore.kernel.org/r/20250507145230.2272871-1-samuel.holland@sifive.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08	scripts: Do not strip .rela.dyn section	Alexandre Ghiti
	The .rela.dyn section contains runtime relocations and is only emitted for a relocatable kernel. riscv uses this section to relocate the kernel at runtime but that section is stripped from vmlinux. That prevents kexec to successfully load vmlinux since it does not contain the relocations info needed. Fixes: 559d1e45a16d ("riscv: Use --emit-relocs in order to move .rela.dyn in init") Tested-by: Björn Töpel <bjorn@rivosinc.com> Reviewed-by: Björn Töpel <bjorn@rivosinc.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250408072851.90275-1-alexghiti@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08	riscv: Fix kernel crash due to PR_SET_TAGGED_ADDR_CTRL	Nam Cao
	When userspace does PR_SET_TAGGED_ADDR_CTRL, but Supm extension is not available, the kernel crashes: Oops - illegal instruction [#1] [snip] epc : set_tagged_addr_ctrl+0x112/0x15a ra : set_tagged_addr_ctrl+0x74/0x15a epc : ffffffff80011ace ra : ffffffff80011a30 sp : ffffffc60039be10 [snip] status: 0000000200000120 badaddr: 0000000010a79073 cause: 0000000000000002 set_tagged_addr_ctrl+0x112/0x15a __riscv_sys_prctl+0x352/0x73c do_trap_ecall_u+0x17c/0x20c andle_exception+0x150/0x15c Fix it by checking if Supm is available. Fixes: 09d6775f503b ("riscv: Add support for userspace pointer masking") Signed-off-by: Nam Cao <namcao@linutronix.de> Cc: stable@vger.kernel.org Reviewed-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20250504101920.3393053-1-namcao@linutronix.de Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08	riscv: misaligned: use get_user() instead of __get_user()	Clément Léger
	Now that we can safely handle user memory accesses while in the misaligned access handlers, use get_user() instead of __get_user() to have user memory access checks. Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250422162324.956065-4-cleger@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08	riscv: misaligned: enable IRQs while handling misaligned accesses	Clément Léger
	We can safely reenable IRQs if coming from userspace. This allows to access user memory that could potentially trigger a page fault. Fixes: b686ecdeacf6 ("riscv: misaligned: Restrict user access to kernel memory") Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250422162324.956065-3-cleger@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08	riscv: misaligned: factorize trap handling	Clément Léger
	Since both load/store and user/kernel should use almost the same path and that we are going to add some code around that, factorize it. Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250422162324.956065-2-cleger@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
2025-05-08	pmdomain: Merge branch fixes into next	Ulf Hansson
	Merge the pmdomain fixes for v6.15-rc[n] into the next branch, to allow them to get tested together with the new changes that are targeted for v6.16. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2025-05-08	pmdomain: core: Fix error checking in genpd_dev_pm_attach_by_id()	Dan Carpenter
	The error checking for of_count_phandle_with_args() does not handle negative error codes correctly. The problem is that "index" is a u32 so in the condition "if (index >= num_domains)" negative error codes stored in "num_domains" are type promoted to very high positive values and "index" is always going to be valid. Test for negative error codes first and then test if "index" is valid. Fixes: 3ccf3f0cd197 ("PM / Domains: Enable genpd_dev_pm_attach_by_id\|name() for single PM domain") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/aBxPQ8AI8N5v-7rL@stanley.mountain Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2025-05-08	dma-buf/sw-sync: Remove unused debug code	Dr. David Alan Gilbert
	sync_file_debug_add() and sync_file_debug_remove() have been unused since 2016's commit d4cab38e153d ("staging/android: prepare sync_file for de-staging") Remove them. Since sync_file_debug_add was the only thing to add to sync_file_list_head, the code that dumps it in part of sync_info_debugfs_show can be removed, and the declaration of the list and it's associated lock can be removed. (The 'fences:\n...' marker in that debugfs file is left in so as not to change the output) That leaves the sync_print_sync_file() helper unused, and is thus removed. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250505233838.105668-1-linux@treblig.org
2025-05-08	MAINTAINERS: Remove entry for Seth Heasley	Andi Shyti
	Seth's mails bounce back, remove his maintainership. Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Link: https://lore.kernel.org/r/20250505231511.3175151-1-andi.shyti@kernel.org
2025-05-08	genirq/cpuhotplug: Fix up lock guards conversion brainf..t	Thomas Gleixner
	The lock guard conversion converted raw_spin_lock_irq() to scoped_guard(raw_spinlock), which is obviously bogus and makes lockdep mightily unhappy. Note to self: Copy and pasta without using brain is a patently bad idea. Fixes: 88a4df117ad6 ("genirq/cpuhotplug: Convert to lock guards") Reported-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Borislav Petkov <bp@alien8.de>
2025-05-08	Merge branch 'virtio-net-fix-total-qstat-values'	Paolo Abeni
	Jakub Kicinski says: ==================== virtio-net: fix total qstat values Another small fix discovered after we enabled virtio multi-queue in netdev CI. The queue stat test fails: # Exception\| Exception: Qstats are lower, fetched later not ok 3 stats.pkt_byte_sum The queue stats from disabled queues are supposed to be reported in the "base" stats. ==================== Link: https://patch.msgid.link/20250507003221.823267-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	virtio-net: fix total qstat values	Jakub Kicinski
	NIPA tests report that the interface statistics reported via qstat are lower than those reported via ip link. Looks like this is because some tests flip the queue count up and down, and we end up with some of the traffic accounted on disabled queues. Add up counters from disabled queues. Fixes: d888f04c09bb ("virtio-net: support queue stat") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250507003221.823267-3-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	net: export a helper for adding up queue stats	Jakub Kicinski
	Older drivers and drivers with lower queue counts often have a static array of queues, rather than allocating structs for each queue on demand. Add a helper for adding up qstats from a queue range. Expectation is that driver will pass a queue range [netdev->real_num_x_queues, MAX). It was tempting to always use num_x_queues as the end, but virtio seems to clamp its queue count after allocating the netdev. And this way we can trivaly reuse the helper for [0, real_..). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://patch.msgid.link/20250507003221.823267-2-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	ALSA: usb: mixer_us16x08: use snd_kcontrol_chip()	Kuninori Morimoto
	We can use snd_kcontrol_chip(). Let's use it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/87frhhaucf.wl-kuninori.morimoto.gx@renesas.com
2025-05-08	ALSA: usb: mixer_quirks: use snd_kcontrol_chip()	Kuninori Morimoto
	We can use snd_kcontrol_chip(). Let's use it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/87h61xaucj.wl-kuninori.morimoto.gx@renesas.com
2025-05-08	ALSA: pci: ali5451: use snd_kcontrol_chip()	Kuninori Morimoto
	We can use snd_kcontrol_chip(). Let's use it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/87ikmdauco.wl-kuninori.morimoto.gx@renesas.com
2025-05-08	ALSA: pci: asihpi: use snd_kcontrol_chip()	Kuninori Morimoto
	We can use snd_kcontrol_chip(). Let's use it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/87jz6taucs.wl-kuninori.morimoto.gx@renesas.com
2025-05-08	ALSA: pci: au88x0: use snd_kcontrol_chip()	Kuninori Morimoto
	We can use snd_kcontrol_chip(). Let's use it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/87ldr9aucw.wl-kuninori.morimoto.gx@renesas.com
2025-05-08	ALSA: usb: mixer: use snd_kcontrol_chip()	Kuninori Morimoto
	We can use snd_kcontrol_chip(). Let's use it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/87msbpaud1.wl-kuninori.morimoto.gx@renesas.com
2025-05-08	ALSA: usb: midi: use snd_kcontrol_chip()	Kuninori Morimoto
	We can use snd_kcontrol_chip(). Let's use it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/87o6w5aud5.wl-kuninori.morimoto.gx@renesas.com
2025-05-08	ALSA: pci: hda: use snd_kcontrol_chip()	Kuninori Morimoto
	We can use snd_kcontrol_chip(). Let's use it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/87plglauda.wl-kuninori.morimoto.gx@renesas.com
2025-05-08	ALSA: virtio: use snd_kcontrol_chip()	Kuninori Morimoto
	We can use snd_kcontrol_chip(). Let's use it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/87r011audg.wl-kuninori.morimoto.gx@renesas.com
2025-05-08	ALSA: core: use snd_kcontrol_chip()	Kuninori Morimoto
	We can use snd_kcontrol_chip(). Let's use it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/87selhaudp.wl-kuninori.morimoto.gx@renesas.com
2025-05-08	ALSA: i2c: use snd_kcontrol_chip()	Kuninori Morimoto
	We can use snd_kcontrol_chip(). Let's use it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/87tt5xaudu.wl-kuninori.morimoto.gx@renesas.com
2025-05-08	ALSA: sh: use snd_kcontrol_chip()	Kuninori Morimoto
	We can use snd_kcontrol_chip(). Let's use it. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://patch.msgid.link/87v7qdaue4.wl-kuninori.morimoto.gx@renesas.com
2025-05-08	pwm: Restore alphabetic ordering in Kconfig and Makefile	Uwe Kleine-König
	The drivers are nearly ordered alphabetically by the symbol name. Fix the few outliers. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Link: https://lore.kernel.org/r/20250508081706.751209-2-u.kleine-koenig@baylibre.com Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org>
2025-05-08	ALSA: gus: Remove deadcode	Dr. David Alan Gilbert
	snd_gus_use_dec(), snd_gus_use_inc() and snd_gf1_print_voice_registers() last uses were removed in 2007 by commit e5723b41abe5 ("[ALSA] Remove sequencer instrument layer") Remove them. While there, remove big #if 0 blocks next to the code being deleted. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://patch.msgid.link/20250508000225.195766-1-linux@treblig.org Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-05-08	ALSA: hda/tas2781: Create an independent lib to save the shared parts for ↵	Shenghao Ding
	both SPI and I2C driver Some common parts, such as struct tas2781_hda{...} and some audio kcontrols are moved into an independent lib for code cleanup. Signed-off-by: Shenghao Ding <shenghao-ding@ti.com> Link: https://patch.msgid.link/20250507045813.151-1-shenghao-ding@ti.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-05-08	ALSA: hda: Remove unused snd_hdac_stream_get_spbmaxfifo	Dr. David Alan Gilbert
	snd_hdac_stream_get_spbmaxfifo() was originally added in 2015 in commit ee8bc4df1b5a ("ALSA: hdac: Add support to enable SPIB for hdac ext stream") when it was originally called snd_hdac_ext_stream_set_spbmaxfifo, it was renamed snd_hdac_ext_stream_get_spbmaxfifo shortly after and was finally renamed to snd_hdac_stream_get_spbmaxfifo in 2022. But it was never used. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://patch.msgid.link/20250505011037.340592-1-linux@treblig.org Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-05-08	ALSA: hda: Remove unused snd_hda_add_nid	Dr. David Alan Gilbert
	snd_hda_add_nid() last use was removed in 2014 by commit db8e8a9dc972 ("ALSA: hda - Remove the obsoleted static quirk codes from patch_cmedia.c") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://patch.msgid.link/20250505010922.340534-1-linux@treblig.org Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-05-08	Merge branch 'fbnic-fw-ipc-mailbox-fixes'	Paolo Abeni
	Alexander Duyck says: ==================== fbnic: FW IPC Mailbox fixes This series is meant to address a number of issues that have been found in the FW IPC mailbox over the past several months. The main issues addressed are: 1. Resolve a potential race between host and FW during initialization that can cause the FW to only have the lower 32b of an address. 2. Block the FW from issuing DMA requests after we have closed the mailbox and before we have started issuing requests on it. 3. Fix races in the IRQ handlers that can cause the IRQ to unmask itself if it is being processed while we are trying to disable it. 4. Cleanup the Tx flush logic so that we actually lock down the Tx path before we start flushing it instead of letting it free run while we are shutting it down. 5. Fix several memory leaks that could occur if we failed initialization. 6. Cleanup the mailbox completion if we are flushing Tx since we are no longer processing Rx. 7. Move several allocations out of a potential IRQ/atomic context. There have been a few optimizations we also picked up since then. Rather than split them out I just folded them into these diffs. They mostly address minor issues such as how long it takes to initialize and/or fail so I thought they could probably go in with the rest of the patches. They consist of: 1. Do not sleep more than 20ms waiting on FW to respond as the 200ms value likely originated from simulation/emulation testing. 2. Use jiffies to determine timeout instead of sleep * attempts for better accuracy. Reviewed-by: Jakub Kicinski <kuba@kernel.org> ==================== Link: https://patch.msgid.link/174654659243.499179.11194817277075480209.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	fbnic: Do not allow mailbox to toggle to ready outside fbnic_mbx_poll_tx_ready	Alexander Duyck
	We had originally thought to have the mailbox go to ready in the background while we were doing other things. One issue with this though is that we can't disable it by clearing the ready state without also blocking interrupts or calls to mbx_poll as it will just pop back to life during an interrupt. In order to prevent that from happening we can pull the code for toggling to ready out of the interrupt path and instead place it in the fbnic_mbx_poll_tx_ready path so that it becomes the only spot where the Rx/Tx can toggle to the ready state. By doing this we can prevent races where we disable the DMA and/or free buffers only to have an interrupt fire and undo what we have done. Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654722518.499179.11612865740376848478.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context	Alexander Duyck
	This change pulls the call to fbnic_fw_xmit_cap_msg out of fbnic_mbx_init_desc_ring and instead places it in the polling function for getting the Tx ready. Doing that we can avoid the potential issue with an interrupt coming in later from the firmware that causes it to get fired in interrupt context. Fixes: 20d2e88cc746 ("eth: fbnic: Add initial messaging to notify FW of our presence") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654721876.499179.9839651602256668493.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	fbnic: Improve responsiveness of fbnic_mbx_poll_tx_ready	Alexander Duyck
	There were a couple different issues found in fbnic_mbx_poll_tx_ready. Among them were the fact that we were sleeping much longer than we actually needed to as the actual FW could respond in under 20ms. The other issue was that we would just keep polling the mailbox even if the device itself had gone away. To address the responsiveness issues we can decrease the sleeps to 20ms and use a jiffies based timeout value rather than just counting the number of times we slept and then polled. To address the hardware going away we can move the check for the firmware BAR being present from where it was and place it inside the loop after the mailbox descriptor ring is initialized and before we sleep so that we just abort and return an error if the device went away during initialization. With these two changes we see a significant improvement in boot times for the driver. Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654721224.499179.2698616208976624755.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	fbnic: Cleanup handling of completions	Alexander Duyck
	There was an issue in that if we were to shutdown we could be left with a completion in flight as the mailbox went away. To address that I have added an fbnic_mbx_evict_all_cmpl function that is meant to essentially create a "broken pipe" type response so that all callers will receive an error indicating that the connection has been broken as a result of us shutting down the mailbox. Fixes: 378e5cc1c6c6 ("eth: fbnic: hwmon: Add completion infrastructure for firmware requests") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654720578.499179.380252598204530873.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	fbnic: Actually flush_tx instead of stalling out	Alexander Duyck
	The fbnic_mbx_flush_tx function had a number of issues. First, we were waiting 200ms for the firmware to process the packets. We can drop this to 20ms and in almost all cases this should be more than enough time. So by changing this we can significantly reduce shutdown time. Second, we were not making sure that the Tx path was actually shut off. As such we could still have packets added while we were flushing the mailbox. To prevent that we can now clear the ready flag for the Tx side and it should stay down since the interrupt is disabled. Third, we kept re-reading the tail due to the second issue. The tail should not move after we have started the flush so we can just read it once while we are holding the mailbox Tx lock. By doing that we are guaranteed that the value should be consistent. Fourth, we were keeping a count of descriptors cleaned due to the second and third issues called out. That count is not a valid reason to be exiting the cleanup, and with the tail only being read once we shouldn't see any cases where the tail moves after the disable so the tracking of count can be dropped. Fifth, we were using attempts * sleep time to determine how long we would wait in our polling loop to flush out the Tx. This can be very imprecise. In order to tighten up the timing we are shifting over to using a jiffies value of jiffies + 10 * HZ + 1 to determine the jiffies value we should stop polling at as this should be accurate within once sleep cycle for the total amount of time spent polling. Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654719929.499179.16406653096197423749.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	fbnic: Add additional handling of IRQs	Alexander Duyck
	We have two issues that need to be addressed in our IRQ handling. One is the fact that we can end up double-freeing IRQs in the event of an exception handling error such as a PCIe reset/recovery that fails. To prevent that from becoming an issue we can use the msix_vector values to indicate that we have successfully requested/freed the IRQ by only setting or clearing them when we have completed the given action. The other issue is that we have several potential races in our IRQ path due to us manipulating the mask before the vector has been truly disabled. In order to handle that in the case of the FW mailbox we need to not auto-enable the IRQ and instead will be enabling/disabling it separately. In the case of the PCS vector we can mitigate this by unmapping it and synchronizing the IRQ before we clear the mask. The general order of operations after this change is now to request the interrupt, poll the FW mailbox to ready, and then enable the interrupt. For the shutdown we do the reverse where we disable the interrupt, flush any pending Tx, and then free the IRQ. I am renaming the enable/disable to request/free to be equivilent with the IRQ calls being used. We may see additions in the future to enable/disable the IRQs versus request/free them for certain use cases. Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Fixes: 69684376eed5 ("eth: fbnic: Add link detection") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654719271.499179.3634535105127848325.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	fbnic: Gate AXI read/write enabling on FW mailbox	Alexander Duyck
	In order to prevent the device from throwing spurious writes and/or reads at us we need to gate the AXI fabric interface to the PCIe until such time as we know the FW is in a known good state. To accomplish this we use the mailbox as a mechanism for us to recognize that the FW has acknowledged our presence and is no longer sending any stale message data to us. We start in fbnic_mbx_init by calling fbnic_mbx_reset_desc_ring function, disabling the DMA in both directions, and then invalidating all the descriptors in each ring. We then poll the mailbox in fbnic_mbx_poll_tx_ready and when the interrupt is set by the FW we pick it up and mark the mailboxes as ready, while also enabling the DMA. Once we have completed all the transactions and need to shut down we call into fbnic_mbx_clean which will in turn call fbnic_mbx_reset_desc_ring for each ring and shut down the DMA and once again invalidate the descriptors. Fixes: 3646153161f1 ("eth: fbnic: Add register init to set PCIe/Ethernet device config") Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654718623.499179.7445197308109347982.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	fbnic: Fix initialization of mailbox descriptor rings	Alexander Duyck
	Address to issues with the FW mailbox descriptor initialization. We need to reverse the order of accesses when we invalidate an entry versus writing an entry. When writing an entry we write upper and then lower as the lower 32b contain the valid bit that makes the entire address valid. However for invalidation we should write it in the reverse order so that the upper is marked invalid before we update it. Without this change we may see FW attempt to access pages with the upper 32b of the address set to 0 which will likely result in DMAR faults due to write access failures on mailbox shutdown. Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/174654717972.499179.8083789731819297034.stgit@ahduyck-xeon-server.home.arpa Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-08	accel/habanalabs: Don't build the driver on UML	Ingo Molnar
	The following commit: 288a4ff0ad29 ("x86/msr: Move rdtsc{,_ordered}() to <asm/tsc.h>") removed the <asm/msr.h> include from the accel/habanalabs driver, which broke the build on UML: drivers/accel/habanalabs/common/habanalabs_ioctl.c:326:23: error: call to undeclared function 'rdtsc'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration] Make the driver depend on 'X86 && X86_64', instead of just 'X86_64', thus it won't be built on UML. Suggested-by: Johannes Berg <johannes.berg@intel.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Johannes Berg <johannes@sipsolutions.net> Cc: Ofir Bitton <obitton@habana.ai> Cc: Oded Gabbay <ogabbay@kernel.org> Link: https://lore.kernel.org/r/202505080003.0t7ewxGp-lkp@intel.com
2025-05-08	memory: renesas-rpc-if: Add missing static keyword	Biju Das
	Fix the below sparse warnings: symbol 'rpcif_impl' was not declared. Should it be static? symbol 'xspi_impl' was not declared. Should it be static? Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202505072013.1EqwjtaR-lkp@intel.com/ Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/r/20250507162146.140494-1-biju.das.jz@bp.renesas.com Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
2025-05-07	mm: fix folio_pte_batch() on XEN PV	Petr Vaněk
	On XEN PV, folio_pte_batch() can incorrectly batch beyond the end of a folio due to a corner case in pte_advance_pfn(). Specifically, when the PFN following the folio maps to an invalidated MFN, expected_pte = pte_advance_pfn(expected_pte, nr); produces a pte_none(). If the actual next PTE in memory is also pte_none(), the pte_same() succeeds, if (!pte_same(pte, expected_pte)) break; the loop is not broken, and batching continues into unrelated memory. For example, with a 4-page folio, the PTE layout might look like this: [ 53.465673] [ T2552] folio_pte_batch: printing PTE values at addr=0x7f1ac9dc5000 [ 53.465674] [ T2552] PTE[453] = 000000010085c125 [ 53.465679] [ T2552] PTE[454] = 000000010085d125 [ 53.465682] [ T2552] PTE[455] = 000000010085e125 [ 53.465684] [ T2552] PTE[456] = 000000010085f125 [ 53.465686] [ T2552] PTE[457] = 0000000000000000 <-- not present [ 53.465689] [ T2552] PTE[458] = 0000000101da7125 pte_advance_pfn(PTE[456]) returns a pte_none() due to invalid PFN->MFN mapping. The next actual PTE (PTE[457]) is also pte_none(), so the loop continues and includes PTE[457] in the batch, resulting in 5 batched entries for a 4-page folio. This triggers the following warning: [ 53.465751] [ T2552] page: refcount:85 mapcount:20 mapping:ffff88813ff4f6a8 index:0x110 pfn:0x10085c [ 53.465754] [ T2552] head: order:2 mapcount:80 entire_mapcount:0 nr_pages_mapped:4 pincount:0 [ 53.465756] [ T2552] memcg:ffff888003573000 [ 53.465758] [ T2552] aops:0xffffffff8226fd20 ino:82467c dentry name(?):"libc.so.6" [ 53.465761] [ T2552] flags: 0x2000000000416c(referenced\|uptodate\|lru\|active\|private\|head\|node=0\|zone=2) [ 53.465764] [ T2552] raw: 002000000000416c ffffea0004021f08 ffffea0004021908 ffff88813ff4f6a8 [ 53.465767] [ T2552] raw: 0000000000000110 ffff888133d8bd40 0000005500000013 ffff888003573000 [ 53.465768] [ T2552] head: 002000000000416c ffffea0004021f08 ffffea0004021908 ffff88813ff4f6a8 [ 53.465770] [ T2552] head: 0000000000000110 ffff888133d8bd40 0000005500000013 ffff888003573000 [ 53.465772] [ T2552] head: 0020000000000202 ffffea0004021701 000000040000004f 00000000ffffffff [ 53.465774] [ T2552] head: 0000000300000003 8000000300000002 0000000000000013 0000000000000004 [ 53.465775] [ T2552] page dumped because: VM_WARN_ON_FOLIO((_Generic((page + nr_pages - 1), const struct page : (const struct folio )_compound_head(page + nr_pages - 1), struct page : (struct folio )_compound_head(page + nr_pages - 1))) != folio) Original code works as expected everywhere, except on XEN PV, where pte_advance_pfn() can yield a pte_none() after balloon inflation due to MFNs invalidation. In XEN, pte_advance_pfn() ends up calling __pte()->xen_make_pte()->pte_pfn_to_mfn(), which returns pte_none() when mfn == INVALID_P2M_ENTRY. The pte_pfn_to_mfn() documents that nastiness: If there's no mfn for the pfn, then just create an empty non-present pte. Unfortunately this loses information about the original pfn, so pte_mfn_to_pfn is asymmetric. While such hacks should certainly be removed, we can do better in folio_pte_batch() and simply check ahead of time how many PTEs we can possibly batch in our folio. This way, we can not only fix the issue but cleanup the code: removing the pte_pfn() check inside the loop body and avoiding end_ptr comparison + arithmetic. Link: https://lkml.kernel.org/r/20250502215019.822-2-arkamar@atlas.cz Fixes: f8d937761d65 ("mm/memory: optimize fork() with PTE-mapped THP") Co-developed-by: David Hildenbrand <david@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Petr Vaněk <arkamar@atlas.cz> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07	nilfs2: fix deadlock warnings caused by lock dependency in init_nilfs()	Ryusuke Konishi
	After commit c0e473a0d226 ("block: fix race between set_blocksize and read paths") was merged, set_blocksize() called by sb_set_blocksize() now locks the inode of the backing device file. As a result of this change, syzbot started reporting deadlock warnings due to a circular dependency involving the semaphore "ns_sem" of the nilfs object, the inode lock of the backing device file, and the locks that this inode lock is transitively dependent on. This is caused by a new lock dependency added by the above change, since init_nilfs() calls sb_set_blocksize() in the lock section of "ns_sem". However, these warnings are false positives because init_nilfs() is called in the early stage of the mount operation and the filesystem has not yet started. The reason why "ns_sem" is locked in init_nilfs() was to avoid a race condition in nilfs_fill_super() caused by sharing a nilfs object among multiple filesystem instances (super block structures) in the early implementation. However, nilfs objects and super block structures have long ago become one-to-one, and there is no longer any need to use the semaphore there. So, fix this issue by removing the use of the semaphore "ns_sem" in init_nilfs(). Link: https://lkml.kernel.org/r/20250503053327.12294-1-konishi.ryusuke@gmail.com Fixes: c0e473a0d226 ("block: fix race between set_blocksize and read paths") Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Reported-by: syzbot+00f7f5b884b117ee6773@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=00f7f5b884b117ee6773 Tested-by: syzbot+00f7f5b884b117ee6773@syzkaller.appspotmail.com Reported-by: syzbot+f30591e72bfc24d4715b@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f30591e72bfc24d4715b Tested-by: syzbot+f30591e72bfc24d4715b@syzkaller.appspotmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07	mm/hugetlb: copy the CMA flag when demoting	Frank van der Linden
	Since commit d2d786714080 ("mm/hugetlb: enable bootmem allocation from CMA areas"), a flag is used to mark hugetlb folios as allocated from CMA. This flag is also used to decide if it should be freed to CMA. However, the flag isn't copied to the smaller folios when a hugetlb folio is broken up for demotion, which would cause it to be freed incorrectly. Fix this by copying the flag to the smaller order hugetlb pages created from the original one. Link: https://lkml.kernel.org/r/20250501044325.20365-1-fvdl@google.com Fixes: d2d786714080 ("mm/hugetlb: enable bootmem allocation from CMA areas") Signed-off-by: Frank van der Linden <fvdl@google.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Reviewed-by: Jane Chu <Jane.Chu@oracle.com> Cc: Muchun Song <muchun.song@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07	mm, swap: fix false warning for large allocation with !THP_SWAP	Kairui Song
	The !CONFIG_THP_SWAP check existed before just fine because slot cache would reject high order allocation and let the caller split all folios and try again. But slot cache is gone, so large allocation will directly go to the allocator, and the allocator should just fail silently to inform caller to do the folio split, this is totally fine and expected. Remove this meaningless warning. Link: https://lkml.kernel.org/r/20250429094803.85518-1-ryncsn@gmail.com Fixes: 0ff67f990bd4 ("mm, swap: remove swap slot cache") Signed-off-by: Kairui Song <kasong@tencent.com> Reported-by: Heiko Carstens <hca@linux.ibm.com> Closes: https://lore.kernel.org/linux-mm/20250428135252.25453B17-hca@linux.ibm.com/ Tested-by: Heiko Carstens <hca@linux.ibm.com> Cc: Baoquan He <bhe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07	selftests/mm: fix a build failure on powerpc	Nysal Jan K.A.
	The compiler is unaware of the size of code generated by the ".rept" assembler directive. This results in the compiler emitting branch instructions where the offset to branch to exceeds the maximum allowed value, resulting in build failures like the following: CC protection_keys /tmp/ccypKWAE.s: Assembler messages: /tmp/ccypKWAE.s:2073: Error: operand out of range (0x0000000000020158 is not between 0xffffffffffff8000 and 0x0000000000007ffc) /tmp/ccypKWAE.s:2509: Error: operand out of range (0x0000000000020130 is not between 0xffffffffffff8000 and 0x0000000000007ffc) Fix the issue by manually adding nop instructions using the preprocessor. Link: https://lkml.kernel.org/r/20250428131937.641989-2-nysal@linux.ibm.com Fixes: 46036188ea1f ("selftests/mm: build with -O2") Reported-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Nysal Jan K.A. <nysal@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Reviewed-by: Donet Tom <donettom@linux.ibm.com> Tested-by: Donet Tom <donettom@linux.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07	selftests/mm: fix build break when compiling pkey_util.c	Madhavan Srinivasan
	Commit 50910acd6f615 ("selftests/mm: use sys_pkey helpers consistently") added a pkey_util.c to refactor some of the protection_keys functions accessible by other tests. But this broken the build in powerpc in two ways, pkey-powerpc.h: In function `arch_is_powervm': pkey-powerpc.h:73:21: error: storage size of `buf' isn't known 73 \| struct stat buf; \| ^~~ pkey-powerpc.h:75:14: error: implicit declaration of function `stat'; did you mean `strcat'? [-Wimplicit-function-declaration] 75 \| if ((stat("/sys/firmware/devicetree/base/ibm,partition-name", &buf) == 0) && \| ^~~~ \| strcat Since pkey_util.c includes pkeys-helper.h, which in turn includes pkeys-powerpc.h, stat.h including is missing for "struct stat". This is fixed by adding "sys/stat.h" in pkeys-powerpc.h Secondly, pkey-powerpc.h:55:18: warning: format `%llx' expects argument of type `long long unsigned int', but argument 3 has type `u64' {aka `long unsigned int'} [-Wformat=] 55 \| dprintf4("%s() changing %016llx to %016llx\n", \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 \| __func__, __read_pkey_reg(), pkey_reg); \| ~~~~~~~~~~~~~~~~~ \| \| \| u64 {aka long unsigned int} pkey-helpers.h:63:32: note: in definition of macro `dprintf_level' 63 \| sigsafe_printf(args); \ \| ^~~~ These format specifier related warning are removed by adding "__SANE_USERSPACE_TYPES__" to pkeys_utils.c. Link: https://lkml.kernel.org/r/20250428131937.641989-1-nysal@linux.ibm.com Fixes: 50910acd6f61 ("selftests/mm: use sys_pkey helpers consistently") Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Nysal Jan K.A. <nysal@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-07	mm: vmalloc: support more granular vrealloc() sizing	Kees Cook
	Introduce struct vm_struct::requested_size so that the requested (re)allocation size is retained separately from the allocated area size. This means that KASAN will correctly poison the correct spans of requested bytes. This also means we can support growing the usable portion of an allocation that can already be supported by the existing area's existing allocation. Link: https://lkml.kernel.org/r/20250426001105.it.679-kees@kernel.org Fixes: 3ddc2fefe6f3 ("mm: vmalloc: implement vrealloc()") Signed-off-by: Kees Cook <kees@kernel.org> Reported-by: Erhard Furtner <erhard_f@mailbox.org> Closes: https://lore.kernel.org/all/20250408192503.6149a816@outsider.home/ Reviewed-by: Danilo Krummrich <dakr@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>