linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2025-01-15	signal/posixtimers: Handle ignore/blocked sequences correctly	Thomas Gleixner
	syzbot triggered the warning in posixtimer_send_sigqueue(), which warns about a non-ignored signal being already queued on the ignored list. The warning is actually bogus, as the following sequence causes this: signal($SIG, SIGIGN); timer_settime(...); // arm periodic timer timer fires, signal is ignored and queued on ignored list sigprocmask(SIG_BLOCK, ...); // block the signal timer_settime(...); // re-arm periodic timer timer fires, signal is not ignored because it is blocked ---> Warning triggers as signal is on the ignored list Ideally timer_settime() could remove the signal, but that's racy and incomplete vs. other scenarios and requires a full reevaluation of the pending signal list. Instead of adding more complexity, handle it gracefully by removing the warning and requeueing the signal to the pending list. That's correct versus: 1) sig[timed]wait() as that does not check for SIGIGN and only relies on dequeue_signal() -> posixtimers_deliver_signal() to check whether the pending signal is still valid. 2) Unblocking of the signal. - If the unblocking happens before SIGIGN is replaced by a signal handler, then the timer is rearmed in dequeue_signal(), but get_signal() will ignore it. The next timer expiry will move it back to the ignored list. - If SIGIGN was replaced before unblocking, then the signal will be delivered and a subsequent expiry will queue a signal on the pending list again. There is a related scenario to trigger the complementary warning in the signal ignored path, which does not expect the signal to be on the pending list when it is ignored. That can be triggered even before the above change via: task1 task2 signal($SIG, SIGIGN); sigprocmask(SIG_BLOCK, ...); timer_create(); // Signal target is task2 timer_settime(...); // arm periodic timer timer fires, signal is not ignored because it is blocked and queued on the pending list of task2 syscall() // Sets the pending flag sigprocmask(SIG_UNBLOCK, ...); -> preemption, task2 cannot dequeue the signal timer_settime(...); // re-arm periodic timer timer fires, signal is ignored ---> Warning triggers as signal is on task2's pending list and the thread group is not exiting Consequently, remove that warning too and just keep the signal on the pending list. The following attempt to deliver the signal on return to user space of task2 will ignore the signal and a subsequent expiry will bring it back to the ignored list, if it did not get blocked or un-ignored before that. Fixes: df7a996b4dab ("signal: Queue ignored posixtimers on ignore list") Reported-by: syzbot+3c2e3cc60665d71de2f7@syzkaller.appspotmail.com Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/all/87ikqhcnjn.ffs@tglx
2025-01-15	block: Change blk_stack_atomic_writes_limits() unit_min check	John Garry
	The current check in blk_stack_atomic_writes_limits() for a bottom device supporting atomic writes is to verify that limit atomic_write_unit_min is non-zero. This would cause a problem for device mapper queue limits calculation. This is because it uses a temporary queue_limits structure to stack the limits, before finally commiting the limits update. The value of atomic_write_unit_min for the temporary queue_limits structure is never evaluated and so cannot be used, so use limit atomic_write_hw_unit_min. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20250109114000.2299896-3-john.g.garry@oracle.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-15	block: Ensure start sector is aligned for stacking atomic writes	John Garry
	For stacking atomic writes, ensure that the start sector is aligned with the device atomic write unit min and any boundary. Otherwise, we may permit misaligned atomic writes. Rework bdev_can_atomic_write() into a common helper to resuse the alignment check. There also use atomic_write_hw_unit_min, which is more proper (than atomic_write_unit_min). Fixes: d7f36dc446e89 ("block: Support atomic writes limits for stacked devices") Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Link: https://lore.kernel.org/r/20250109114000.2299896-2-john.g.garry@oracle.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-15	s390/futex: Fix FUTEX_OP_ANDN implementation	Heiko Carstens
	The futex operation FUTEX_OP_ANDN is supposed to implement (int )UADDR2 &= ~OPARG; The s390 implementation just implements an AND instead of ANDN. Add the missing bitwise not operation to oparg to fix this. This is broken since nearly 19 years, so it looks like user space is not making use of this operation. Fixes: 3363fbdd6fb4 ("[PATCH] s390: futex atomic operations") Cc: stable@vger.kernel.org Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-01-15	io_uring: reuse io_should_terminate_tw() for cmds	Pavel Begunkov
	io_uring_cmd_work() rolled a hard coded version of io_should_terminate_tw() to avoid conflicts, but now it's time to converge them. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/8a88dd6e4ed8e6c00c6552af0c20c9de02e458de.1736955455.git.asml.silence@gmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-15	io_uring: Factor out a function to parse restrictions	Josh Triplett
	Preparation for subsequent work on inherited restrictions. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Reviewed-by: Pavel Begunkov <asml.silence@gmail.com> Link: https://lore.kernel.org/r/9bac2b4d1b9b9ab41c55ea3816021be847f354df.1736932318.git.josh@joshtriplett.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-15	io_uring/register: cache old SQ/CQ head reading for copies	Jens Axboe
	The SQ and CQ ring heads are read twice - once for verifying that it's within bounds, and once inside the loops copying SQE and CQE entries. This is technically incorrect, in case the values could get modified in between verifying them and using them in the copy loop. While this won't lead to anything truly nefarious, it may cause longer loop times for the copies than expected. Read the ring head values once, and use the verified value in the copy loops. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-15	io_uring/register: document io_register_resize_rings() shared mem usage	Jens Axboe
	It can be a bit hard to tell which parts of io_register_resize_rings() are operating on shared memory, and which ones are not. And anything reading or writing to those regions should really use the read/write once primitives. Hence add those, ensuring sanity in how this memory is accessed, and helping document the shared nature of it. Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-15	Merge tag 'ti-driver-soc-for-v6.14' of ↵	Arnd Bergmann
	https://git.kernel.org/pub/scm/linux/kernel/git/ti/linux into arm/fixes TI SoC driver updates for v6.14 - Build fixup when CONFIG_TI_PRUSS is disabled.
2025-01-15	io_uring/register: use stable SQ/CQ ring data during resize	Jens Axboe
	Normally the kernel would not expect an application to modify any of the data shared with the kernel during a resize operation, but of course the kernel cannot always assume good intent on behalf of the application. As part of resizing the rings, existing SQEs and CQEs are copied over to the new storage. Resizing uses the masks in the newly allocated shared storage to index the arrays, however it's possible that malicious userspace could modify these after they have been sanity checked. Use the validated and locally stored CQ and SQ ring sizing for masking to ensure the values are both stable and valid. Fixes: 79cfe9e59c2a ("io_uring/register: add IORING_REGISTER_RESIZE_RINGS") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-01-15	hwmon: (drivetemp) Set scsi command timeout to 10s	Russell Harmon
	There's at least one drive (MaxDigitalData OOS14000G) such that if it receives a large amount of I/O while entering an idle power state will first exit idle before responding, including causing SMART temperature requests to be delayed. This causes the drivetemp request to exceed its timeout of 1 second. Signed-off-by: Russell Harmon <russ@har.mn> Link: https://lore.kernel.org/r/20250115131340.3178988-1-russ@har.mn Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-01-15	hwmon: (acpi_power_meter) Fix a check for the return value of ↵	Kazuhiro Abe
	read_domain_devices(). After commit fabb1f813ec0 ("hwmon: (acpi_power_meter) Fix fail to load module on platform without _PMD method"), the acpi_power_meter driver fails to load if the platform has _PMD method. To address this, add a check for successful read_domain_devices(). Tested on Nvidia Grace machine. Fixes: fabb1f813ec0 ("hwmon: (acpi_power_meter) Fix fail to load module on platform without _PMD method") Signed-off-by: Kazuhiro Abe <fj1078ii@aa.jp.fujitsu.com> Link: https://lore.kernel.org/r/20250115073532.3211000-1-fj1078ii@aa.jp.fujitsu.com [groeck: Dropped unnecessary () from expression] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-01-15	Merge tag 'reset-fixes-for-v6.13' of git://git.pengutronix.de/pza/linux into ↵	Arnd Bergmann
	arm/fixes Reset controller fixes for v6.13 * Fix rzg2l-usb-vbus-regulator lookup by assigning the proper of node to the allocated platform device in the rzg2l-usbphy-ctrl driver. * tag 'reset-fixes-for-v6.13' of git://git.pengutronix.de/pza/linux: reset: rzg2l-usbphy-ctrl: Assign proper of node to the allocated device Link: https://lore.kernel.org/r/20250113163642.1757160-1-p.zabel@pengutronix.de Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-01-15	s390/diag: Add memory topology information via diag310	Mete Durlu
	Introduce diag310 and memory topology related subcodes. Provide memory topology information obtanied from diag310 to userspace via diag ioctl. Signed-off-by: Mete Durlu <meted@linux.ibm.com> Reviewed-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
2025-01-15	afs: Fix the fallback handling for the YFS.RemoveFile2 RPC call	David Howells
	Fix a pair of bugs in the fallback handling for the YFS.RemoveFile2 RPC call: (1) Fix the abort code check to also look for RXGEN_OPCODE. The lack of this masks the second bug. (2) call->server is now not used for ordinary filesystem RPC calls that have an operation descriptor. Fix to use call->op->server instead. Fixes: e49c7b2f6de7 ("afs: Build an abstraction around an "operation" concept") Signed-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/109541.1736865963@warthog.procyon.org.uk cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-01-15	drm/xe/oa: Add missing VISACTL mux registers	Ashutosh Dixit
	Add missing VISACTL mux registers required for some OA config's (e.g. RenderPipeCtrl). Fixes: cdf02fe1a94a ("drm/xe/oa/uapi: Add/remove OA config perf ops") Cc: stable@vger.kernel.org Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250111021539.2920346-1-ashutosh.dixit@intel.com (cherry picked from commit c26f22dac3449d8a687237cdfc59a6445eb8f75a) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-01-15	drm/xe: make change ccs_mode a synchronous action	Maciej Patelczyk
	If ccs_mode is being modified via /sys/class/drm/cardX/device/tileY/gtY/ccs_mode the asynchronous reset is triggered and the write returns immediately. With that some test receive false information about number of CCS engines or even fail if they proceed without delay after changing the ccs_mode. Changing the ccs_mode change from async to sync to prevent failures in tests. Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Fixes: f3bc5bb4d53d ("drm/xe: Allow userspace to configure CCS mode") Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241211111727.1481476-3-maciej.patelczyk@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> (cherry picked from commit 480fb9806e2e073532f7786166287114c696b340) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-01-15	drm/xe: introduce xe_gt_reset and xe_gt_wait_for_reset	Maciej Patelczyk
	Add synchronous version gt reset as there are few places where it is expected. Also add a wait helper to wait until gt reset is done. Signed-off-by: Maciej Patelczyk <maciej.patelczyk@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Fixes: f3bc5bb4d53d ("drm/xe: Allow userspace to configure CCS mode") Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241211111727.1481476-2-maciej.patelczyk@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> (cherry picked from commit 155c77f45f63dd58a37eeb0896b0b140ab785836) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-01-15	drm/xe/guc: Adding steering info support for GuC register lists	Jesus Narvaez
	The guc_mmio_reg interface supports steering, but it is currently not implemented. This will allow the GuC to control steering of MMIO registers after save-restore and avoid reading from fused off MCR register instances. Fixes: 9c57bc08652a ("drm/xe/lnl: Drop force_probe requirement") Signed-off-by: Jesus Narvaez <jesus.narvaez@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241212190100.3768068-1-jesus.narvaez@intel.com (cherry picked from commit ee5a1321df90891d59d83b7c9d5b6c5b755d059d) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2025-01-15	vdso: Correct typo in PAGE_SHIFT comment	Haiyue Wang
	Signed-off-by: Haiyue Wang <haiyuewa@163.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20241228121518.80812-1-haiyuewa@163.com
2025-01-15	genirq: Provide IRQCHIP_MOVE_DEFERRED	Thomas Gleixner
	The logic of GENERIC_PENDING_IRQ is backwards for historical reasons. Most interrupt controllers allow to move the interrupt from arbitrary contexts. If GENERIC_PENDING_IRQ is enabled by an architecture to support a chip, which requires the affinity change to happen in interrupt context, all other chips have to be marked with IRQF_MOVE_PCNTXT. That's tedious and there is no real good reason for the extra flags in the irq descriptor and the irq data status fields. In fact the decision whether interrupts can be moved in arbitrary context or not is a property of the interrupt chip. To simplify adoption for RISC-V provide a new mechanism which is enabled via a config switch and allows to add a flag to irq_chip::flags to request that interrupt affinity changes are deferred. Setting the top level chip of an interrupt evaluates the flag and maps it into the existing logic. The config switch and the various PCNTXT flags are temporary until x86 is converted over to this scheme. This intermediate step also allows trivial backporting of the mechanism to plug the affinity change race of various RISC-V interrupt controllers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20241210103335.500314436@linutronix.de
2025-01-15	hexagon: Remove GENERIC_PENDING_IRQ leftover	Thomas Gleixner
	Commented out since 2011.... Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Brian Cain <bcain@quicinc.com> Link: https://lore.kernel.org/all/20241210103335.437630614@linutronix.de
2025-01-15	ARC: Remove GENERIC_PENDING_IRQ	Thomas Gleixner
	Nothing uses the actual functionality and the MCIP controller sets the flags which disables the deferred affinity change. The other interrupt controller does not support affinity setting at all. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Vineet Gupta <vgupta@kernel.org> # arch/arc/ Link: https://lore.kernel.org/all/20241210103335.373392568@linutronix.de
2025-01-15	genirq: Remove handle_enforce_irqctx() wrapper	Thomas Gleixner
	Now that it is unconditionally available, remove the wrapper. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20241210101811.561078243@linutronix.de
2025-01-15	genirq: Make handle_enforce_irqctx() unconditionally available	Thomas Gleixner
	Commit 1b57d91b969c ("irqchip/gic-v2, v3: Prevent SW resends entirely") sett the flag which enforces interrupt handling in interrupt context and prevents software base resends for ARM GIC v2/v3. But it missed that the helper function which checks the flag was hidden behind CONFIG_GENERIC_PENDING_IRQ, which is not set by ARM[64]. Make the helper unconditionally available so that the enforcement actually works. Fixes: 1b57d91b969c ("irqchip/gic-v2, v3: Prevent SW resends entirely") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20241210101811.497716609@linutronix.de
2025-01-15	irqchip: Plug a OF node reference leak in platform_irqchip_probe()	Joe Hattori
	platform_irqchip_probe() leaks a OF node when irq_init_cb() fails. Fix it by declaring par_np with the __free(device_node) cleanup construct. This bug was found by an experimental static analysis tool that I am developing. Fixes: f8410e626569 ("irqchip: Add IRQCHIP_PLATFORM_DRIVER_BEGIN/END and IRQCHIP_MATCH helper macros") Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20241215033945.3414223-1-joe@pf.is.s.u-tokyo.ac.jp
2025-01-15	selftests: net: Adapt ethtool mq tests to fix in qdisc graft	Victor Nogueira
	Because of patch[1] the graft behaviour changed So the command: tcq replace parent 100:1 handle 204: Is no longer valid and will not delete 100:4 added by command: tcq replace parent 100:4 handle 204: pfifo_fast So to maintain the original behaviour, this patch manually deletes 100:4 and grafts 100:1 Note: This change will also work fine without [1] [1] https://lore.kernel.org/netdev/20250111151455.75480-1-jhs@mojatatu.com/T/#u Signed-off-by: Victor Nogueira <victor@mojatatu.com> Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-15	irqchip/loongarch-avec: Add multi-nodes topology support	Tianyang Zhang
	avecintc_init() enables the Advanced Interrupt Controller (AVEC) of the boot CPU node, but nothing enables the AVEC on secondary nodes. Move the enablement to the CPU hotplug callback so that secondary nodes get the AVEC enabled too. In theory enabling it once per node would be sufficient, but redundant enabling does no hurt, so keep the code simple and do it unconditionally. Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://lore.kernel.org/all/20250111023704.17285-1-zhangtianyang@loongson.cn
2025-01-15	irqchip/ts4800: Replace seq_printf() by seq_puts()	Geert Uytterhoeven
	Simplify "seq_printf(p, "%s", ...)" to "seq_puts(p, ...)". Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/1ba5692126804f9e1ff062ac24939b24030b4f72.1733403985.git.geert+renesas@glider.be
2025-01-15	irqchip/ti-sci-inta : Add module build support	Nicolas Frayer
	Add module build support in Kconfig for the TI SCI interrupt aggregator driver. The driver's default build is built-in and it also depends on ARCH_K3 as the driver uses some 64 bit ops and should only be built for 64-bit platforms. Signed-off-by: Nicolas Frayer <nfrayer@baylibre.com> Signed-off-by: Guillaume La Roque <glaroque@baylibre.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Nishanth Menon <nm@ti.com> Link: https://lore.kernel.org/all/20241224-timodules-v4-2-c5e010f58e2c@baylibre.com
2025-01-15	irqchip/ti-sci-intr: Add module build support	Nicolas Frayer
	Add module build support in Kconfig for the TI SCI interrupt router driver. This driver depends on the TI sci firmware driver which aready supports module build. Signed-off-by: Nicolas Frayer <nfrayer@baylibre.com> Signed-off-by: Guillaume La Roque <glaroque@baylibre.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Nishanth Menon <nm@ti.com> Link: https://lore.kernel.org/all/20241224-timodules-v4-1-c5e010f58e2c@baylibre.com
2025-01-15	irqchip/irq-brcmstb-l2: Replace brcmstb_l2_mask_and_ack() by generic function	Dr. David Alan Gilbert
	Replace brcmstb_l2_mask_and_ack() by the generic irq_gc_mask_disable_and_ack_set(). brcmstb_l2_mask_and_ack() was added in commit 49aa6ef0b439 ("irqchip/brcmstb-l2: Remove some processing from the handler") in September 2017 with a comment saying it was actually generic and someone should add it to the generic code. commit 20608924cc2e ("genirq: generic chip: Add irq_gc_mask_disable_and_ack_set()") did that a few weeks later, however no one went back and took the brcmstb variant out. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/all/20241224001727.149337-1-linux@treblig.org
2025-01-15	irqchip: keystone: Use syscon_regmap_lookup_by_phandle_args	Krzysztof Kozlowski
	Use syscon_regmap_lookup_by_phandle_args() which is a wrapper over syscon_regmap_lookup_by_phandle() combined with getting the syscon argument. Except simpler code this annotates within one line that given phandle has arguments, so grepping for code would be easier. There is also no real benefit in printing errors on missing syscon argument, because this is done just too late: runtime check on static/build-time data. Dtschema and Devicetree bindings offer the static/build-time check for this already. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250111185414.183971-1-krzysztof.kozlowski@linaro.org
2025-01-15	irqchip/sunxi-nmi: Add missing SKIP_WAKE flag	Philippe Simons
	Some boards with Allwinner SoCs connect the PMIC's IRQ pin to the SoC's NMI pin instead of a normal GPIO. Since the power key is connected to the PMIC, and people expect to wake up a suspended system via this key, the NMI IRQ controller must stay alive when the system goes into suspend. Add the SKIP_WAKE flag to prevent the sunxi NMI controller from going to sleep, so that the power key can wake up those systems. [ tglx: Fixed up coding style ] Signed-off-by: Philippe Simons <simons.philippe@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250112123402.388520-1-simons.philippe@gmail.com
2025-01-15	irqchip/gic-v3-its: Don't enable interrupts in its_irq_set_vcpu_affinity()	Tomas Krcka
	The following call-chain leads to enabling interrupts in a nested interrupt disabled section: irq_set_vcpu_affinity() irq_get_desc_lock() raw_spin_lock_irqsave() <--- Disable interrupts its_irq_set_vcpu_affinity() guard(raw_spinlock_irq) <--- Enables interrupts when leaving the guard() irq_put_desc_unlock() <--- Warns because interrupts are enabled This was broken in commit b97e8a2f7130, which replaced the original raw_spin_[un]lock() pair with guard(raw_spinlock_irq). Fix the issue by using guard(raw_spinlock). [ tglx: Massaged change log ] Fixes: b97e8a2f7130 ("irqchip/gic-v3-its: Fix potential race condition in its_vlpi_prop_update()") Signed-off-by: Tomas Krcka <krckatom@amazon.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20241230150825.62894-1-krckatom@amazon.de
2025-01-15	irqchip/gic-v3: Handle CPU_PM_ENTER_FAILED correctly	Yogesh Lal
	When a CPU attempts to enter low power mode, it disables the redistributor and Group 1 interrupts and reinitializes the system registers upon wakeup. If the transition into low power mode fails, then the CPU_PM framework invokes the PM notifier callback with CPU_PM_ENTER_FAILED to allow the drivers to undo the state changes. The GIC V3 driver ignores CPU_PM_ENTER_FAILED, which leaves the GIC in disabled state. Handle CPU_PM_ENTER_FAILED in the same way as CPU_PM_EXIT to restore normal operation. [ tglx: Massage change log, add Fixes tag ] Fixes: 3708d52fc6bb ("irqchip: gic-v3: Implement CPU PM notifier") Signed-off-by: Yogesh Lal <quic_ylal@quicinc.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20241220093907.2747601-1-quic_ylal@quicinc.com
2025-01-15	drm/bridge: ite-it6263: Prevent error pointer dereference in probe()	Dan Carpenter
	If devm_i2c_new_dummy_device() fails then we were supposed to return an error code, but instead the function continues and will crash on the next line. Add the missing return statement. Fixes: 049723628716 ("drm/bridge: Add ITE IT6263 LVDS to HDMI converter") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Liu Ying <victor.liu@nxp.com> Signed-off-by: Liu Ying <victor.liu@nxp.com> Link: https://patchwork.freedesktop.org/patch/msgid/804a758b-f2e7-4116-b72d-29bc8905beed@stanley.mountain
2025-01-14	net: fec: handle page_pool_dev_alloc_pages error	Kevin Groeneveld
	The fec_enet_update_cbd function calls page_pool_dev_alloc_pages but did not handle the case when it returned NULL. There was a WARN_ON(!new_page) but it would still proceed to use the NULL pointer and then crash. This case does seem somewhat rare but when the system is under memory pressure it can happen. One case where I can duplicate this with some frequency is when writing over a smbd share to a SATA HDD attached to an imx6q. Setting /proc/sys/vm/min_free_kbytes to higher values also seems to solve the problem for my test case. But it still seems wrong that the fec driver ignores the memory allocation error and can crash. This commit handles the allocation error by dropping the current packet. Fixes: 95698ff6177b5 ("net: fec: using page pool to manage RX buffers") Signed-off-by: Kevin Groeneveld <kgroeneveld@lenbrook.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20250113154846.1765414-1-kgroeneveld@lenbrook.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14	net: netpoll: ensure skb_pool list is always initialized	John Sperbeck
	When __netpoll_setup() is called directly, instead of through netpoll_setup(), the np->skb_pool list head isn't initialized. If skb_pool_flush() is later called, then we hit a NULL pointer in skb_queue_purge_reason(). This can be seen with this repro, when CONFIG_NETCONSOLE is enabled as a module: ip tuntap add mode tap tap0 ip link add name br0 type bridge ip link set dev tap0 master br0 modprobe netconsole netconsole=4444@10.0.0.1/br0,9353@10.0.0.2/ rmmod netconsole The backtrace is: BUG: kernel NULL pointer dereference, address: 0000000000000008 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page ... ... ... Call Trace: <TASK> __netpoll_free+0xa5/0xf0 br_netpoll_cleanup+0x43/0x50 [bridge] do_netpoll_cleanup+0x43/0xc0 netconsole_netdev_event+0x1e3/0x300 [netconsole] unregister_netdevice_notifier+0xd9/0x150 cleanup_module+0x45/0x920 [netconsole] __se_sys_delete_module+0x205/0x290 do_syscall_64+0x70/0x150 entry_SYSCALL_64_after_hwframe+0x76/0x7e Move the skb_pool list setup and initial skb fill into __netpoll_setup(). Fixes: 221a9c1df790 ("net: netpoll: Individualize the skb pool") Signed-off-by: John Sperbeck <jsperbeck@google.com> Reviewed-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20250114011354.2096812-1-jsperbeck@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14	net: xilinx: axienet: Fix IRQ coalescing packet count overflow	Sean Anderson
	If coalesce_count is greater than 255 it will not fit in the register and will overflow. This can be reproduced by running # ethtool -C ethX rx-frames 256 which will result in a timeout of 0us instead. Fix this by checking for invalid values and reporting an error. Fixes: 8a3b7a252dca ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver") Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Reviewed-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com> Link: https://patch.msgid.link/20250113163001.2335235-1-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14	hwmon: (tmp513) Fix division of negative numbers	David Lechner
	Fix several issues with division of negative numbers in the tmp513 driver. The docs on the DIV_ROUND_CLOSEST macro explain that dividing a negative value by an unsigned type is undefined behavior. The driver was doing this in several places, i.e. data->shunt_uohms has type of u32. The actual "undefined" behavior is that it converts both values to unsigned before doing the division, for example: int ret = DIV_ROUND_CLOSEST(-100, 3U); results in ret == 1431655732 instead of -33. Furthermore the MILLI macro has a type of unsigned long. Multiplying a signed long by an unsigned long results in an unsigned long. So, we need to cast both MILLI and data data->shunt_uohms to long when using the DIV_ROUND_CLOSEST macro. Fixes: f07f9d2467f4 ("hwmon: (tmp513) Use SI constants from units.h") Fixes: 59dfa75e5d82 ("hwmon: Add driver for Texas Instruments TMP512/513 sensor chips.") Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://lore.kernel.org/r/20250114-fix-si-prefix-macro-sign-bugs-v1-1-696fd8d10f00@baylibre.com [groeck: Drop some continuation lines] Signed-off-by: Guenter Roeck <linux@roeck-us.net>
2025-01-14	nfp: bpf: prevent integer overflow in nfp_bpf_event_output()	Dan Carpenter
	The "sizeof(struct cmsg_bpf_event) + pkt_size + data_size" math could potentially have an integer wrapping bug on 32bit systems. Check for this and return an error. Fixes: 9816dd35ecec ("nfp: bpf: perf event output helpers support") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/6074805b-e78d-4b8a-bf05-e929b5377c28@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14	x86/fred: Fix the FRED RSP0 MSR out of sync with its per-CPU cache	Xin Li (Intel)
	The FRED RSP0 MSR is only used for delivering events when running userspace. Linux leverages this property to reduce expensive MSR writes and optimize context switches. The kernel only writes the MSR when about to run userspace and when the MSR has actually changed since the last time userspace ran. This optimization is implemented by maintaining a per-CPU cache of FRED RSP0 and then checking that against the value for the top of current task stack before running userspace. However cpu_init_fred_exceptions() writes the MSR without updating the per-CPU cache. This means that the kernel might return to userspace with MSR_IA32_FRED_RSP0==0 when it needed to point to the top of current task stack. This would induce a double fault (#DF), which is bad. A context switch after cpu_init_fred_exceptions() can paper over the issue since it updates the cached value. That evidently happens most of the time explaining how this bug got through. Fix the bug through resynchronizing the FRED RSP0 MSR with its per-CPU cache in cpu_init_fred_exceptions(). Fixes: fe85ee391966 ("x86/entry: Set FRED RSP0 on return to userspace instead of context switch") Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/20250110174639.1250829-1-xin%40zytor.com
2025-01-14	Merge tag 'seccomp-v6.13-rc8' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp fix from Kees Cook: "Fix a randconfig failure: - Unconditionally define stub for !CONFIG_SECCOMP (Linus Walleij)" * tag 'seccomp-v6.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: seccomp: Stub for !CONFIG_SECCOMP
2025-01-14	Merge branch '100GbE' of ↵	Jakub Kicinski
	git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Fix E825 initialization Grzegorz Nitka says: E825 products have incorrect initialization procedure, which may lead to initialization failures and register values. Fix E825 products initialization by adding correct sync delay, checking the PHY revision only for current PHY and adding proper destination device when reading port/quad. In addition, E825 uses PF ID for indexing per PF registers and as a primary PHY lane number, which is incorrect. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: Add correct PHY lane assignment ice: Fix ETH56G FC-FEC Rx offset value ice: Fix quad registers read on E825 ice: Fix E825 initialization ==================== Link: https://patch.msgid.link/20250113182840.3564250-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14	Merge branch 'mptcp-fixes-for-connect-selftest-flakes'	Jakub Kicinski
	Matthieu Baerts says: ==================== mptcp: fixes for connect selftest flakes Last week, Jakub reported [1] that the MPTCP Connect selftest was unstable. It looked like it started after the introduction of some fixes [2]. After analysis from Paolo, these patches revealed existing bugs, that should be fixed by the following patches. - Patch 1: Make sure ACK are sent when MPTCP-level window re-opens. In some corner cases, the other peer was not notified when more data could be sent. A fix for v5.11, but depending on a feature introduced in v5.19. - Patch 2: Fix spurious wake-up under memory pressure. In this situation, the userspace could be invited to read data not being there yet. A fix for v6.7. - Patch 3: Fix a false positive error when running the MPTCP Connect selftest with the "disconnect" cases. The userspace could disconnect the socket too soon, which would reset (MP_FASTCLOSE) the connection, interpreted as an error by the test. A fix for v5.17. Link: https://lore.kernel.org/20250107131845.5e5de3c5@kernel.org [1] Link: https://lore.kernel.org/20241230-net-mptcp-rbuf-fixes-v1-0-8608af434ceb@kernel.org [2] ==================== Link: https://patch.msgid.link/20250113-net-mptcp-connect-st-flakes-v1-0-0d986ee7b1b6@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14	selftests: mptcp: avoid spurious errors on disconnect	Paolo Abeni
	The disconnect test-case generates spurious errors: INFO: disconnect INFO: extra options: -I 3 -i /tmp/tmp.r43niviyoI 01 ns1 MPTCP -> ns1 (10.0.1.1:10000 ) MPTCP (duration 140ms) [FAIL] file received by server does not match (in, out): Unexpected revents: POLLERR/POLLNVAL(19) -rw-r--r-- 1 root root 10028676 Jan 10 10:47 /tmp/tmp.r43niviyoI.disconnect Trailing bytes are: ��\��R��!8��u2��5N% -rw------- 1 root root 9992290 Jan 10 10:47 /tmp/tmp.Os4UbnWbI1 Trailing bytes are: ��\��R��!8��u2��5N% 02 ns1 MPTCP -> ns1 (dead:beef:1::1:10001) MPTCP (duration 206ms) [ OK ] 03 ns1 MPTCP -> ns1 (dead:beef:1::1:10002) TCP (duration 31ms) [ OK ] 04 ns1 TCP -> ns1 (dead:beef:1::1:10003) MPTCP (duration 26ms) [ OK ] [FAIL] Tests of the full disconnection have failed Time: 2 seconds The root cause is actually in the user-space bits: the test program currently disconnects as soon as all the pending data has been spooled, generating an FASTCLOSE. If such option reaches the peer before the latter has reached the closed status, the msk socket will report an error to the user-space, as per protocol specification, causing the above failure. Address the issue explicitly waiting for all the relevant sockets to reach a closed status before performing the disconnect. Fixes: 05be5e273c84 ("selftests: mptcp: add disconnect tests") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250113-net-mptcp-connect-st-flakes-v1-3-0d986ee7b1b6@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14	mptcp: fix spurious wake-up on under memory pressure	Paolo Abeni
	The wake-up condition currently implemented by mptcp_epollin_ready() is wrong, as it could mark the MPTCP socket as readable even when no data are present and the system is under memory pressure. Explicitly check for some data being available in the receive queue. Fixes: 5684ab1a0eff ("mptcp: give rcvlowat some love") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250113-net-mptcp-connect-st-flakes-v1-2-0d986ee7b1b6@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14	mptcp: be sure to send ack when mptcp-level window re-opens	Paolo Abeni
	mptcp_cleanup_rbuf() is responsible to send acks when the user-space reads enough data to update the receive windows significantly. It tries hard to avoid acquiring the subflow sockets locks by checking conditions similar to the ones implemented at the TCP level. To avoid too much code duplication - the MPTCP protocol can't reuse the TCP helpers as part of the relevant status is maintained into the msk socket - and multiple costly window size computation, mptcp_cleanup_rbuf uses a rough estimate for the most recently advertised window size: the MPTCP receive free space, as recorded as at last-ack time. Unfortunately the above does not allow mptcp_cleanup_rbuf() to detect a zero to non-zero win change in some corner cases, skipping the tcp_cleanup_rbuf call and leaving the peer stuck. After commit ea66758c1795 ("tcp: allow MPTCP to update the announced window"), MPTCP has actually cheap access to the announced window value. Use it in mptcp_cleanup_rbuf() for a more accurate ack generation. Fixes: e3859603ba13 ("mptcp: better msk receive window updates") Cc: stable@vger.kernel.org Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/20250107131845.5e5de3c5@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250113-net-mptcp-connect-st-flakes-v1-1-0d986ee7b1b6@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14	drm/v3d: Ensure job pointer is set to NULL after job completion	Maíra Canal
	After a job completes, the corresponding pointer in the device must be set to NULL. Failing to do so triggers a warning when unloading the driver, as it appears the job is still active. To prevent this, assign the job pointer to NULL after completing the job, indicating the job has finished. Fixes: 14d1d1908696 ("drm/v3d: Remove the bad signaled() implementation.") Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250113154741.67520-1-mcanal@igalia.com