linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-08-05	Input: cyttsp4 - remove driver	Dmitry Torokhov
	The cyttsp4 touchscreen driver was contributed in 2013 and since then has seen no updates. The driver uses platform data (no device tree support) and there are no users of it in the mainline kernel. There were occasional fixes to it for issues either found by static code analysis tools or via visual inspection, but otherwise the driver is completely untested. Remove the driver. Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://lore.kernel.org/r/ZrAZ2cUow_z838tp@google.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2024-08-05	hid-asus: use hid for brightness control on keyboard	Luke D. Jones
	On almost all ASUS ROG series laptops the MCU used for the USB keyboard also has a HID packet used for setting the brightness. This is usually the same as the WMI method. But in some laptops the WMI method either is missing or doesn't work, so we should default to the HID control. Signed-off-by: Luke D. Jones <luke@ljones.dev> Acked-by: Benjamin Tissoires <bentiss@kernel.org> Link: https://lore.kernel.org/r/20240713074733.77334-2-luke@ljones.dev Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-08-05	cleanup: Add usage and style documentation	Dan Williams
	When proposing that PCI grow some new cleanup helpers for pci_dev_put() and pci_dev_{lock,unlock} [1], Bjorn had some fundamental questions about expectations and best practices. Upon reviewing an updated changelog with those details he recommended adding them to documentation in the header file itself. Add that documentation and link it into the rendering for Documentation/core-api/. Signed-off-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/171175585714.2192972.12661675876300167762.stgit@dwillia2-xfh.jf.intel.com
2024-08-05	pmdomain: core: Enable s2idle for CPU PM domains on PREEMPT_RT	Ulf Hansson
	To allow a genpd provider for a CPU PM domain to enter a domain-idle-state during s2idle on a PREEMPT_RT based configuration, we can't use the regular spinlock, as they are turned into sleepable locks on PREEMPT_RT. To address this problem, let's convert into using the raw spinlock, but only for genpd providers that have the GENPD_FLAG_CPU_DOMAIN bit set. In this way, the lock can still be acquired/released in atomic context, which is needed in the idle-path for PREEMPT_RT. Do note that the genpd power-on/off notifiers may also be fired during s2idle, but these are already prepared for PREEMPT_RT as they are based on the raw notifiers. However, consumers of them may need to adopt accordingly to work properly on PREEMPT_RT. Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Tested-by: Raghavendra Kakarla <quic_rkakarla@quicinc.com> # qcm6490 with PREEMPT_RT set Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20240527142557.321610-2-ulf.hansson@linaro.org
2024-08-05	bpf: kprobe: remove unused declaring of bpf_kprobe_override	Menglong Dong
	After the commit 66665ad2f102 ("tracing/kprobe: bpf: Compare instruction pointer with original one"), "bpf_kprobe_override" is not used anywhere anymore, and we can remove it now. Link: https://lore.kernel.org/all/20240710085939.11520-1-dongml2@chinatelecom.cn/ Fixes: 66665ad2f102 ("tracing/kprobe: bpf: Compare instruction pointer with original one") Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-08-04	profiling: remove profile=sleep support	Tetsuo Handa
	The kernel sleep profile is no longer working due to a recursive locking bug introduced by commit 42a20f86dc19 ("sched: Add wrapper for get_wchan() to keep task blocked") Booting with the 'profile=sleep' kernel command line option added or executing # echo -n sleep > /sys/kernel/profiling after boot causes the system to lock up. Lockdep reports kthreadd/3 is trying to acquire lock: ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: get_wchan+0x32/0x70 but task is already holding lock: ffff93ac82e08d58 (&p->pi_lock){....}-{2:2}, at: try_to_wake_up+0x53/0x370 with the call trace being lock_acquire+0xc8/0x2f0 get_wchan+0x32/0x70 __update_stats_enqueue_sleeper+0x151/0x430 enqueue_entity+0x4b0/0x520 enqueue_task_fair+0x92/0x6b0 ttwu_do_activate+0x73/0x140 try_to_wake_up+0x213/0x370 swake_up_locked+0x20/0x50 complete+0x2f/0x40 kthread+0xfb/0x180 However, since nobody noticed this regression for more than two years, let's remove 'profile=sleep' support based on the assumption that nobody needs this functionality. Fixes: 42a20f86dc19 ("sched: Add wrapper for get_wchan() to keep task blocked") Cc: stable@vger.kernel.org # v5.16+ Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-08-04	Merge branch 'sched/core' of ↵	Tejun Heo
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into for-6.12 Pull tip/sched/core to resolve the following four conflicts. While 2-4 are simple context conflicts, 1 is a bit subtle and easy to resolve incorrectly. 1. 2c8d046d5d51 ("sched: Add normal_policy()") vs. faa42d29419d ("sched/fair: Make SCHED_IDLE entity be preempted in strict hierarchy") The former converts direct test on p->policy to use the helper normal_policy(). The latter moves the p->policy test to a different location. Resolve by converting the test on p->plicy in the new location to use normal_policy(). 2. a7a9fc549293 ("sched_ext: Add boilerplate for extensible scheduler class") vs. a110a81c52a9 ("sched/deadline: Deferrable dl server") Both add calls to put_prev_task_idle() and set_next_task_idle(). Simple context conflict. Resolve by taking changes from both. 3. a7a9fc549293 ("sched_ext: Add boilerplate for extensible scheduler class") vs. c245910049d0 ("sched/core: Add clearing of ->dl_server in put_prev_task_balance()") The former changes for_each_class() itertion to use for_each_active_class(). The latter moves away the adjacent dl_server handling code. Simple context conflict. Resolve by taking changes from both. 4. 60c27fb59f6c ("sched_ext: Implement sched_ext_ops.cpu_online/offline()") vs. 31b164e2e4af ("sched/smt: Introduce sched_smt_present_inc/dec() helper") 2f027354122f ("sched/core: Introduce sched_set_rq_on/offline() helper") The former adds scx_rq_deactivate() call. The latter two change code around it. Simple context conflict. Resolve by taking changes from both. Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-04	i2c: Fix conditional for substituting empty ACPI functions	Richard Fitzgerald
	Add IS_ENABLED(CONFIG_I2C) to the conditional around a bunch of ACPI functions. The conditional around these functions depended only on CONFIG_ACPI. But the functions are implemented in I2C core, so are only present if CONFIG_I2C is enabled. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
2024-08-04	pinctrl: pinconf-generic: Add support for "input-schmitt-microvolt" property	Inochi Amaoto
	Add "input-schmitt-microvolt" property to generic options used for DT parsing files. This enables drivers, which use generic pin configurations, to get the value passed to this property. Signed-off-by: Inochi Amaoto <inochiama@outlook.com> Link: https://lore.kernel.org/IA1PR20MB4953806785BA04E075DC4F03BBAC2@IA1PR20MB4953.namprd20.prod.outlook.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2024-08-03	iio: backend: add a modified prbs23 support	Nuno Sa
	Support ADI specific prb23 sequence that can be used both for calibrating or debugging digital interfaces. Signed-off-by: Nuno Sa <nuno.sa@analog.com> Link: https://patch.msgid.link/20240802-dev-iio-backend-add-debugfs-v2-3-4cb62852f0d0@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-08-03	iio: backend: add debugFs interface	Nuno Sa
	This adds a basic debugfs interface for backends. Two new ops are being added: * debugfs_reg_access: Analogous to the core IIO one but for backend devices. * debugfs_print_chan_status: One useful usecase for this one is for testing test tones in a digital interface and "ask" the backend to dump more details on why a test tone might have errors. Signed-off-by: Nuno Sa <nuno.sa@analog.com> Link: https://patch.msgid.link/20240802-dev-iio-backend-add-debugfs-v2-2-4cb62852f0d0@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-08-03	iio: backend: introduce struct iio_backend_info	Nuno Sa
	Instead of only passing the backend ops when calling devm_iio_backend_register(), pass an info like structure that will contains the ops and additional information. Fow now, the backend name is being added as that will be used by the debugFS interface introduced in a later patch. It also opens the door for further customizations passed by backends. All users of devm_iio_backend_register() were updated accordingly. Signed-off-by: Nuno Sa <nuno.sa@analog.com> Link: https://patch.msgid.link/20240802-dev-iio-backend-add-debugfs-v2-1-4cb62852f0d0@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-08-03	iio: core: add function to retrieve active_scan_mask index	Julien Stephan
	Add a function to retrieve the index of the active scan mask inside the available scan masks array. As in iio_scan_mask_match and iio_sanity_check_avail_scan_masks, this function does not handle multi-long masks correctly. It only checks the first long to be zero, and will use such mask as a terminator even if there was bits set after the first long. This should be fine since the available_scan_mask has already been sanity tested using iio_sanity_check_avail_scan_masks. See iio_scan_mask_match and iio_sanity_check_avail_scan_masks for more details Signed-off-by: Julien Stephan <jstephan@baylibre.com> Reviewed-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20240731-ad7380-add-single-ended-chips-v2-2-cd63bf05744c@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-08-03	Merge tag 'spi-mosi-config' into togreg	Jonathan Cameron
	spi: Support MOSI idle configuration Add support for configuring the idle state of the MOSI signal in controllers.
2024-08-03	iio: core: annotate masklength as __private	Nuno Sa
	Now that all users are using the proper accessors, we can mark masklength as __private so that no one tries to write. We also get help from checkers in warning us in case someone does it. To access the private field from IIO core code, we need to use the ACCESS_PRIVATE() macro. Signed-off-by: Nuno Sa <nuno.sa@analog.com> Link: https://patch.msgid.link/20240726-dev-iio-masklength-private3-v1-23-82913fc0fb87@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-08-03	iio: backend: spelling: continuous -> continuous	David Lechner
	This fixes the spelling in IIO_BACKEND_INTERNAL_CONTINUOUS_WAVE. Signed-off-by: David Lechner <dlechner@baylibre.com> Link: https://patch.msgid.link/20240726-iio-backend-spelling-continuous-v1-1-467c6e3f78ff@baylibre.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-08-03	iio: backend: remove unused parameter	Nuno Sa
	Indio_dev was not being used in iio_backend_extend_chan_spec() so remove it. Signed-off-by: Nuno Sa <nuno.sa@analog.com> Link: https://patch.msgid.link/20240709-dev-iio-backend-add-debugfs-v1-1-fb4b8f2373c7@analog.com Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
2024-08-02	linkmode: Change return type of linkmode_andnot to bool	Simon Horman
	linkmode_andnot() simply returns the result of bitmap_andnot(). And the return type of bitmap_andnot() is bool. So it makes sense for the return type of linkmode_andnot() to also be bool. I checked all call-sites and they either ignore the return value or treat it as a bool. Compile tested only. Link: https://lore.kernel.org/netdev/68088998-4486-4930-90a4-96a32f08c490@lunn.ch/ Signed-off-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240801-linkfield-bowl-v1-1-d58f68967802@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-02	net: remove IFF_* re-definition	Jakub Kicinski
	We re-define values of enum netdev_priv_flags as preprocessor macros with the same name. I guess this was done to avoid breaking out of tree modules which may use #ifdef X for kernel compatibility? Commit 7aa98047df95 ("net: move net_device priv_flags out from UAPI") which added the enum doesn't say. In any case, the flags with defines are quite old now, and defines for new flags don't get added. OOT drivers have to resort to code greps for compat detection, anyway. Let's delete these defines, save LoC, help LXR link to the right place. Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20240801163401.378723-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-02	spi: Add dummy definitions for ACPI lookup functions	Mark Brown
	Merge series from Richard Fitzgerald <rf@opensource.cirrus.com>: Provide empty versions of acpi_spi_count_resources(), acpi_spi_device_alloc() and acpi_spi_find_controller_by_adev() if the real functions are not being built. This commit fixes two problems with the original definitions: 1) There wasn't an empty version of these functions 2) The #if only depended on CONFIG_ACPI. But the functions are implemented in the core spi.c so CONFIG_SPI_MASTER must also be enabled for the real functions to exist.
2024-08-02	sched_ext: Allow p->scx.disallow only while loading	Tejun Heo
	From 1232da7eced620537a78f19c8cf3d4a3508e2419 Mon Sep 17 00:00:00 2001 From: Tejun Heo <tj@kernel.org> Date: Wed, 31 Jul 2024 09:14:52 -1000 p->scx.disallow provides a way for the BPF scheduler to reject certain tasks from attaching. It's currently allowed for both the load and fork paths; however, the latter doesn't actually work as p->sched_class is already set by the time scx_ops_init_task() is called during fork. This is a convenience feature which is mostly useful from the load path anyway. Allow it only from the load path. v2: Trigger scx_ops_error() iff @p->policy == SCHED_EXT to make it a bit easier for the BPF scheduler (David). Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: "Zhangqiao (2012 lab)" <zhangqiao22@huawei.com> Link: http://lkml.kernel.org/r/20240711110720.1285-1-zhangqiao22@huawei.com Fixes: 7bb6f0810ecf ("sched_ext: Allow BPF schedulers to disallow specific tasks from joining SCHED_EXT") Acked-by: David Vernet <void@manifault.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-08-02	spi: Add empty versions of ACPI functions	Richard Fitzgerald
	Provide empty versions of acpi_spi_count_resources(), acpi_spi_device_alloc() and acpi_spi_find_controller_by_adev() if the real functions are not being built. This commit fixes two problems with the original definitions: 1) There wasn't an empty version of these functions 2) The #if only depended on CONFIG_ACPI. But the functions are implemented in the core spi.c so CONFIG_SPI_MASTER must also be enabled for the real functions to exist. Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://patch.msgid.link/20240802152215.20831-2-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-08-02	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds
	Pull kvm updates from Paolo Bonzini: "The bulk of the changes here is a largish change to guest_memfd, delaying the clearing and encryption of guest-private pages until they are actually added to guest page tables. This started as "let's make it impossible to misuse the API" for SEV-SNP; but then it ballooned a bit. The new logic is generally simpler and more ready for hugepage support in guest_memfd. Summary: - fix latent bug in how usage of large pages is determined for confidential VMs - fix "underline too short" in docs - eliminate log spam from limited APIC timer periods - disallow pre-faulting of memory before SEV-SNP VMs are initialized - delay clearing and encrypting private memory until it is added to guest page tables - this change also enables another small cleanup: the checks in SNP_LAUNCH_UPDATE that limit it to non-populated, private pages can now be moved in the common kvm_gmem_populate() function - fix compilation error that the RISC-V merge introduced in selftests" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86/mmu: fix determination of max NPT mapping level for private pages KVM: riscv: selftests: Fix compile error KVM: guest_memfd: abstract how prepared folios are recorded KVM: guest_memfd: let kvm_gmem_populate() operate only on private gfns KVM: extend kvm_range_has_memory_attributes() to check subset of attributes KVM: cleanup and add shortcuts to kvm_range_has_memory_attributes() KVM: guest_memfd: move check for already-populated page to common code KVM: remove kvm_arch_gmem_prepare_needed() KVM: guest_memfd: make kvm_gmem_prepare_folio() operate on a single struct kvm KVM: guest_memfd: delay kvm_gmem_prepare_folio() until the memory is passed to the guest KVM: guest_memfd: return locked folio from __kvm_gmem_get_pfn KVM: rename CONFIG_HAVE_KVM_GMEM_* to CONFIG_HAVE_KVM_ARCH_GMEM_* KVM: guest_memfd: do not go through struct page KVM: guest_memfd: delay folio_mark_uptodate() until after successful preparation KVM: guest_memfd: return folio from __kvm_gmem_get_pfn() KVM: x86: disallow pre-fault for SNP VMs before initialization KVM: Documentation: Fix title underline too short warning KVM: x86: Eliminate log spam from limited APIC timer periods
2024-08-02	Merge branch 'kvm-fixes' into HEAD	Paolo Bonzini
	* fix latent bug in how usage of large pages is determined for confidential VMs * fix "underline too short" in docs * eliminate log spam from limited APIC timer periods * disallow pre-faulting of memory before SEV-SNP VMs are initialized * delay clearing and encrypting private memory until it is added to guest page tables * this change also enables another small cleanup: the checks in SNP_LAUNCH_UPDATE that limit it to non-populated, private pages can now be moved in the common kvm_gmem_populate() function
2024-08-02	Merge tag 'riscv-for-linus-6.11-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A fix to avoid dropping some of the internal pseudo-extensions, which breaks envcfg dependency parsing - The kernel entry address is now aligned in purgatory, which avoids a misaligned load that can lead to crash on systems that don't support misaligned accesses early in boot - The FW_SFENCE_VMA_RECEIVED perf event was duplicated in a handful of perf JSON configurations, one of them been updated to FW_SFENCE_VMA_ASID_SENT - The starfive cache driver is now restricted to 64-bit systems, as it isn't 32-bit clean - A fix for to avoid aliasing legacy-mode perf counters with software perf counters - VM_FAULT_SIGSEGV is now handled in the page fault code - A fix for stalls during CPU hotplug due to IPIs being disabled - A fix for memblock bounds checking. This manifests as a crash on systems with discontinuous memory maps that have regions that don't fit in the linear map tag 'riscv-for-linus-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: Fix linear mapping checks for non-contiguous memory regions RISC-V: Enable the IPI before workqueue_online_cpu() riscv/mm: Add handling for VM_FAULT_SIGSEGV in mm_fault_error() perf: riscv: Fix selecting counters in legacy mode cache: StarFive: Require a 64-bit system perf arch events: Fix duplicate RISC-V SBI firmware event name riscv/purgatory: align riscv_kernel_entry riscv: cpufeature: Do not drop Linux-internal extensions
2024-08-02	clockevents/drivers/i8253: Fix stop sequence for timer 0	David Woodhouse
	According to the data sheet, writing the MODE register should stop the counter (and thus the interrupts). This appears to work on real hardware, at least modern Intel and AMD systems. It should also work on Hyper-V. However, on some buggy virtual machines the mode change doesn't have any effect until the counter is subsequently loaded (or perhaps when the IRQ next fires). So, set MODE 0 and then load the counter, to ensure that those buggy VMs do the right thing and the interrupts stop. And then write MODE 0 again to stop the counter on compliant implementations too. Apparently, Hyper-V keeps firing the IRQ repeatedly even in mode zero when it should only happen once, but the second MODE write stops that too. Userspace test program (mostly written by tglx): ===== #include <stdio.h> #include <unistd.h> #include <stdlib.h> #include <stdint.h> #include <sys/io.h> static __always_inline void __out##bwl(type value, uint16_t port) \ { \ asm volatile("out" #bwl " %" #bw "0, %w1" \ : : "a"(value), "Nd"(port)); \ } \ \ static __always_inline type __in##bwl(uint16_t port) \ { \ type value; \ asm volatile("in" #bwl " %w1, %" #bw "0" \ : "=a"(value) : "Nd"(port)); \ return value; \ } BUILDIO(b, b, uint8_t) #define inb __inb #define outb __outb #define PIT_MODE 0x43 #define PIT_CH0 0x40 #define PIT_CH2 0x42 static int is8254; static void dump_pit(void) { if (is8254) { // Latch and output counter and status outb(0xC2, PIT_MODE); printf("%02x %02x %02x\n", inb(PIT_CH0), inb(PIT_CH0), inb(PIT_CH0)); } else { // Latch and output counter outb(0x0, PIT_MODE); printf("%02x %02x\n", inb(PIT_CH0), inb(PIT_CH0)); } } int main(int argc, char* argv[]) { int nr_counts = 2; if (argc > 1) nr_counts = atoi(argv[1]); if (argc > 2) is8254 = 1; if (ioperm(0x40, 4, 1) != 0) return 1; dump_pit(); printf("Set oneshot\n"); outb(0x38, PIT_MODE); outb(0x00, PIT_CH0); outb(0x0F, PIT_CH0); dump_pit(); usleep(1000); dump_pit(); printf("Set periodic\n"); outb(0x34, PIT_MODE); outb(0x00, PIT_CH0); outb(0x0F, PIT_CH0); dump_pit(); usleep(1000); dump_pit(); dump_pit(); usleep(100000); dump_pit(); usleep(100000); dump_pit(); printf("Set stop (%d counter writes)\n", nr_counts); outb(0x30, PIT_MODE); while (nr_counts--) outb(0xFF, PIT_CH0); dump_pit(); usleep(100000); dump_pit(); usleep(100000); dump_pit(); printf("Set MODE 0\n"); outb(0x30, PIT_MODE); dump_pit(); usleep(100000); dump_pit(); usleep(100000); dump_pit(); return 0; } ===== Suggested-by: Sean Christopherson <seanjc@google.com> Co-developed-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: Li RongQing <lirongqing@baidu.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Michael Kelley <mhkelley@outlook.com> Link: https://lore.kernel.org/all/20240802135555.564941-2-dwmw2@infradead.org
2024-08-02	x86/i8253: Disable PIT timer 0 when not in use	David Woodhouse
	Leaving the PIT interrupt running can cause noticeable steal time for virtual guests. The VMM generally has a timer which toggles the IRQ input to the PIC and I/O APIC, which takes CPU time away from the guest. Even on real hardware, running the counter may use power needlessly (albeit not much). Make sure it's turned off if it isn't going to be used. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Michael Kelley <mhkelley@outlook.com> Link: https://lore.kernel.org/all/20240802135555.564941-1-dwmw2@infradead.org
2024-08-02	cpufreq: Remove LATENCY_MULTIPLIER	Qais Yousef
	The current LATENCY_MULTIPLIER which has been around for nearly 20 years causes rate_limit_us to be always in ms range. On M1 mac mini I get 50 and 56us transition latency, but due to the 1000 multiplier we end up setting rate_limit_us to 50 and 56ms, which gets capped into 2ms and was 10ms before e13aa799c2a6 ("cpufreq: Change default transition delay to 2ms") On Intel I5 system transition latency is 20us but due to the multiplier we end up with 20ms that again is capped to 2ms. Given how good modern hardware and how modern workloads require systems to be more responsive to cater for sudden changes in workload (tasks sleeping/wakeup/migrating, uclamp causing a sudden boost or cap) and that 2ms is quarter of the time of 120Hz refresh rate system, drop the old logic in favour of providing 50% headroom. rate_limit_us = 1.5 * latency. I considered not adding any headroom which could mean that we can end up with infinite back-to-back requests. I also considered providing a constant headroom (e.g: 100us) assuming that any h/w or f/w dealing with the request shouldn't require a large headroom when transition_latency is actually high. But for both cases I wasn't sure if h/w or f/w can end up being overwhelmed dealing with the freq requests in a potentially busy system. So I opted for providing 50% breathing room. This is expected to impact schedutil only as the other user, dbs_governor, takes the max(2*tick, transition_delay_us) and the former was at least 2ms on 1ms TICK, which is equivalent to the max_delay_us before applying this patch. For systems with TICK of 4ms, this value would have almost always ended up with 8ms sampling rate. For systems that report 0 transition latency, we still default to returning 1ms as transition delay. This helps in eliminating a source of latency for applying requests as mentioned in [1]. For example if we have a 1ms tick, most systems will miss sending an update at tick when updating the util_avg for a task/CPU (rate_limit_us will be 2ms for most systems). Link: https://lore.kernel.org/lkml/20240724212255.mfr2ybiv2j2uqek7@airbuntu/ # [1] Link: https://lore.kernel.org/lkml/20240205022500.2232124-1-qyousef@layalina.io/ Signed-off-by: Qais Yousef <qyousef@layalina.io> Link: https://patch.msgid.link/20240728192659.58115-1-qyousef@layalina.io [ rjw: Subject edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-02	thermal: trip: Drop thermal_zone_get_trip()	Rafael J. Wysocki
	There are no more callers of thermal_zone_get_trip() in the tree, so drop it. No functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/2220301.Mh6RI2rZIc@rjwysocki.net
2024-08-02	thermal: trip: Get rid of thermal_zone_get_num_trips()	Rafael J. Wysocki
	The only existing caller of thermal_zone_get_num_trips(), which is rcar_gen3_thermal_probe(), uses this function to put the number of trip points into a kernel log message, but this information is also available from the thermal sysfs interface. For this reason, remove the thermal_zone_get_num_trips() call from rcar_gen3_thermal_probe() and drop the former altogether. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Link: https://patch.msgid.link/2636988.Lt9SDvczpP@rjwysocki.net
2024-08-02	HID: core: add helper for finding a field with a certain usage	Kerem Karabay
	This helper will allow HID drivers to easily determine if they should bind to a hid_device by checking for the prescence of a certain field when its ID is not enough, which can be the case on USB devices with multiple interfaces and/or configurations. Convert google-hammer driver to use it, and remove now superfluous hammer_has_usage(). [jkosina@suse.com: expand changelog with the information about google-hammer being added as user of this API ] Signed-off-by: Kerem Karabay <kekrby@gmail.com> Signed-off-by: Aditya Garg <gargaditya08@live.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2024-08-02	uprobes: make uprobe_register() return struct uprobe *	Oleg Nesterov
	This way uprobe_unregister() and uprobe_apply() can use "struct uprobe *" rather than inode + offset. This simplifies the code and allows to avoid the unnecessary find_uprobe() + put_uprobe() in these functions. TODO: uprobe_unregister() still needs get_uprobe/put_uprobe to ensure that this uprobe can't be freed before up_write(&uprobe->register_rwsem). Co-developed-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20240801132734.GA8803@redhat.com
2024-08-02	uprobes: kill uprobe_register_refctr()	Oleg Nesterov
	It doesn't make any sense to have 2 versions of _register(). Note that trace_uprobe_enable(), the only user of uprobe_register(), doesn't need to check tu->ref_ctr_offset to decide which one should be used, it could safely pass ref_ctr_offset == 0 to uprobe_register_refctr(). Add this argument to uprobe_register(), update the callers, and kill uprobe_register_refctr(). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20240801132728.GA8800@redhat.com
2024-08-02	perf,x86: avoid missing caller address in stack traces captured in uprobe	Andrii Nakryiko
	When tracing user functions with uprobe functionality, it's common to install the probe (e.g., a BPF program) at the first instruction of the function. This is often going to be `push %rbp` instruction in function preamble, which means that within that function frame pointer hasn't been established yet. This leads to consistently missing an actual caller of the traced function, because perf_callchain_user() only records current IP (capturing traced function) and then following frame pointer chain (which would be caller's frame, containing the address of caller's caller). So when we have target_1 -> target_2 -> target_3 call chain and we are tracing an entry to target_3, captured stack trace will report target_1 -> target_3 call chain, which is wrong and confusing. This patch proposes a x86-64-specific heuristic to detect `push %rbp` (`push %ebp` on 32-bit architecture) instruction being traced. Given entire kernel implementation of user space stack trace capturing works under assumption that user space code was compiled with frame pointer register (%rbp/%ebp) preservation, it seems pretty reasonable to use this instruction as a strong indicator that this is the entry to the function. In that case, return address is still pointed to by %rsp/%esp, so we fetch it and add to stack trace before proceeding to unwind the rest using frame pointer-based logic. We also check for `endbr64` (for 64-bit modes) as another common pattern for function entry, as suggested by Josh Poimboeuf. Even if we get this wrong sometimes for uprobes attached not at the function entry, it's OK because stack trace will still be overall meaningful, just with one extra bogus entry. If we don't detect this, we end up with guaranteed to be missing caller function entry in the stack trace, which is worse overall. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20240729175223.23914-1-andrii@kernel.org
2024-08-02	perf: Support PERF_SAMPLE_READ with inherit	Ben Gainey
	This change allows events to use PERF_SAMPLE_READ with inherit so long as PERF_SAMPLE_TID is also set. This enables sample based profiling of a group of counters over a hierarchy of processes or threads. This is useful, for example, for collecting per-thread counters/metrics, event based sampling of multiple counters as a unit, access to the enabled and running time when using multiplexing and so on. Prior to this, users were restricted to either collecting aggregate statistics for a multi-threaded/-process application (e.g. with "perf stat"), or to sample individual threads, or to profile the entire system (which requires root or CAP_PERFMON, and may produce much more data than is required). Theoretically a tool could poll for or otherwise monitor thread/process creation and construct whatever events the user is interested in using perf_event_open, for each new thread or process, but this is racy, can lead to file-descriptor exhaustion, and ultimately just replicates the behaviour of inherit, but in userspace. This configuration differs from inherit without PERF_SAMPLE_READ in that the accumulated event count, and consequently any sample (such as if triggered by overflow of sample_period) will be on a per-thread rather than on an aggregate basis. The meaning of read_format::value field of both PERF_RECORD_READ and PERF_RECORD_SAMPLE is changed such that if the sampled event uses this new configuration then the values reported will be per-thread rather than the global aggregate value. This is a change from the existing semantics of read_format (where PERF_SAMPLE_READ is used without inherit), but it is necessary to expose the per-thread counter values, and it avoids reinventing a separate "read_format_thread" field that otherwise replicates the same behaviour. This change should not break existing tools, since this configuration was not previously valid and was rejected by the kernel. Tools that opt into this new mode will need to account for this when calculating the counter delta for a given sample. Tools that wish to have both the per-thread and aggregate value can perform the global aggregation themselves from the per-thread values. The change to read_format::value does not affect existing valid perf_event_attr configurations, nor does it change the behaviour of calls to "read" on an event descriptor. Both continue to report the aggregate value for the entire thread/process hierarchy. The difference between the results reported by "read" and PERF_RECORD_SAMPLE in this new configuration is justified on the basis that it is not (easily) possible for "read" to target a specific thread (the caller only has the fd for the original parent event). Signed-off-by: Ben Gainey <ben.gainey@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20240730084417.7693-3-ben.gainey@arm.com
2024-08-02	perf: Rename perf_event_context.nr_pending to nr_no_switch_fast.	Ben Gainey
	nr_pending counts the number of events in the context that either pending_sigtrap or pending_work, but it is used to prevent taking the fast path in perf_event_context_sched_out. Renamed to reflect what it is used for, rather than what it counts. This change allows using the field to track other event properties that also require skipping the fast path without possible confusion over the name. Signed-off-by: Ben Gainey <ben.gainey@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20240730084417.7693-2-ben.gainey@arm.com
2024-08-02	vsock/virtio: add SIOCOUTQ support for all virtio based transports	Luigi Leonardi
	Introduce support for virtio_transport_unsent_bytes ioctl for virtio_transport, vhost_vsock and vsock_loopback. For all transports the unsent bytes counter is incremented in virtio_transport_get_credit. In virtio_transport (G2H) and in vhost-vsock (H2G) the counter is decremented when the skbuff is consumed. In vsock_loopback the same skbuff is passed from the transmitter to the receiver, so the counter is decremented before queuing the skbuff to the receiver. Signed-off-by: Luigi Leonardi <luigi.leonardi@outlook.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-02	ata: libata: Print device quirks only once	Damien Le Moal
	In ata_dev_print_quirks(), return early if ata_dev_print_info() returns false or if we already printed quirk information. This is to avoid printing a device quirks multiple times (that is, each time ata_dev_revalidate() is called). To remember if ata_dev_print_quirks() was already executed, define the EH context flag ATA_EHI_DID_PRINT_QUIRKS and set this flag in ata_dev_print_quirks(). Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Fixes: 58157d607aec ("ata: libata: Print quirks applied to devices") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
2024-08-02	ata: libata: Remove ata_noop_qc_prep()	Damien Le Moal
	The function ata_noop_qc_prep(), as its name implies, does nothing and simply returns AC_ERR_OK. For drivers that do not need any special preparations of queued commands, we can avoid having to define struct ata_port qc_prep operation by simply testing if that operation is defined or not in ata_qc_issue(). Make this change and remove ata_noop_qc_prep(). Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
2024-08-01	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR. No conflicts or adjacent changes. Link: https://patch.msgid.link/20240801131917.34494-1-pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-01	Merge tag 'net-6.11-rc2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from wireless, bleutooth, BPF and netfilter. Current release - regressions: - core: drop bad gso csum_start and offset in virtio_net_hdr - wifi: mt76: fix null pointer access in mt792x_mac_link_bss_remove - eth: tun: add missing bpf_net_ctx_clear() in do_xdp_generic() - phy: aquantia: only poll GLOBAL_CFG regs on aqr113, aqr113c and aqr115c Current release - new code bugs: - smc: prevent UAF in inet_create() - bluetooth: btmtk: fix kernel crash when entering btmtk_usb_suspend - eth: bnxt: reject unsupported hash functions Previous releases - regressions: - sched: act_ct: take care of padding in struct zones_ht_key - netfilter: fix null-ptr-deref in iptable_nat_table_init(). - tcp: adjust clamping window for applications specifying SO_RCVBUF Previous releases - always broken: - ethtool: rss: small fixes to spec and GET - mptcp: - fix signal endpoint re-add - pm: fix backup support in signal endpoints - wifi: ath12k: fix soft lockup on suspend - eth: bnxt_en: fix RSS logic in __bnxt_reserve_rings() - eth: ice: fix AF_XDP ZC timeout and concurrency issues - eth: mlx5: - fix missing lock on sync reset reload - fix error handling in irq_pool_request_irq" * tag 'net-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (76 commits) mptcp: fix duplicate data handling mptcp: fix bad RCVPRUNED mib accounting ipv6: fix ndisc_is_useropt() handling for PIO igc: Fix double reset adapter triggered from a single taprio cmd net: MAINTAINERS: Demote Qualcomm IPA to "maintained" net: wan: fsl_qmc_hdlc: Discard received CRC net: wan: fsl_qmc_hdlc: Convert carrier_lock spinlock to a mutex net/mlx5e: Add a check for the return value from mlx5_port_set_eth_ptys net/mlx5e: Fix CT entry update leaks of modify header context net/mlx5e: Require mlx5 tc classifier action support for IPsec prio capability net/mlx5: Fix missing lock on sync reset reload net/mlx5: Lag, don't use the hardcoded value of the first port net/mlx5: DR, Fix 'stack guard page was hit' error in dr_rule net/mlx5: Fix error handling in irq_pool_request_irq net/mlx5: Always drain health in shutdown callback net: Add skbuff.h to MAINTAINERS r8169: don't increment tx_dropped in case of NETDEV_TX_BUSY netfilter: iptables: Fix potential null-ptr-deref in ip6table_nat_table_init(). netfilter: iptables: Fix null-ptr-deref in iptable_nat_table_init(). net: drop bad gso csum_start and offset in virtio_net_hdr ...
2024-08-01	RISC-V: Enable the IPI before workqueue_online_cpu()	Nick Hu
	Sometimes the hotplug cpu stalls at the arch_cpu_idle() for a while after workqueue_online_cpu(). When cpu stalls at the idle loop, the reschedule IPI is pending. However the enable bit is not enabled yet so the cpu stalls at WFI until watchdog timeout. Therefore enable the IPI before the workqueue_online_cpu() to fix the issue. Fixes: 63c5484e7495 ("workqueue: Add multiple affinity scopes and interface to select them") Signed-off-by: Nick Hu <nick.hu@sifive.com> Reviewed-by: Anup Patel <anup@brainfault.org> Link: https://lore.kernel.org/r/20240717031714.1946036-1-nick.hu@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-01	platform/x86: asus-wmi: add support for vivobook fan profiles	Mohamed Ghanmi
	Add support for vivobook fan profiles wmi call on the ASUS VIVOBOOK to adjust power limits. These fan profiles have a different device id than the ROG series and different order. This reorders the existing modes. As part of keeping the patch clean the throttle_thermal_policy_available boolean stored in the driver struct is removed and throttle_thermal_policy_dev is used in place (as on init it is zeroed). Co-developed-by: Luke D. Jones <luke@ljones.dev> Signed-off-by: Luke D. Jones <luke@ljones.dev> Signed-off-by: Mohamed Ghanmi <mohamed.ghanmi@supcom.tn> Reviewed-by: Luke D. Jones <luke@ljones.dev> Link: https://lore.kernel.org/r/20240609144849.2532-2-mohamed.ghanmi@supcom.tn Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-08-01	mfd: adp5585: Add Analog Devices ADP5585 core support	Haibo Chen
	The ADP5585 is a 10/11 input/output port expander with a built in keypad matrix decoder, programmable logic, reset generator, and PWM generator. This driver supports the chip by modelling it as an MFD device, with two child devices for the GPIO and PWM functions. The driver is derived from an initial implementation from NXP, available in commit 8059835bee19 ("MLK-25917-1 mfd: adp5585: add ADI adp5585 core support") in their BSP kernel tree. It has been extensively rewritten. Signed-off-by: Haibo Chen <haibo.chen@nxp.com> Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://lore.kernel.org/r/20240722121100.2855-3-laurent.pinchart@ideasonboard.com Signed-off-by: Lee Jones <lee@kernel.org>
2024-08-01	leds: trigger: netdev: Add support for tx_err and rx_err notification with LEDs	Lukasz Majewski
	This patch provides support for enabling blinking of LEDs when RX or TX errors are detected. Approach taken in this patch is similar to one for TX or RX data transmission indication (i.e. TRIGGER_NETDEV_TX/RX attribute). One can inspect transmission errors with: ip -s link show eth0 Example LED configuration: cd /sys/devices/platform/amba_pl@0/a001a000.leds/leds/ echo netdev > mode:blue/trigger && \ echo eth0 > mode:blue/device_name && \ echo 1 > mode:blue/tx_err Signed-off-by: Lukasz Majewski <lukma@denx.de> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20240710100651.4059887-1-lukma@denx.de Signed-off-by: Lee Jones <lee@kernel.org>
2024-08-01	ACPI: PRM: Add PRM handler direct call support	John Allen
	Platform Runtime Mechanism (PRM) handlers can be invoked from either the AML interpreter or directly by an OS driver. Implement the latter. [ bp: Massage commit message. ] Signed-off-by: John Allen <john.allen@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20240730151731.15363-2-john.allen@amd.com
2024-08-01	net: skbuff: Skip early return in skb_unref when debugging	Breno Leitao
	This patch modifies the skb_unref function to skip the early return optimization when CONFIG_DEBUG_NET is enabled. The change ensures that the reference count decrement always occurs in debug builds, allowing for more thorough checking of SKB reference counting. Previously, when the SKB's reference count was 1 and CONFIG_DEBUG_NET was not set, the function would return early after a memory barrier (smp_rmb()) without decrementing the reference count. This optimization assumes it's safe to proceed with freeing the SKB without the overhead of an atomic decrement from 1 to 0. With this change: - In non-debug builds (CONFIG_DEBUG_NET not set), behavior remains unchanged, preserving the performance optimization. - In debug builds (CONFIG_DEBUG_NET set), the reference count is always decremented, even when it's 1, allowing for consistent behavior and potentially catching subtle SKB management bugs. This modification enhances debugging capabilities for networking code without impacting performance in production kernels. It helps kernel developers identify and diagnose issues related to SKB management and reference counting in the network stack. Cc: Chris Mason <clm@fb.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20240729104741.370327-1-leitao@debian.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-07-31	lsm: Refactor return value of LSM hook vm_enough_memory	Xu Kuohai
	To be consistent with most LSM hooks, convert the return value of hook vm_enough_memory to 0 or a negative error code. Before: - Hook vm_enough_memory returns 1 if permission is granted, 0 if not. - LSM_RET_DEFAULT(vm_enough_memory_mm) is 1. After: - Hook vm_enough_memory reutrns 0 if permission is granted, negative error code if not. - LSM_RET_DEFAULT(vm_enough_memory_mm) is 0. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2024-07-31	scx: Allow calling sleepable kfuncs from BPF_PROG_TYPE_SYSCALL	David Vernet
	We currently only allow calling sleepable scx kfuncs (i.e. scx_bpf_create_dsq()) from BPF_PROG_TYPE_STRUCT_OPS progs. The idea here was that we'd never have to call scx_bpf_create_dsq() outside of a sched_ext struct_ops callback, but that might not actually be true. For example, a scheduler could do something like the following: 1. Open and load (not yet attach) a scheduler skel 2. Synchronously call into a BPF_PROG_TYPE_SYSCALL prog from user space. For example, to initialize an LLC domain, or some other global, read-only state. 3. Attach the skel, which actually enables the scheduler The advantage of doing this is that it can preclude having to do pretty ugly boilerplate like initializing a read-only, statically sized array of u64[]'s which the kernel consumes literally once at init time to then create struct bpf_cpumask objects which are actually queried at runtime. Doing the above is already possible given that we can invoke core BPF kfuncs, such as bpf_cpumask_create(), from BPF_PROG_TYPE_SYSCALL progs. We already allow many scx kfuncs to be called from BPF_PROG_TYPE_SYSCALL progs (e.g. scx_bpf_kick_cpu()). Let's allow the sleepable kfuncs as well. Signed-off-by: David Vernet <void@manifault.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-07-31	hwmon: (max6697) Drop platform data support	Guenter Roeck
	Platform data is not used anywhere in the upstram kernel. Drop support for it to simplify code maintenance. Reviewed-by: Tzung-Bi Shih <tzungbi@kernel.org> Signed-off-by: Guenter Roeck <linux@roeck-us.net>