summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-05-16PCI: apple: Convert to MSI parent infrastructureMarc Zyngier
In an effort to move ARM64 away from the legacy MSI setup, convert the Apple PCIe driver to the MSI-parent infrastructure and let each device have its own MSI domain. [ tglx: Moved the struct out of the function call argument ] Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Link: https://lore.kernel.org/all/20250513172819.2216709-8-maz@kernel.org
2025-05-16irqchip/msi-lib: Honour the MSI_FLAG_NO_AFFINITY flagMarc Zyngier
Bad MSI implementations multiplex MSIs onto a single downstream interrupt, meaning they have no concept of individual affinity. The old MSI code did a reasonable job at this by honouring the MSI_FLAG_NO_AFFINITY, but the new shiny device MSI code doesn't. Teach it about the sad reality of existing hardware. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250513172819.2216709-7-maz@kernel.org
2025-05-16irqchip/mvebu: Convert to msi_create_parent_irq_domain() helperMarc Zyngier
Switch the MVEBU family of interrupt chip drivers over to the common helper function to create the interrupt domains. [ tglx: Moved the struct out of the function call argument and fix up the of_node_to_fwnode() instances ] Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250513172819.2216709-5-maz@kernel.org
2025-05-16irqchip/gic: Convert to msi_create_parent_irq_domain() helperMarc Zyngier
Switch the GIC family of interrupt chip drivers over to the common helper function to create the interrupt domains. [ tglx: Moved the struct out of the function call argument ] Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250513172819.2216709-4-maz@kernel.org
2025-05-16genirq/msi: Add helper for creating MSI-parent irq domainsMarc Zyngier
Creating an irq domain that serves as an MSI parent requires a substantial amount of esoteric boiler-plate code, some of which is often provided twice (such as the bus token). To make things a bit simpler for the unsuspecting MSI tinkerer, provide a helper that does it for them, and serves as documentation of what needs to be provided. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250513172819.2216709-3-maz@kernel.org
2025-05-16irqchip: Make irq-msi-lib.h globally availableMarc Zyngier
Move irq-msi-lib.h into include/linux/irqchip, making it available to compilation units outside of drivers/irqchip. This requires some churn in drivers to fetch it from the new location, generated using this script: git grep -l -w \"irq-msi-lib.h\" | \ xargs sed -i -e 's:"irq-msi-lib.h":\<linux/irqchip/irq-msi-lib.h\>:' Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250513172819.2216709-2-maz@kernel.org
2025-05-16Merge irq/cleanup fragments into irq/msiThomas Gleixner
Pick up the PCI changes to avoid conflicts.
2025-05-14irqchip/gic-v3-its: Use allocation size from the prepare callMarc Zyngier
Now that .msi_prepare() gets called at the right time and not with semi-random parameters, remove the ugly hack that tried to fix up the number of allocated vectors. It is now correct by construction. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250513163144.2215824-6-maz@kernel.org
2025-05-14genirq/msi: Engage the .msi_teardown() callback on domain removalMarc Zyngier
Kindly inform the MSI driver that the domain is torn down, providing the allocation context previously populated on domain creation. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250513163144.2215824-5-maz@kernel.org
2025-05-14genirq/msi: Move prepare() call to per-device allocationMarc Zyngier
The current device MSI infrastructure is subtly broken, as it will issue an .msi_prepare() callback into the MSI controller driver every time it needs to allocate an MSI. That's pretty wrong, as the contract (or unwarranted assumption, depending who you ask) between the MSI controller and the core code is that .msi_prepare() is called exactly once per device. This leads to some subtle breakage in some MSI controller drivers, as it gives the impression that there are multiple endpoints sharing a bus identifier (RID in PCI parlance, DID for GICv3+). It implies that whatever allocation the ITS driver (for example) has done on behalf of these devices cannot be undone, as there is no way to track the shared state. This is particularly bad for wire-MSI devices, for which .msi_prepare() is called for each input line. To address this issue, move the call to .msi_prepare() to take place at the point of irq domain allocation, which is the only place that makes sense. The msi_alloc_info_t structure is made part of the msi_domain_template, so that its life-cycle is that of the domain as well. Finally, the msi_info::alloc_data field is made to point at this allocation tracking structure, ensuring that it is carried around the block. This is all pretty straightforward, except for the non-device-MSI leftovers, which still have to call .msi_prepare() at the old spot. One day... Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250513163144.2215824-4-maz@kernel.org
2025-05-14irqchip/gic-v3-its: Implement .msi_teardown() callbackMarc Zyngier
The ITS driver currently nukes the structure representing an endpoint device translating via an ITS on freeing the last LPI allocated for it. That's an unfortunate state of affair, as it is pretty common for a driver to allocate a single MSI, do something clever, teardown this MSI, and reallocate a whole bunch of them. The NVME driver does exactly that, amongst others. What happens in that case is that the core code is accidentaly issuing another .msi_prepare() call, even if it shouldn't. This luckily cancels the above behaviour and hides the problem. In order to fix the core code, start by implementing the new .msi_teardown() callback. Nothing calls it yet, so a side effect is that the its_dev structure will not be freed and that the DID will stay mapped. Not a big deal, and this will be solved in following patches. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250513163144.2215824-3-maz@kernel.org
2025-05-14genirq/msi: Add .msi_teardown() callback as the reverse of .msi_prepare()Marc Zyngier
While the MSI ops do have a .msi_prepare() callback that is responsible for setting up the relevant (usually per-device) allocation, there is no callback reversing this setup. For this purpose, add .msi_teardown() callback. In order to avoid breaking the ITS driver that suffers from related issues, do not call the callback just yet. Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250513163144.2215824-2-maz@kernel.org
2025-05-08Merge branch 'irq/platform-msi' into irq/msiThomas Gleixner
Pull in the platform MSI/GIC changes which are seperate for the PCI endpoint driver updates.
2025-05-07irqchip/gic-v3-its: Add support for device tree msi-map and msi-maskFrank Li
Some platform devices create child devices dynamically and require the parent device's msi-map to map device IDs to actual sideband information. A typical use case is using ITS as a PCIe Endpoint Controller(EPC)'s doorbell function, where PCI hosts send TLP memory writes to the EP controller. The EP controller converts these writes to AXI transactions and appends platform-specific sideband information. EPC's DTS will provide such information by msi-map and msi-mask. A simplified dts as pcie-ep@10000000 { ... msi-map = <0 &its 0xc 8>; ^^^ 0xc is implement defined sideband information, which append to AXI write transaction. ^ 0 is function index. msi-mask = <0x7> } Check msi-map if msi-parent missed to keep compatility with existing systems. Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250414-ep-msi-v18-5-f69b49917464@nxp.com
2025-05-07dt-bindings: PCI: pci-ep: Add support for iommu-map and msi-mapFrank Li
Document the use of (msi|iommu)-map for PCI Endpoint (EP) controllers, which can use MSI as a doorbell mechanism. Each EP controller can support up to 8 physical functions and 65,536 virtual functions. Define how to construct device IDs using function bits [2:0] and virtual function index bits [31:3], enabling (msi|iommu)-map to associate each child device with a specific (msi|iommu)-specifier. The EP cannot rely on PCI Requester ID (RID) because the RID is determined by the PCI topology of the host system. Since the EP may be connected to different PCI hosts, the RID can vary between systems and is therefore not a reliable identifier. Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Rob Herring (Arm) <robh@kernel.org> Link: https://lore.kernel.org/all/20250414-ep-msi-v18-4-f69b49917464@nxp.com
2025-05-07irqchip/gic-v3-its: Set IRQ_DOMAIN_FLAG_MSI_IMMUTABLE for ITSFrank Li
Set the IRQ_DOMAIN_FLAG_MSI_IMMUTABLE flag for ITS, as it does not change the address/data pair after setup. Ensure compatibility with MSI users, such as PCIe Endpoint Doorbell, which require the address/data pair to remain unchanged. Enable PCIe endpoints to use ITS for triggering doorbells from the PCIe Root Complex (RC) side. Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250414-ep-msi-v18-3-f69b49917464@nxp.com
2025-05-07irqdomain: Add IRQ_DOMAIN_FLAG_MSI_IMMUTABLE and irq_domain_is_msi_immutable()Frank Li
Add the flag IRQ_DOMAIN_FLAG_MSI_IMMUTABLE and the API function irq_domain_is_msi_immutable() to check if the MSI controller retains an immutable address/data pair during irq_set_affinity(). Ensure compatibility with MSI users like PCIe Endpoint Doorbell, which require the address/data pair to remain unchanged after setup. Use this function to verify if the MSI controller is immutable. Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250414-ep-msi-v18-2-f69b49917464@nxp.com
2025-05-07platform-msi: Add msi_remove_device_irq_domain() in ↵Frank Li
platform_device_msi_free_irqs_all() platform_device_msi_init_and_alloc_irqs() performs two tasks: allocating the MSI domain for a platform device, and allocate a number of MSIs in that domain. platform_device_msi_free_irqs_all() only frees the MSIs, and leaves the MSI domain alive. Given that platform_device_msi_init_and_alloc_irqs() is the sole tool a platform device has to allocate platform MSIs, it makes sense for platform_device_msi_free_irqs_all() to teardown the MSI domain at the same time as the MSIs. This avoids warnings and unexpected behaviours when a driver repeatedly allocates and frees MSIs. Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/all/20250414-ep-msi-v18-1-f69b49917464@nxp.com
2025-04-09genirq/msi: Rename msi_[un]lock_descs()Thomas Gleixner
Now that all abuse is gone and the legit users are converted to guard(msi_descs_lock), rename the lock functions and document them as internal. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huwei.com> Link: https://lore.kernel.org/all/20250319105506.864699741@linutronix.de
2025-04-09scsi: ufs: qcom: Remove the MSI descriptor abuseThomas Gleixner
The driver abuses the MSI descriptors for internal purposes. Aside of core code and MSI providers nothing has to care about their existence. They have been encapsulated with a lot of effort because this kind of abuse caused all sorts of issues including a maintainability nightmare. Rewrite the code so it uses dedicated storage to hand the required information to the interrupt handler and use a custom cleanup function to simplify the error path. No functional change intended. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: James Bottomley <James.Bottomley@HansenPartnership.com> Link: https://lore.kernel.org/all/20250319105506.805529593@linutronix.de
2025-04-09PCI/TPH: Replace the broken MSI-X control word updateThomas Gleixner
The driver walks the MSI descriptors to test whether a descriptor exists for a given index. That's just abuse of the MSI internals. The same test can be done with a single function call by looking up whether there is a Linux interrupt number assigned at the index. What's worse is that the function is completely unserialized against modifications of the MSI-X control by operations issued from the interrupt core. It also brings the PCI/MSI-X internal cached control word out of sync. Remove the trainwreck and invoke the function provided by the PCI/MSI core to update it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/all/20250319105506.744271447@linutronix.de
2025-04-09PCI/MSI: Provide a sane mechanism for TPHThomas Gleixner
The PCI/TPH driver fiddles with the MSI-X control word of an active interrupt completely unserialized against concurrent operations issued from the interrupt core. It also brings the PCI/MSI-X internal cached control word out of sync. Provide a function, which has the required serialization and keeps the control word cache in sync. Unfortunately this requires to look up and lock the interrupt descriptor, which should be only done in the interrupt core code. But confining this particular oddity in the PCI/MSI core is the lesser of all evil. A interrupt core implementation would require a larger pile of infrastructure and indirections for dubious value. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/all/20250319105506.683663807@linutronix.de
2025-04-09PCI: hv: Switch MSI descriptor locking to guard()Thomas Gleixner
Convert the code to use the new guard(msi_descs_lock). No functional change intended. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/all/20250319105506.624838146@linutronix.de
2025-04-09PCI/MSI: Switch msix_capability_init() to guard(msi_desc_lock)Thomas Gleixner
Split the lock protected functionality of msix_capability_init() out into a helper function and use guard(msi_desc_lock) to replace the lock/unlock pair. Simplify the error path in the helper function by utilizing a custom cleanup to get rid of the remaining gotos. No functional change intended. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/all/20250319105506.564105011@linutronix.de
2025-04-09PCI/MSI: Switch msi_capability_init() to guard(msi_desc_lock)Thomas Gleixner
Split the lock protected functionality of msi_capability_init() out into a helper function and use guard(msi_desc_lock) to replace the lock/unlock pair. No functional change intended. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/all/20250319105506.504992208@linutronix.de
2025-04-09PCI/MSI: Use __free() for affinity masksThomas Gleixner
Let cleanup handle the freeing of the affinity mask. That prepares for switching the MSI descriptor locking to a guard(). No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/all/20250319105506.444764312@linutronix.de
2025-04-09PCI/MSI: Set pci_dev:: Msi_enabled lateThomas Gleixner
The comment claiming that pci_dev::msi_enabled has to be set across setup is a leftover from ancient code versions. Nothing in the setup code requires the flag to be set anymore. Set it in the success path and remove the extra goto label. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/all/20250319105506.383222333@linutronix.de
2025-04-09PCI/MSI: Use guard(msi_desc_lock) where applicableThomas Gleixner
Convert the trivial cases of msi_desc_lock/unlock() pairs. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/all/20250319105506.322536126@linutronix.de
2025-04-09NTB/msi: Switch MSI descriptor locking to lock guard()Thomas Gleixner
Convert the code to use the new guard(msi_descs_lock). No functional change intended. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Acked-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/all/20250319105506.263456735@linutronix.de
2025-04-09soc: ti: ti_sci_inta_msi: Switch MSI descriptor locking to guard()Thomas Gleixner
Convert the code to use the new guard(msi_descs_lock). No functional change intended. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nishanth Menon <nm@ti.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dhruva Gole <d-gole@ti.com> Link: https://lore.kernel.org/all/20250319105506.203802081@linutronix.de
2025-04-09genirq/msi: Use lock guards for MSI descriptor lockingThomas Gleixner
Provide a lock guard for MSI descriptor locking and update the core code accordingly. No functional change intended. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/all/20250319105506.144672678@linutronix.de
2025-04-09cleanup: Provide retain_and_null_ptr()Thomas Gleixner
In cases where an allocation is consumed by another function, the allocation needs to be retained on success or freed on failure. The code pattern is usually: struct foo *f = kzalloc(sizeof(*f), GFP_KERNEL); struct bar *b; ,,, // Initialize f ... if (ret) goto free; ... bar = bar_create(f); if (!bar) { ret = -ENOMEM; goto free; } ... return 0; free: kfree(f); return ret; This prevents using __free(kfree) on @f because there is no canonical way to tell the cleanup code that the allocation should not be freed. Abusing no_free_ptr() by force ignoring the return value is not really a sensible option either. Provide an explicit macro retain_and_null_ptr(), which NULLs the cleanup pointer. That makes it easy to analyze and reason about. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: James Bottomley <James.Bottomley@HansenPartnership.com> Link: https://lore.kernel.org/all/20250319105506.083538907@linutronix.de
2025-04-07irqdomain: pci: Switch to of_fwnode_handle()Jiri Slaby (SUSE)
of_node_to_fwnode() is irqdomain's reimplementation of the "officially" defined of_fwnode_handle(). The former is in the process of being removed, so use the latter instead. Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250319092951.37667-8-jirislaby@kernel.org
2025-04-06Linux 6.15-rc1v6.15-rc1Linus Torvalds
2025-04-06tools/include: make uapi/linux/types.h usable from assemblyThomas Weißschuh
The "real" linux/types.h UAPI header gracefully degrades to a NOOP when included from assembly code. Mirror this behaviour in the tools/ variant. Test for __ASSEMBLER__ over __ASSEMBLY__ as the former is provided by the toolchain automatically. Reported-by: Mark Brown <broonie@kernel.org> Closes: https://lore.kernel.org/lkml/af553c62-ca2f-4956-932c-dd6e3a126f58@sirena.org.uk/ Fixes: c9fbaa879508 ("selftests: vDSO: parse_vdso: Use UAPI headers instead of libc headers") Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Link: https://patch.msgid.link/20250321-uapi-consistency-v1-1-439070118dc0@linutronix.de Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-06Merge tag 'turbostat-2025.05.06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull turbostat updates from Len Brown: - support up to 8192 processors - add cpuidle governor debug telemetry, disabled by default - update default output to exclude cpuidle invocation counts - bug fixes * tag 'turbostat-2025.05.06' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools/power turbostat: v2025.05.06 tools/power turbostat: disable "cpuidle" invocation counters, by default tools/power turbostat: re-factor sysfs code tools/power turbostat: Restore GFX sysfs fflush() call tools/power turbostat: Document GNR UncMHz domain convention tools/power turbostat: report CoreThr per measurement interval tools/power turbostat: Increase CPU_SUBSET_MAXCPUS to 8192 tools/power turbostat: Add idle governor statistics reporting tools/power turbostat: Fix names matching tools/power turbostat: Allow Zero return value for some RAPL registers tools/power turbostat: Clustered Uncore MHz counters should honor show/hide options
2025-04-06Merge tag 'soundwire-6.15-rc1-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire Pull soundwire fix from Vinod Koul: - add missing config symbol CONFIG_SND_HDA_EXT_CORE required for asoc driver CONFIG_SND_SOF_SOF_HDA_SDW_BPT * tag 'soundwire-6.15-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: ASoC: SOF: Intel: Let SND_SOF_SOF_HDA_SDW_BPT select SND_HDA_EXT_CORE
2025-04-06tools/power turbostat: v2025.05.06Len Brown
Support up to 8192 processors Add cpuidle governor debug telemetry, disabled by default Update default output to exclude cpuidle invocation counts Bug fixes Signed-off-by: Len Brown <len.brown@intel.com>
2025-04-06tools/power turbostat: disable "cpuidle" invocation counters, by defaultLen Brown
Create "pct_idle" counter group, the sofware notion of residency so it can now be singled out, independent of other counter groups. Create "cpuidle" group, the cpuidle invocation counts. Disable "cpuidle", by default. Create "swidle" = "cpuidle" + "pct_idle". Undocument "sysfs", the old name for "swidle", but keep it working for backwards compatibilty. Create "hwidle", all the HW idle counters Modify "idle", enabled by default "idle" = "hwidle" + "pct_idle" (and now excludes "cpuidle") Signed-off-by: Len Brown <len.brown@intel.com>
2025-04-06Merge tag 'perf-urgent-2025-04-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf event fix from Ingo Molnar: "Fix a perf events time accounting bug" * tag 'perf-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/core: Fix child_total_time_enabled accounting bug at task exit
2025-04-06Merge tag 'sched-urgent-2025-04-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: - Fix a nonsensical Kconfig combination - Remove an unnecessary rseq-notification * tag 'sched-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rseq: Eliminate useless task_work on execve sched/isolation: Make CONFIG_CPU_ISOLATION depend on CONFIG_SMP
2025-04-06Disable SLUB_TINY for build testingLinus Torvalds
... and don't error out so hard on missing module descriptions. Before commit 6c6c1fc09de3 ("modpost: require a MODULE_DESCRIPTION()") we used to warn about missing module descriptions, but only when building with extra warnigns (ie 'W=1'). After that commit the warning became an unconditional hard error. And it turns out not all modules have been converted despite the claims to the contrary. As reported by Damian Tometzki, the slub KUnit test didn't have a module description, and apparently nobody ever really noticed. The reason nobody noticed seems to be that the slub KUnit tests get disabled by SLUB_TINY, which also ends up disabling a lot of other code, both in tests and in slub itself. And so anybody doing full build tests didn't actually see this failre. So let's disable SLUB_TINY for build-only tests, since it clearly ends up limiting build coverage. Also turn the missing module descriptions error back into a warning, but let's keep it around for non-'W=1' builds. Reported-by: Damian Tometzki <damian@riscv-rocks.de> Link: https://lore.kernel.org/all/01070196099fd059-e8463438-7b1b-4ec8-816d-173874be9966-000000@eu-central-1.amazonses.com/ Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Jeff Johnson <jeff.johnson@oss.qualcomm.com> Fixes: 6c6c1fc09de3 ("modpost: require a MODULE_DESCRIPTION()") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-06tools/power turbostat: re-factor sysfs codeLen Brown
Probe cpuidle "sysfs" residency and counts separately, since soon we will make one disabled on, and the other disabled off. Clarify that some BIC (build-in-counters) are actually "groups". since we're about to re-name some of those groups. no functional change. Signed-off-by: Len Brown <len.brown@intel.com>
2025-04-06tools/power turbostat: Restore GFX sysfs fflush() callZhang Rui
Do fflush() to discard the buffered data, before each read of the graphics sysfs knobs. Fixes: ba99a4fc8c24 ("tools/power turbostat: Remove unnecessary fflush() call") Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-04-06tools/power turbostat: Document GNR UncMHz domain conventionLen Brown
Document that on Intel Granite Rapids Systems, Uncore domains 0-2 are CPU domains, and uncore domains 3-4 are IO domains. Signed-off-by: Len Brown <len.brown@intel.com>
2025-04-06tools/power turbostat: report CoreThr per measurement intervalLen Brown
The CoreThr column displays total thermal throttling events since boot time. Change it to report events during the measurement interval. This is more useful for showing a user the current conditions. Total events since boot time are still available to the user via /sys/devices/system/cpu/cpu*/thermal_throttle/* Document CoreThr on turbostat.8 Fixes: eae97e053fe30 ("turbostat: Support thermal throttle count print") Reported-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com> Cc: Chen Yu <yu.c.chen@intel.com>
2025-04-06tools/power turbostat: Increase CPU_SUBSET_MAXCPUS to 8192Justin Ernst
On systems with >= 1024 cpus (in my case 1152), turbostat fails with the error output: "turbostat: /sys/fs/cgroup/cpuset.cpus.effective: cpu str malformat 0-1151" A similar error appears with the use of turbostat --cpu when the inputted cpu range contains a cpu number >= 1024: # turbostat -c 1100-1151 "--cpu 1100-1151" malformed ... Both errors are caused by parse_cpu_str() reaching its limit of CPU_SUBSET_MAXCPUS. It's a good idea to limit the maximum cpu number being parsed, but 1024 is too low. For a small increase in compute and allocated memory, increasing CPU_SUBSET_MAXCPUS brings support for parsing cpu numbers >= 1024. Increase CPU_SUBSET_MAXCPUS to 8192, a common setting for CONFIG_NR_CPUS on x86_64. Signed-off-by: Justin Ernst <justin.ernst@hpe.com> Signed-off-by: Len Brown <len.brown@intel.com>
2025-04-06Merge tag 'timers-cleanups-2025-04-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer cleanups from Thomas Gleixner: "A set of final cleanups for the timer subsystem: - Convert all del_timer[_sync]() instances over to the new timer_delete[_sync]() API and remove the legacy wrappers. Conversion was done with coccinelle plus some manual fixups as coccinelle chokes on scoped_guard(). - The final cleanup of the hrtimer_init() to hrtimer_setup() conversion. This has been delayed to the end of the merge window, so that all patches which have been merged through other trees are in mainline and all new users are catched. Doing this right before rc1 ensures that new code which is merged post rc1 is not introducing new instances of the original functionality" * tag 'timers-cleanups-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tracing/timers: Rename the hrtimer_init event to hrtimer_setup hrtimers: Rename debug_init_on_stack() to debug_setup_on_stack() hrtimers: Rename debug_init() to debug_setup() hrtimers: Rename __hrtimer_init_sleeper() to __hrtimer_setup_sleeper() hrtimers: Remove unnecessary NULL check in hrtimer_start_range_ns() hrtimers: Make callback function pointer private hrtimers: Merge __hrtimer_init() into __hrtimer_setup() hrtimers: Switch to use __htimer_setup() hrtimers: Delete hrtimer_init() treewide: Convert new and leftover hrtimer_init() users treewide: Switch/rename to timer_delete[_sync]()
2025-04-06Merge tag 'irq-urgent-2025-04-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull more irq updates from Thomas Gleixner: "A set of updates for the interrupt subsystem: - A treewide cleanup for the irq_domain code, which makes the naming consistent and gets rid of the original oddity of naming domains 'host'. This is a trivial mechanical change and is done late to ensure that all instances have been catched and new code merged post rc1 wont reintroduce new instances. - A trivial consistency fix in the migration code The recent introduction of irq_force_complete_move() in the core code, causes a problem for the nostalgia crowd who maintains ia64 out of tree. The code assumes that hierarchical interrupt domains are enabled and dereferences irq_data::parent_data unconditionally. That works in mainline because both architectures which enable that code have hierarchical domains enabled. Though it breaks the ia64 build, which enables the functionality, but does not have hierarchical domains. While it's not really a problem for mainline today, this unconditional dereference is inconsistent and trivially fixable by using the existing helper function irqd_get_parent_data(), which has the appropriate #ifdeffery in place" * tag 'irq-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/migration: Use irqd_get_parent_data() in irq_force_complete_move() irqdomain: Stop using 'host' for domain irqdomain: Rename irq_get_default_host() to irq_get_default_domain() irqdomain: Rename irq_set_default_host() to irq_set_default_domain()
2025-04-06Merge tag 'timers-urgent-2025-04-06' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Thomas Gleixner: "A revert to fix a adjtimex() regression: The recent change to prevent that time goes backwards for the coarse time getters due to immediate multiplier adjustments via adjtimex(), changed the way how the timekeeping core treats that. That change result in a regression on the adjtimex() side, which is user space visible: 1) The forwarding of the base time moves the update out of the original period and establishes a new one. That's changing the behaviour of the [PF]LL control, which user space expects to be applied periodically. 2) The clearing of the accumulated NTP error due to #1, changes the behaviour as well. An attempt to delay the multiplier/frequency update to the next tick did not solve the problem as userspace expects that the multiplier or frequency updates are in effect, when the syscall returns. There is a different solution for the coarse time problem available, so revert the offending commit to restore the existing adjtimex() behaviour" * tag 'timers-urgent-2025-04-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: Revert "timekeeping: Fix possible inconsistencies in _COARSE clockids"