linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2023-06-16	sched/wait: Fix a kthread_park race with wait_woken()	Arve Hjønnevåg
	kthread_park and wait_woken have a similar race that kthread_stop and wait_woken used to have before it was fixed in commit cb6538e740d7 ("sched/wait: Fix a kthread race with wait_woken()"). Extend that fix to also cover kthread_park. [jstultz: Made changes suggested by Peter to optimize memory loads] Signed-off-by: Arve Hjønnevåg <arve@android.com> Signed-off-by: John Stultz <jstultz@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <vschneid@redhat.com> Link: https://lore.kernel.org/r/20230602212350.535358-1-jstultz@google.com
2023-06-16	sched/topology: Mark set_sched_topology() __init	Miaohe Lin
	All callers of set_sched_topology() are within __init section. Mark it __init too. Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <vschneid@redhat.com> Link: https://lore.kernel.org/r/20230603073645.1173332-1-linmiaohe@huawei.com
2023-06-16	sched/fair: Rename variable cpu_util eff_util	Tom Rix
	cppcheck reports kernel/sched/fair.c:7436:17: style: Local variable 'cpu_util' shadows outer function [shadowFunction] unsigned long cpu_util; ^ Clean this up by renaming the variable to eff_util Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <vschneid@redhat.com> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Link: https://lore.kernel.org/r/20230611122535.183654-1-trix@redhat.com
2023-06-16	perf/x86/intel: Fix the FRONTEND encoding on GNR and MTL	Kan Liang
	When counting a FRONTEND event, the MSR_PEBS_FRONTEND is not correctly set on GNR and MTL p-core. The umask value for the FRONTEND events is changed on GNR and MTL. The new umask is missing in the extra_regs[] table. Add a dedicated intel_gnr_extra_regs[] for GNR and MTL p-core. Fixes: bc4000fdb009 ("perf/x86/intel: Add Granite Rapids") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20230615173242.3726364-1-kan.liang@linux.intel.com
2023-06-16	perf/core: Drop __weak attribute from arch_perf_update_userpage() prototype	Marc Zyngier
	Reiji reports that the arm64 implementation of arch_perf_update_userpage() is now ignored and replaced by the dummy stub in core code. This seems to happen since the PMUv3 driver was moved to driver/perf. As it turns out, dropping the __weak attribute from the prototype of the function solves the problem. You're right, this doesn't seem to make much sense. And yet... It appears that both symbols get flagged as weak, and that the first one to appear in the link order wins: $ nm drivers/perf/arm_pmuv3.o\|grep arch_perf_update_userpage 0000000000001db0 W arch_perf_update_userpage Dropping the attribute from the prototype restores the expected behaviour, and arm64 is able to enjoy arch_perf_update_userpage() again. Fixes: 7755cec63ade ("arm64: perf: Move PMUv3 driver to drivers/perf") Fixes: f1ec3a517b43 ("kernel/events: Add a missing prototype for arch_perf_update_userpage()") Reported-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Reiji Watanabe <reijiw@google.com> Link: https://lkml.kernel.org/r/20230616114831.3186980-1-maz@kernel.org
2023-06-16	locking/atomic: scripts: fix ${atomic}_dec_if_positive() kerneldoc	Mark Rutland
	The ${atomic}_dec_if_positive() ops are unlike all the other conditional atomic ops. Rather than returning a boolean success value, these return the value that the atomic variable would be updated to, even when no update is performed. We missed this when adding kerneldoc comments, and the documentation for ${atomic}_dec_if_positive() erroneously states: \| Return: @true if @v was updated, @false otherwise. Ideally we'd clean this up by aligning ${atomic}_dec_if_positive() with the usual atomic op conventions: with ${atomic}_fetch_dec_if_positive() for those who care about the value of the varaible, and ${atomic}_dec_if_positive() returning a boolean success value. In the mean time, align the documentation with the current reality. Fixes: ad8110706f381170 ("locking/atomic: scripts: generate kerneldoc comments") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20230615132734.1119765-1-mark.rutland@arm.com
2023-06-16	iommu/amd: Fix possible memory leak of 'domain'	Su Hui
	Move allocation code down to avoid memory leak. Fixes: 29f54745f245 ("iommu/amd: Add missing domain type checks") Signed-off-by: Su Hui <suhui@nfschina.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/20230608021933.856045-1-suhui@nfschina.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2023-06-16	dt-bindings: Update Documentation/arm references	Jonathan Corbet
	The Arm documentation has moved to Documentation/arch/arm; update one devicetree reference to match. Cc: Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org> Cc: devicetree@vger.kernel.org Acked-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-06-16	docs: update some straggling Documentation/arm references	Jonathan Corbet
	The Arm documentation has moved to Documentation/arch/arm; update the last remaining references to match. Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: Chen-Yu Tsai <wens@csie.org> Cc: Samuel Holland <samuel@sholland.org> Cc: Thierry Reding <thierry.reding@gmail.com> Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com> Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> # for pwm Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-06-16	Documentation: KVM: make corrections to vcpu-requests.rst	Randy Dunlap
	Make corrections to punctuation and grammar. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Andrew Jones <drjones@redhat.com> Cc: Christoffer Dall <cdall@linaro.org> Cc: kvm@vger.kernel.org Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20230612030810.23376-5-rdunlap@infradead.org
2023-06-16	Documentation: KVM: make corrections to ppc-pv.rst	Randy Dunlap
	Correct the path of a header file. Change "guest to ... guest" to "guest to ... host" in one place. Hyphenate "32-bit" systems. Add a comma at one parenthetical phrase. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: kvm@vger.kernel.org Cc: Alexander Graf <agraf@suse.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20230612030810.23376-4-rdunlap@infradead.org
2023-06-16	Documentation: KVM: make corrections to locking.rst	Randy Dunlap
	Correct grammar and punctuation. Use "read-only" for consistency. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Sean Christopherson <seanjc@google.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: kvm@vger.kernel.org Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20230612030810.23376-3-rdunlap@infradead.org
2023-06-16	Documentation: KVM: make corrections to halt-polling.rst	Randy Dunlap
	Module parameters are in sysfs, not debugfs, so change that. Remove superfluous "that" following "Note:". Hyphenate "system-wide" values. Hyphenate "trade-off". Don't treat "denial of service" as a verb. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: kvm@vger.kernel.org Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20230612030810.23376-2-rdunlap@infradead.org
2023-06-16	Documentation: virt: correct location of haltpoll module params	Randy Dunlap
	Module parameters are located in sysfs, not debugfs, so correct the statement. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20230610054302.6223-1-rdunlap@infradead.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2023-06-16	Documentation/mm: Initial page table documentation	Linus Walleij
	This is based on an earlier blog post at people.kernel.org, it describes the concepts about page tables that were hardest for me to grasp when dealing with them for the first time, such as the prevalent three-letter acronyms pfn, pgd, p4d, pud, pmd and pte. I don't know if this is what people want, but it's what I would have wanted. The wording, introduction, choice of initial subjects and choice of style is mine. I discussed at one point with Mike Rapoport to bring this into the kernel documentation, so here is a small proposal. The current form is augmented in response to feedback from Mike Rapoport, Matthew Wilcox, Jonathan Cameron, Kuan-Ying Lee, Randy Dunlap and Bagas Sanjaya. Cc: Matthew Wilcox <willy@infradead.org> Reviewed-by: Mike Rapoport <rppt@kernel.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://people.kernel.org/linusw/arm32-page-tables Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Link: https://lore.kernel.org/r/20230614072548.996940-1-linus.walleij@linaro.org
2023-06-16	irqchip/loongson-eiointc: Fix irq affinity setting during resume	Jianmin Lv
	The hierarchy of PCH PIC, PCH PCI MSI and EIONTC is as following: PCH PIC ------->\| \|---->EIOINTC PCH PCI MSI --->\| so the irq_data list of irq_desc for IRQs on PCH PIC and PCH PCI MSI is like this: irq_desc->irq_data(domain: PCH PIC)->parent_data(domain: EIOINTC) irq_desc->irq_data(domain: PCH PCI MSI)->parent_data(domain: EIOINTC) In eiointc_resume(), the irq_data passed into eiointc_set_irq_affinity() should be matched to EIOINTC domain instead of PCH PIC or PCH PCI MSI domain, so fix it. Fixes: a90335c2dfb4 ("irqchip/loongson-eiointc: Add suspend/resume support") Reported-by: yangqiming <yangqiming@loongson.cn> Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230614115936.5950-6-lvjianmin@loongson.cn
2023-06-16	irqchip/loongson-liointc: Add IRQCHIP_SKIP_SET_WAKE flag	Yinbo Zhu
	LIOINTC doesn't require specific logic to work with wakeup IRQs, and no irq_set_wake callback is needed. To allow registered IRQs from LIOINTC to be used as a wakeup-source, and ensure irq_set_irq_wake() works well, the flag IRQCHIP_SKIP_SET_WAKE should be added. Reviewed-by: Huacai Chen <chenhuacai@kernel.org> Signed-off-by: Yinbo Zhu <zhuyinbo@loongson.cn> Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230614115936.5950-5-lvjianmin@loongson.cn
2023-06-16	irqchip/loongson-liointc: Fix IRQ trigger polarity	Jianmin Lv
	For the INT_POLARITY register of Loongson-2K series IRQ controller, '0' indicates high level or rising edge triggered, '1' indicates low level or falling edge triggered, and we can find out the information from the Loongson 2K1000LA User Manual v1.0, Table 9-2, Section 9.3 (中断寄存器描述 / Description of the Interrupt Registers). For Loongson-3 CPU series, setting INT_POLARITY register is not supported and writting it has no effect. So trigger polarity setting shouled be fixed for Loongson-2K CPU series. Fixes: 17343d0b4039 ("irqchip/loongson-liointc: Support to set IRQ type for ACPI path") Cc: stable@vger.kernel.org Reviewed-by: Huacai Chen <chenhuacai@kernel.org> Co-developed-by: Chong Qiao <qiaochong@loongson.cn> Signed-off-by: Chong Qiao <qiaochong@loongson.cn> Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230614115936.5950-4-lvjianmin@loongson.cn
2023-06-16	irqchip/loongson-pch-pic: Fix potential incorrect hwirq assignment	Liu Peibao
	In DeviceTree path, when ht_vec_base is not zero, the hwirq of PCH PIC will be assigned incorrectly. Because when pch_pic_domain_translate() adds the ht_vec_base to hwirq, the hwirq does not have the ht_vec_base subtracted when calling irq_domain_set_info(). The ht_vec_base is designed for the parent irq chip/domain of the PCH PIC. It seems not proper to deal this in callbacks of the PCH PIC domain and let's put this back like the initial commit ef8c01eb64ca ("irqchip: Add Loongson PCH PIC controller"). Fixes: bcdd75c596c8 ("irqchip/loongson-pch-pic: Add ACPI init support") Cc: stable@vger.kernel.org Reviewed-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Liu Peibao <liupeibao@loongson.cn> Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230614115936.5950-3-lvjianmin@loongson.cn
2023-06-16	irqchip/loongson-pch-pic: Fix initialization of HT vector register	Jianmin Lv
	In an ACPI-based dual-bridge system, IRQ of each bridge's PCH PIC sent to CPU is always a zero-based number, which means that the IRQ on PCH PIC of each bridge is mapped into vector range from 0 to 63 of upstream irqchip(e.g. EIOINTC). EIOINTC N: [0 ... 63 \| 64 ... 255] -------- ---------- ^ ^ \| \| PCH PIC N \| PCH MSI N For example, the IRQ vector number of sata controller on PCH PIC of each bridge is 16, which is sent to upstream irqchip of EIOINTC when an interrupt occurs, which will set bit 16 of EIOINTC. Since hwirq of 16 on EIOINTC has been mapped to a irq_desc for sata controller during hierarchy irq allocation, the related mapped IRQ will be found through irq_resolve_mapping() in the IRQ domain of EIOINTC. So, the IRQ number set in HT vector register should be fixed to be a zero-based number. Cc: stable@vger.kernel.org Reviewed-by: Huacai Chen <chenhuacai@loongson.cn> Co-developed-by: liuyun <liuyun@loongson.cn> Signed-off-by: liuyun <liuyun@loongson.cn> Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230614115936.5950-2-lvjianmin@loongson.cn
2023-06-16	regulator: helper: Document ramp_delay parameter of ↵	ChiYuan Huang
	regulator_set_ramp_delay_regmap() With W=1: drivers/regulator/helpers.c:947: warning: Function parameter or member 'ramp_delay' not described in 'regulator_set_ramp_delay_regmap' Fix it by documenting the parameter. Fixes: fb8fee9efdcf ("regulator: Add regmap helper for ramp-delay setting") Signed-off-by: ChiYuan Huang <cy_huang@richtek.com> Link: https://lore.kernel.org/r/1686881298-28333-1-git-send-email-cy_huang@richtek.com Signed-off-by: Mark Brown <broonie@kernel.org>
2023-06-16	regmap: Drop early readability check	Mark Brown
	We have some drivers that have a use case for cached write only registers, doing read/modify/writes on read only registers in order to work more easily with bitfields. Go back to trying the cache before we check if we can read from the device. Fixes: eab5abdeb79f0 ("regmap: Check for register readability before checking cache during read") Reported-by: Konrad Dybcio <konradybcio@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20230615-regmap-drop-early-readability-v1-1-8135094362de@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
2023-06-16	docs: perf: Add new description for HiSilicon UC PMU	Junhao He
	A new function is added on HiSilicon uncore UC PMU. The UC PMU support to filter statistical information based on the specified tx request uring channel. Make user configuration through "uring_channel" parameter. Document them to provide guidance on how to use them. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonthan.Cameron@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230615125926.29832-4-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-06-16	drivers/perf: hisi: Add support for HiSilicon UC PMU driver	Junhao He
	On HiSilicon Hip09 platform, there are 4 UC (unified cache) modules on each chip CCL (CPU Cluster). UC is a cache that provides coherence between NUMA and UMA domains. It is located between L2 and Memory System. Many PMU events are supported. Let's support the UC PMU driver using the HiSilicon uncore PMU framework. * rd_req_en : rd_req_en is the abbreviation of read request tracetag enable and allows user to count only read operations. Details are listed in the hisi-pmu document at Documentation/admin-guide/perf/hisi-pmu.rst * srcid_en & srcid: Allows users to filter statistical information based on specific CPU/ICL by srcid. srcid_en depends on rd_req_en being enabled. * uring_channel: Allows users to filter statistical information based on the specified tx request uring channel. uring_channel only supported events: [0x47 ~ 0x59]. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230615125926.29832-3-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-06-16	drivers/perf: hisi: Add support for HiSilicon H60PA and PAv3 PMU driver	Junhao He
	Compared to the original PA device, H60PA offers higher bandwidth. The H60PA is a new device and we use HID to differentiate them. The events supported by PAv3 and PAv2 are different. The PAv3 PMU removed some events which are supported by PAv2 PMU. The older PA PMU driver will probe v3 as v2. Therefore PA events displayed by "perf list" cannot work properly. We add the HISI0275 HID for PAv3 PMU to distinguish different. For each H60PA PMU, except for the overflow interrupt register, other functions of the H60PA PMU are the same as the original PA PMU module. It has 8-programable counters and each counter is free-running. Interrupt is supported to handle counter (64-bits) overflow. Signed-off-by: Junhao He <hejunhao3@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230615125926.29832-2-hejunhao3@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2023-06-16	Merge branch irq/lpi-resend into irq/irqchip-next	Marc Zyngier
	* irq/lpi-resend: : . : Patch series from James Gowans, working around an issue with : GICv3 LPIs that can fire concurrently on multiple CPUs. : . irqchip/gic-v3-its: Enable RESEND_WHEN_IN_PROGRESS for LPIs genirq: Allow fasteoi handler to resend interrupts on concurrent handling genirq: Expand doc for PENDING and REPLAY flags genirq: Use BIT() for the IRQD_* state flags Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-06-16	irqchip/gic-v3-its: Enable RESEND_WHEN_IN_PROGRESS for LPIs	James Gowans
	GICv3 LPIs are impacted by an architectural design issue: they do not have a global active state and as such a given LPI can be delivered to a new CPU after an affinity change while the previous instance of the same LPI handler has not yet completed on the original CPU. If LPIs had an active state, this second LPI would not be delivered until the first CPU deactivated the initial LPI, just like SPIs. To solve this issue, use the newly introduced IRQD_RESEND_WHEN_IN_PROGRESS flag, ensuring that we do not lose an LPI being delivered during that window by getting the GIC to resend it. This workaround gets enabled for all LPIs, including the VPE doorbells. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: James Gowans <jgowans@amazon.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <maz@kernel.org> Cc: KarimAllah Raslan <karahmed@amazon.com> Cc: Yipeng Zou <zouyipeng@huawei.com> Cc: Zhang Jianhua <chris.zjh@huawei.com> [maz: massaged commit message] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230608120021.3273400-4-jgowans@amazon.com
2023-06-16	genirq: Allow fasteoi handler to resend interrupts on concurrent handling	James Gowans
	There is a class of interrupt controllers out there that, once they have signalled a given interrupt number, will still signal incoming instances of the same interrupt despite the original interrupt not having been EOIed yet. As long as the new interrupt reaches the same CPU, nothing bad happens, as that CPU still has its interrupts globally disabled, and we will only take the new interrupt once the interrupt has been EOIed. However, things become more "interesting" if an affinity change comes in while the interrupt is being handled. More specifically, while the per-irq lock is being dropped. This results in the affinity change taking place immediately. At this point, there is nothing that prevents the interrupt from firing on the new target CPU. We end-up with the interrupt running concurrently on two CPUs, which isn't a good thing. And that's where things become worse: the new CPU notices that the interrupt handling is in progress (irq_may_run() return false), and drops the interrupt on the floor. The whole race looks like this: CPU 0 \| CPU 1 -----------------------------\|----------------------------- interrupt start \| handle_fasteoi_irq \| set_affinity(CPU 1) handler \| ... \| interrupt start ... \| handle_fasteoi_irq -> early out handle_fasteoi_irq return \| interrupt end interrupt end \| If the interrupt was an edge, too bad. The interrupt is lost, and the system will eventually die one way or another. Not great. A way to avoid this situation is to detect this problem at the point we handle the interrupt on the new target. Instead of dropping the interrupt, use the resend mechanism to force it to be replayed. Also, in order to limit the impact of this workaround to the pathetic architectures that require it, gate it behind a new irq flag aptly named IRQD_RESEND_WHEN_IN_PROGRESS. Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: James Gowans <jgowans@amazon.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <maz@kernel.org> Cc: KarimAllah Raslan <karahmed@amazon.com> Cc: Yipeng Zou <zouyipeng@huawei.com> Cc: Zhang Jianhua <chris.zjh@huawei.com> [maz: reworded commit mesage] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230608120021.3273400-3-jgowans@amazon.com
2023-06-16	genirq: Expand doc for PENDING and REPLAY flags	James Gowans
	Adding a bit more info about what the flags are used for may help future code readers. Signed-off-by: James Gowans <jgowans@amazon.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Marc Zyngier <maz@kernel.org> Cc: Liao Chang <liaochang1@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230608120021.3273400-2-jgowans@amazon.com
2023-06-16	genirq: Use BIT() for the IRQD_* state flags	Marc Zyngier
	As we're about to use the last bit available in the IRQD_* state flags, rewrite these flags with BIT(), which ensures that these constant do not represent a signed value. Signed-off-by: Marc Zyngier <maz@kernel.org>
2023-06-16	perf: arm_cspmu: Add missing MODULE_DEVICE_TABLE	Ilkka Koskinen
	Add missing MODULE_DEVICE_TABLE definition to generate modalias, which enables module autoloading. Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/20230615232630.304870-1-ilkka@os.amperecomputing.com Signed-off-by: Will Deacon <will@kernel.org>
2023-06-16	perf/arm-cmn: Add sysfs identifier	Robin Murphy
	Expose a sysfs identifier encapsulating the CMN part number and revision so that jevents can narrow down a fundamental set of possible events for calculating metrics. Configuration-dependent aspects - such as whether a given node type is present, and/or a given node ID is valid - are still not covered, and in general it's hard to see how userspace could handle them, so we won't be removing any data or validation logic from the driver any time soon, but at least it's a step in a useful direction. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-and-tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Tested-by: Jing Zhang <renyu.zj@linux.alibaba.com> Link: https://lore.kernel.org/r/b8a14c14fcdf028939ebf57849863e8ae01743de.1686588640.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-06-16	perf/arm-cmn: Revamp model detection	Robin Murphy
	CMN implements a set of CoreSight-format peripheral ID registers which in principle we should be able to use to identify the hardware. However so far we have avoided trying to use the part number field since the TRMs have all described it as "configuration dependent". It turns out, though, that this is a quirk of the documentation generation process, and in fact the part number should always be a stable well-defined field which we can trust. To that end, revamp our model detection to rely less on ACPI/DT, and pave the way towards further using the hardware information as an identifier for userspace jevent metrics. This includes renaming the revision constants to maximise readability. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-and-tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com> Link: https://lore.kernel.org/r/3c791eaae814b0126f9adbd5419bfb4a600dade7.1686588640.git.robin.murphy@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-06-16	perf/arm_dmc620: Add cpumask	Xin Yang
	Add a cpumask for the DMC620 PMU. As it is an uncore PMU, perf userspace tool only needs to open a single counter on the CPU specified by the CPU mask for each event on a given DMC620 device. Signed-off-by: Xin Yang <xin.yang@arm.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20230613013423.2078397-1-xin.yang@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2023-06-16	x86/xen: Set default memory type for PV guests to WB	Juergen Gross
	When running as an unprivileged PV guest under Xen (not dom0), the default MTRR memory type should be write-back. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Link: https://lore.kernel.org/r/20230615123959.12298-1-jgross@suse.com
2023-06-16	x86/mm: Remove unused current_untag_mask()	Borislav Petkov (AMD)
	e0bddc19ba95 ("x86/mm: Reduce untagged_addr() overhead for systems without LAM") removed its only usage site so drop it. Move the tlbstate_untag_mask up in the header and drop the ugly ifdeffery as the unused declaration should be properly discarded. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Link: https://lore.kernel.org/r/20230614174148.5439-1-bp@alien8.de
2023-06-16	xfrm: Linearize the skb after offloading if needed.	Sebastian Andrzej Siewior
	With offloading enabled, esp_xmit() gets invoked very late, from within validate_xmit_xfrm() which is after validate_xmit_skb() validates and linearizes the skb if the underlying device does not support fragments. esp_output_tail() may add a fragment to the skb while adding the auth tag/ IV. Devices without the proper support will then send skb->data points to with the correct length so the packet will have garbage at the end. A pcap sniffer will claim that the proper data has been sent since it parses the skb properly. It is not affected with INET_ESP_OFFLOAD disabled. Linearize the skb after offloading if the sending hardware requires it. It was tested on v4, v6 has been adopted. Fixes: 7785bba299a8d ("esp: Add a software GRO codepath") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2023-06-16	x86/fpu: Move FPU initialization into arch_cpu_finalize_init()	Thomas Gleixner
	Initializing the FPU during the early boot process is a pointless exercise. Early boot is convoluted and fragile enough. Nothing requires that the FPU is set up early. It has to be initialized before fork_init() because the task_struct size depends on the FPU register buffer size. Move the initialization to arch_cpu_finalize_init() which is the perfect place to do so. No functional change. This allows to remove quite some of the custom early command line parsing, but that's subject to the next installment. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224545.902376621@linutronix.de
2023-06-16	x86/fpu: Mark init functions __init	Thomas Gleixner
	No point in keeping them around. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224545.841685728@linutronix.de
2023-06-16	x86/fpu: Remove cpuinfo argument from init functions	Thomas Gleixner
	Nothing in the call chain requires it Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224545.783704297@linutronix.de
2023-06-16	x86/init: Initialize signal frame size late	Thomas Gleixner
	No point in doing this during really early boot. Move it to an early initcall so that it is set up before possible user mode helpers are started during device initialization. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224545.727330699@linutronix.de
2023-06-16	init, x86: Move mem_encrypt_init() into arch_cpu_finalize_init()	Thomas Gleixner
	Invoke the X86ism mem_encrypt_init() from X86 arch_cpu_finalize_init() and remove the weak fallback from the core code. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224545.670360645@linutronix.de
2023-06-16	init: Invoke arch_cpu_finalize_init() earlier	Thomas Gleixner
	X86 is reworking the boot process so that initializations which are not required during early boot can be moved into the late boot process and out of the fragile and restricted initial boot phase. arch_cpu_finalize_init() is the obvious place to do such initializations, but arch_cpu_finalize_init() is invoked too late in start_kernel() e.g. for initializing the FPU completely. fork_init() requires that the FPU is initialized as the size of task_struct on X86 depends on the size of the required FPU register buffer. Fortunately none of the init calls between calibrate_delay() and arch_cpu_finalize_init() is relevant for the functionality of arch_cpu_finalize_init(). Invoke it right after calibrate_delay() where everything which is relevant for arch_cpu_finalize_init() has been set up already. No functional change intended. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Link: https://lore.kernel.org/r/20230613224545.612182854@linutronix.de
2023-06-16	init: Remove check_bugs() leftovers	Thomas Gleixner
	Everything is converted over to arch_cpu_finalize_init(). Remove the check_bugs() leftovers including the empty stubs in asm-generic, alpha, parisc, powerpc and xtensa. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20230613224545.553215951@linutronix.de
2023-06-16	um/cpu: Switch to arch_cpu_finalize_init()	Thomas Gleixner
	check_bugs() is about to be phased out. Switch over to the new arch_cpu_finalize_init() implementation. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Richard Weinberger <richard@nod.at> Link: https://lore.kernel.org/r/20230613224545.493148694@linutronix.de
2023-06-16	sparc/cpu: Switch to arch_cpu_finalize_init()	Thomas Gleixner
	check_bugs() is about to be phased out. Switch over to the new arch_cpu_finalize_init() implementation. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://lore.kernel.org/r/20230613224545.431995857@linutronix.de
2023-06-16	sh/cpu: Switch to arch_cpu_finalize_init()	Thomas Gleixner
	check_bugs() is about to be phased out. Switch over to the new arch_cpu_finalize_init() implementation. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224545.371697797@linutronix.de
2023-06-16	mips/cpu: Switch to arch_cpu_finalize_init()	Thomas Gleixner
	check_bugs() is about to be phased out. Switch over to the new arch_cpu_finalize_init() implementation. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224545.312438573@linutronix.de
2023-06-16	m68k/cpu: Switch to arch_cpu_finalize_init()	Thomas Gleixner
	check_bugs() is about to be phased out. Switch over to the new arch_cpu_finalize_init() implementation. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/20230613224545.254342916@linutronix.de
2023-06-16	loongarch/cpu: Switch to arch_cpu_finalize_init()	Thomas Gleixner
	check_bugs() is about to be phased out. Switch over to the new arch_cpu_finalize_init() implementation. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224545.195288218@linutronix.de