Age        Commit message        Author
2024-09-04arm64: enable the Permission Overlay Extension for EL0Joey Gouly
Expose a HWCAP and ID_AA64MMFR3_EL1_S1POE to userspace, so they can be used to check if the CPU supports the feature. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20240822151113.1479789-12-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
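For illustration, a userspace check for this could look like the sketch below (the HWCAP2_POE bit value is an assumption; the authoritative definition lives in asm/hwcap.h):

    #include <stdio.h>
    #include <sys/auxv.h>

    #ifndef HWCAP2_POE
    #define HWCAP2_POE (1UL << 47)   /* assumed value; check asm/hwcap.h */
    #endif

    int main(void)
    {
            if (getauxval(AT_HWCAP2) & HWCAP2_POE)
                    printf("FEAT_S1POE is supported\n");
            return 0;
    }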
2024-09-04mm: use ARCH_PKEY_BITS to define VM_PKEY_BITNJoey Gouly
Use the new CONFIG_ARCH_PKEY_BITS to simplify setting these bits for different architectures. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: linux-fsdevel@vger.kernel.org Cc: linux-mm@kvack.org Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20240822151113.1479789-4-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
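The pattern being simplified looks roughly like this (mirroring include/linux/mm.h; exact details may differ):

    #ifdef CONFIG_ARCH_HAS_PKEYS
    # define VM_PKEY_SHIFT  VM_HIGH_ARCH_BIT_0
    # define VM_PKEY_BIT0   VM_HIGH_ARCH_0
    # define VM_PKEY_BIT1   VM_HIGH_ARCH_1
    # define VM_PKEY_BIT2   VM_HIGH_ARCH_2
    /* one Kconfig test instead of per-arch #ifdefs: */
    # if CONFIG_ARCH_PKEY_BITS > 3
    #  define VM_PKEY_BIT3  VM_HIGH_ARCH_3
    # else
    #  define VM_PKEY_BIT3  0
    # endif
    # if CONFIG_ARCH_PKEY_BITS > 4
    #  define VM_PKEY_BIT4  VM_HIGH_ARCH_4
    # else
    #  define VM_PKEY_BIT4  0
    # endif
    #endif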
2024-09-04x86/mm: add ARCH_PKEY_BITS to KconfigJoey Gouly
The new config option specifies how many bits are in each PKEY. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: x86@kernel.org Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20240822151113.1479789-3-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-09-04powerpc/mm: add ARCH_PKEY_BITS to KconfigJoey Gouly
The new config option specifies how many bits are in each PKEY. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Aneesh Kumar K.V <aneesh.kumar@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Acked-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20240822151113.1479789-2-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-09-04KVM: selftests: get-reg-list: add Permission Overlay registersJoey Gouly
Add new system registers: - POR_EL1 - POR_EL0 Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Shuah Khan <shuah@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240822151113.1479789-31-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-09-04KVM: arm64: Sanitise ID_AA64MMFR3_EL1Joey Gouly
Add the missing sanitisation of ID_AA64MMFR3_EL1, making sure we solely expose S1POE and TCRX (we currently don't support anything else). [joey: Took Marc's patch for S1PIE, and changed it for S1POE] Signed-off-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Joey Gouly <joey.gouly@arm.com> Link: https://lore.kernel.org/r/20240822151113.1479789-11-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-09-04KVM: arm64: use `at s1e1a` for POEJoey Gouly
FEAT_ATS1E1A introduces a new instruction: `at s1e1a`. This is an address translation, without permission checks. POE allows read permissions to be removed from S1 by the guest. This means that an `at` instruction could fail, and not get the IPA. Switch to using `at s1e1a` so that KVM can get the IPA regardless of S1 permissions. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240822151113.1479789-10-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
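A sketch of the resulting lookup (assuming the assembler understands the FEAT_ATS1A mnemonic; older toolchains would need the raw sys-instruction encoding instead):

    static u64 par_from_s1e1a(u64 va)
    {
            u64 par;

            /* stage-1 EL1 translation that ignores stage-1 permissions */
            asm volatile("at s1e1a, %0" : : "r" (va));
            isb();
            par = read_sysreg(par_el1);  /* PAR_EL1.F == 0 => PA in [51:12] */
            return par;
    }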
2024-09-04KVM: arm64: Save/restore POE registersJoey Gouly
Define the new system registers that POE introduces and context switch them. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20240822151113.1479789-8-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-09-04arm64: context switch POR_EL0 registerJoey Gouly
POR_EL0 is a register that can be modified by userspace directly, so it must be context switched. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20240822151113.1479789-7-joey.gouly@arm.com [will: Dropped unnecessary isb()s] Signed-off-by: Will Deacon <will@kernel.org>
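A hedged sketch of the switch step (helper and field names follow the description; the exact code may differ):

    static void permission_overlay_switch(struct task_struct *next)
    {
            if (!system_supports_poe())
                    return;

            /* POR_EL0 is user-writable: save the outgoing task's live value */
            current->thread.por_el0 = read_sysreg_s(SYS_POR_EL0);
            if (current->thread.por_el0 != next->thread.por_el0)
                    write_sysreg_s(next->thread.por_el0, SYS_POR_EL0);
    }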
2024-09-04arm64: cpufeature: add Permission Overlay Extension cpucapJoey Gouly
This indicates if the system supports POE. This is a CPUCAP_BOOT_CPU_FEATURE as the boot CPU will enable POE if it has it, so secondary CPUs must also have this feature. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20240822151113.1479789-6-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-09-04arm64: disable trapping of POR_EL0 to EL2Joey Gouly
Allow EL0 or EL1 to access POR_EL0 without being trapped to EL2. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20240822151113.1479789-5-joey.gouly@arm.com [will: Rename Lset_poe_fgt to Lskip_pie_fgt to ease merge with for-next/misc] Signed-off-by: Will Deacon <will@kernel.org>
2024-09-04cpufreq: intel_pstate: Set asymmetric CPU capacity on hybrid systemsRafael J. Wysocki
Make intel_pstate use the HWP_HIGHEST_PERF values from MSR_HWP_CAPABILITIES to set asymmetric CPU capacity information via the previously introduced arch_set_cpu_capacity() on hybrid systems without SMT.

Setting asymmetric CPU capacity is generally necessary to allow the scheduler to compute task sizes in a consistent way across all CPUs in a system where they differ by capacity. That, in turn, should help to improve scheduling decisions. It is also necessary for the schedutil cpufreq governor to operate as expected on hybrid systems where tasks migrate between CPUs of different capacities.

The underlying observation is that intel_pstate already uses MSR_HWP_CAPABILITIES to get CPU performance information, which it exposes via sysfs and on which CPU performance scaling is based. Thus using this information for setting asymmetric CPU capacity is consistent with what the driver has been doing already. Moreover, HWP_HIGHEST_PERF reflects the maximum capacity of a given CPU, including both the instructions-per-cycle (IPC) factor and the maximum turbo frequency, and the units in which that value is expressed are the same for all CPUs in the system, so the maximum capacity ratio between two CPUs can be obtained by computing the ratio of their HWP_HIGHEST_PERF values. Of course, in principle that capacity ratio need not be directly applicable at lower frequencies, so using it for providing the asymmetric CPU capacity information to the scheduler is a rough approximation, but it is as good as it gets. Also, measurements indicate that this approximation is not too bad in practice.

If the given system is hybrid and non-SMT, the new code disables ITMT support in the scheduler (because it may get in the way of the asymmetric CPU capacity code that gets enabled automatically by setting asymmetric CPU capacity), initializes all online CPUs, and finds the one with the maximum HWP_HIGHEST_PERF value. Next, it computes the capacity number for each (online) CPU by dividing the product of its HWP_HIGHEST_PERF and SCHED_CAPACITY_SCALE by the maximum HWP_HIGHEST_PERF.

When a CPU goes offline, its capacity is reset to SCHED_CAPACITY_SCALE and, if it is the one with the maximum HWP_HIGHEST_PERF value, the capacity numbers for all of the other online CPUs are recomputed. This also takes care of a cleanup during driver operation mode changes. Analogously, when a new CPU goes online, its capacity number is updated and, if its HWP_HIGHEST_PERF value is greater than the current maximum one, the capacity numbers for all of the other online CPUs are recomputed.

The case when the driver is notified of a CPU capacity change, either through the HWP interrupt or through an ACPI notification, is handled similarly to the CPU online case above, except that if the target CPU is the current highest-capacity one and its capacity is reduced, the capacity numbers for all of the other online CPUs need to be recomputed as well.

If the driver's "no_turbo" sysfs attribute is updated, all of the CPU capacity information is computed from scratch to reflect the new turbo status.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Tested-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> # scale invariance Link: https://patch.msgid.link/1979653.PYKUYFuaPT@rjwysocki.net [ rjw: Fixed a typo in the changelog ] [ rjw: Renamed 3 new functions and added a comment ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
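Per the description above, each CPU's capacity number is then, in effect (a sketch, not the driver's literal code):

    static unsigned long hybrid_cpu_capacity(u32 perf, u32 max_perf)
    {
            /* capacity = HWP_HIGHEST_PERF(cpu) * SCHED_CAPACITY_SCALE /
             *            max-over-all-cpus(HWP_HIGHEST_PERF)
             */
            return div_u64((u64)perf * SCHED_CAPACITY_SCALE, max_perf);
    }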
2024-09-04Documentation: livepatch: Correct release locks antonymBagas Sanjaya
"get" doesn't properly fit as an antonym for "release" in the context of locking. Correct it with "acquire". Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240903024753.104609-1-bagasdotme@gmail.com Signed-off-by: Petr Mladek <pmladek@suse.com>
2024-09-04x86/sched: Add basic support for CPU capacity scalingRafael J. Wysocki
In order to be able to compute the sizes of tasks consistently across all CPUs in a hybrid system, it is necessary to provide CPU capacity scaling information to the scheduler via arch_scale_cpu_capacity(). Moreover, the value returned by arch_scale_freq_capacity() for the given CPU must correspond to the arch_scale_cpu_capacity() return value for it, or utilization computations will be inaccurate.

Add support for this through per-CPU variables holding the capacity and maximum-to-base frequency ratio (times SCHED_CAPACITY_SCALE) that will be returned by arch_scale_cpu_capacity() and used by scale_freq_tick() to compute arch_freq_scale for the current CPU, respectively.

In order to avoid adding measurable overhead for non-hybrid x86 systems, which are the vast majority in the field, whether or not the new hybrid CPU capacity scaling will be in effect is controlled by a static key. This static key is set by calling arch_enable_hybrid_capacity_scale(), which also allocates memory for the per-CPU data and initializes it. Next, arch_set_cpu_capacity() is used to set the per-CPU variables mentioned above for each CPU, and arch_rebuild_sched_domains() needs to be called for the scheduler to realize that capacity-aware scheduling can be used going forward.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> Tested-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com> # scale invariance Link: https://patch.msgid.link/10523497.nUPlyArG6x@rjwysocki.net [ rjw: Added parens to function kerneldoc comments ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
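A rough sketch of the shape this takes (names approximate the description above, not necessarily the final code):

    static DEFINE_STATIC_KEY_FALSE(arch_hybrid_cap_scale_key);

    struct hybrid_cpu_scale {
            unsigned long capacity;    /* arch_scale_cpu_capacity() value */
            unsigned long freq_ratio;  /* max-to-base ratio * SCHED_CAPACITY_SCALE */
    };
    static struct hybrid_cpu_scale __percpu *hybrid_scale;

    unsigned long arch_scale_cpu_capacity(int cpu)
    {
            if (static_branch_unlikely(&arch_hybrid_cap_scale_key))
                    return READ_ONCE(per_cpu_ptr(hybrid_scale, cpu)->capacity);

            return SCHED_CAPACITY_SCALE;  /* non-hybrid: all CPUs equal */
    }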
2024-09-04ASoC: mediatek: mt2701-cs42448: Optimize redundant code in mt2701_cs42448_machine_probeLiu Jing
Utilize the defined parameter 'dev' to make the code cleaner. Signed-off-by: Liu Jing <liujing@cmss.chinamobile.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Link: https://patch.msgid.link/20240903093623.7120-1-liujing@cmss.chinamobile.com Signed-off-by: Mark Brown <broonie@kernel.org>
2024-09-04ALSA: hda/realtek: Support mute LED on HP Laptop 14-dq2xxxMaximilien Perreault
The mute LED on this HP laptop uses ALC236 and requires a quirk to function. This patch enables the existing quirk for the device. Signed-off-by: Maximilien Perreault <maximilienperreault@gmail.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20240904031013.21220-1-maximilienperreault@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-09-04ALSA: hda/realtek: Enable Mute Led for HP Victus 15-fb1xxxAdam Queler
The mute led is controlled by ALC245. This patch enables the already existing quirk for this device. Signed-off-by: Adam Queler <queler+k@gmail.com> Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20240903202419.31433-1-queler+k@gmail.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2024-09-04Merge branch 'sparx5-fdma-part-one'David S. Miller
Daniel Machon says:

====================
net: microchip: add FDMA library and use it for Sparx5

This patch series is the first of a 2-part series that adds a new common FDMA library for the Microchip switch chips Sparx5 and lan966x. These chips share the same FDMA engine, and as such will benefit from a common library with a common implementation. This also has the benefit of removing a lot of open-coded bookkeeping and duplicate code for the two drivers. Additionally, upstreaming efforts for a third chip, lan969x, will begin in the near future. This chip will use the new library too. In this first series, the FDMA library is introduced and used by the Sparx5 switch driver.

Example of use (see the sketch after this list):

 - Initialize the rx and tx fdma structs with values for: number of DCBs, number of DBs, channel ID, DB size (data buffer size), and total size of the requested memory. Also provide two callbacks: nextptr_cb() and dataptr_cb() for getting the nextptr and dataptr.
 - Allocate memory using fdma_alloc_phys() or fdma_alloc_coherent().
 - Initialize the DCBs with fdma_dcb_init().
 - Add new DCBs with fdma_dcb_add().
 - Free memory with fdma_free_phys() or fdma_free_coherent().

Patch breakdown:

 - Patch #1: introduces the library and selects it for Sparx5.
 - Patch #2: includes the fdma_api.h header and removes old symbols.
 - Patch #3: replaces old rx and tx variables with equivalent ones from the fdma struct. Only the variables that can be changed without breaking traffic are changed in this patch.
 - Patch #4: uses the library for allocation of rx buffers. This requires quite a bit of refactoring in this single patch.
 - Patch #5: uses the library for adding DCBs in the rx path.
 - Patch #6: uses the library for freeing rx buffers.
 - Patch #7: uses the library helpers in the rx path.
 - Patch #8: uses the library for allocation of tx buffers. This requires quite a bit of refactoring in this single patch.
 - Patch #9: uses the library for adding DCBs in the tx path.
 - Patch #10: uses the library helpers in the tx path.
 - Patch #11: ditches the existing linked list for storing buffer addresses, and instead uses offsets into contiguous memory.
 - Patch #12: modifies existing rx and tx functions to be direction independent.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
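Condensed into code, the flow above might look roughly like this for an rx channel (a sketch based purely on the cover letter; struct fields, flag macros and default values are assumptions, not the library's verbatim API):

    static int my_rx_setup(struct fdma *fdma)
    {
            fdma->n_dcbs = 64;                     /* number of DCBs */
            fdma->n_dbs = 15;                      /* DBs per DCB */
            fdma->channel_id = 0;                  /* extraction channel */
            fdma->db_size = 2048;                  /* data buffer size */
            fdma->ops.nextptr_cb = my_nextptr_cb;  /* hands out the nextptr */
            fdma->ops.dataptr_cb = my_dataptr_cb;  /* hands out the dataptr */

            if (fdma_alloc_phys(fdma))             /* or fdma_alloc_coherent() */
                    return -ENOMEM;

            fdma_dcb_init(fdma, FDMA_DCB_INFO_DATAL(fdma->db_size),
                          FDMA_DCB_STATUS_INTR);   /* flag names assumed */
            return 0;
    }

Teardown would mirror this with fdma_free_phys() or fdma_free_coherent().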
2024-09-04net: sparx5: ditch sparx5_fdma_rx/tx_reload() functionsDaniel Machon
These direction specific functions can be ditched in favor of a single function: sparx5_fdma_reload(), which retrieves the channel id from the fdma struct instead. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-04net: sparx5: use contiguous memory for tx buffersDaniel Machon
Currently, the driver uses a linked list for storing the tx buffer addresses. This requires a good amount of extra bookkeeping code. Ditch the linked list in favor of tx buffers being in the same contiguous memory space as the DCB's and the DB's. The FDMA library has a helper for this - so use that. The tx buffer addresses are now retrieved as an offset into the FDMA memory space. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
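The bookkeeping this removes boils down to an offset computation of the following shape (an illustration of the idea, not the library's actual helper; field names are assumptions):

    /* DCBs (with their DBs embedded) come first in the single allocation;
     * tx data buffers follow, so a buffer address is just an offset.
     */
    static void *tx_buf_addr(struct fdma *fdma, int dcb_idx, int db_idx)
    {
            size_t dcb_area = fdma->n_dcbs * sizeof(struct fdma_dcb);
            size_t off = dcb_area +
                         (dcb_idx * fdma->n_dbs + db_idx) * fdma->db_size;

            return (u8 *)fdma->dcbs + off;  /* 'dcbs' = CPU address of region */
    }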
2024-09-04net: sparx5: use library helper for freeing tx buffersDaniel Machon
The library has the helper fdma_free_phys() for freeing physical FDMA memory. Use it in the exit path. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-04net: sparx5: use FDMA library for adding DCB's in the tx pathDaniel Machon
Use the fdma_dcb_add() function to add DCBs in the tx path. This gets rid of the open-coding of nextptr and dataptr handling and leaves it to the library. Also, make sure the fdma indexes are advanced using fdma_dcb_advance(), so that the correct nextptr and dataptr offsets are retrieved. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-04net: sparx5: use the FDMA library for allocation of tx buffersDaniel Machon
Use the two functions: fdma_alloc_phys() and fdma_dcb_init() for tx buffer allocation and use the new buffers throughout. In order to replace the old buffers with the new ones, we have to do the following refactoring: - use fdma_alloc_phys() and fdma_dcb_init() - replace the variables: tx->dma, tx->first_entry and tx->curr_entry with the equivalents from the FDMA struct. - replace uses of sparx5_db_hw and sparx5_tx_dcb_hw with fdma_db and fdma_dcb. - add sparx5_fdma_tx_dataptr_cb callback for obtaining the dataptr. - Initialize FDMA struct values. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-04net: sparx5: use a few FDMA helpers in the rx pathDaniel Machon
The library provides helpers for a number of DCB and DB operations. Use these in the rx path. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-04net: sparx5: use library helper for freeing rx buffersDaniel Machon
The library has the helper fdma_free_phys() for freeing physical FDMA memory. Use it in the exit path. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-04net: sparx5: use FDMA library for adding DCB's in the rx pathDaniel Machon
Use the fdma_dcb_add() function to add DCBs in the rx path. This gets rid of the open-coding of nextptr and dataptr handling and leaves it to the library. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-04net: sparx5: use the FDMA library for allocation of rx buffersDaniel Machon
Use the two functions: fdma_alloc_phys() and fdma_dcb_init() for rx buffer allocation and use the new buffers throughout. In order to replace the old buffers with the new ones, we have to do the following refactoring: - use fdma_alloc_phys() and fdma_dcb_init() - replace the variables: rx->dma, rx->dcb_entries and rx->last_entry with the equivalents from the FDMA struct. - replace uses of sparx5_db_hw and sparx5_rx_dcb_hw with fdma_db and fdma_dcb. - add sparx5_fdma_rx_dataptr_cb callback for obtaining the dataptr. - Initialize FDMA struct values. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
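A dataptr callback for rx could plausibly take this shape (a sketch; the exact signature and fields are assumptions):

    static int sparx5_fdma_rx_dataptr_cb(struct fdma *fdma, int dcb, int db,
                                         u64 *dataptr)
    {
            struct sparx5 *sparx5 = fdma->priv;  /* assumed back-pointer */
            struct sparx5_rx *rx = &sparx5->rx;
            struct sk_buff *skb;

            skb = __netdev_alloc_skb(rx->ndev, fdma->db_size, GFP_ATOMIC);
            if (unlikely(!skb))
                    return -ENOMEM;
            rx->skb[dcb][db] = skb;              /* remember for the rx path */
            *dataptr = virt_to_phys(skb->data);  /* DB dataptr = DMA address */
            return 0;
    }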
2024-09-04net: sparx5: replace a few variables with new equivalent onesDaniel Machon
Replace the old rx and tx variables: channel_id, FDMA_DCB_MAX, FDMA_RX_DCB_MAX_DBS, FDMA_TX_DCB_MAX_DBS, dcb_index and db_index with the equivalents from the FDMA rx and tx structs. These variables are not entangled in any buffer allocation and can therefore be replaced in advance. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-04net: sparx5: use FDMA library symbolsDaniel Machon
Include and use the new FDMA header, which now provides the required masks and bit offsets for operating on the DCBs and DBs. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-04net: microchip: add FDMA libraryDaniel Machon
Add new FDMA library for interacting with the FDMA engine on Microchip Sparx5 and lan966x switch chips, in an effort to reduce duplicate code and provide a common set of symbols and functions. Signed-off-by: Daniel Machon <daniel.machon@microchip.com> Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com> Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com> Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-09-04net: mana: Fix error handling in mana_create_txq/rxq's NAPI cleanupSouradeep Chakrabarti
Currently napi_disable() gets called during rxq and txq cleanup, even before napi is enabled and the hrtimer is initialized. This causes a kernel panic:

 ? page_fault_oops+0x136/0x2b0
 ? page_counter_cancel+0x2e/0x80
 ? do_user_addr_fault+0x2f2/0x640
 ? refill_obj_stock+0xc4/0x110
 ? exc_page_fault+0x71/0x160
 ? asm_exc_page_fault+0x27/0x30
 ? __mmdrop+0x10/0x180
 ? __mmdrop+0xec/0x180
 ? hrtimer_active+0xd/0x50
 hrtimer_try_to_cancel+0x2c/0xf0
 hrtimer_cancel+0x15/0x30
 napi_disable+0x65/0x90
 mana_destroy_rxq+0x4c/0x2f0
 mana_create_rxq.isra.0+0x56c/0x6d0
 ? mana_uncfg_vport+0x50/0x50
 mana_alloc_queues+0x21b/0x320
 ? skb_dequeue+0x5f/0x80

Cc: stable@vger.kernel.org Fixes: e1b5683ff62e ("net: mana: Move NAPI from EQ to CQ") Signed-off-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
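The underlying rule, sketched generically (names are illustrative, not the driver's exact fix):

    static void mana_napi_teardown(struct napi_struct *napi,
                                   bool napi_initialized)
    {
            /* napi_disable() cancels an hrtimer that only napi_enable()
             * initializes and arms; calling it on a never-enabled instance
             * dereferences uninitialized state and panics.
             */
            if (!napi_initialized)
                    return;

            napi_disable(napi);
            netif_napi_del(napi);
    }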
2024-09-04printk: nbcon: Use raw_cpu_ptr() instead of open codingJohn Ogness
There is no need to open code a non-migration-checking this_cpu_ptr(). That is exactly what raw_cpu_ptr() is. Signed-off-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/87plpum4jw.fsf@jogness.linutronix.de Signed-off-by: Petr Mladek <pmladek@suse.com>
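For illustration (the per-CPU variable and pointer names are made up):

    /* before: open-coded to avoid this_cpu_ptr()'s migration check */
    ctxt = per_cpu_ptr(&nbcon_pcpu_state, smp_processor_id());

    /* after: raw_cpu_ptr() is precisely that non-checking access */
    ctxt = raw_cpu_ptr(&nbcon_pcpu_state);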
2024-09-04cpu: Fix W=1 build kernel-doc warningThorsten Blum
Building the kernel with W=1 generates the following warning: kernel/cpu.c:2693: warning: This comment starts with '/**', but isn't a kernel-doc comment. The function topology_is_core_online() is a simple helper function and doesn't need a kernel-doc comment. Use a normal comment instead. Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20240825221152.71951-2-thorsten.blum@toblux.com
2024-09-04Merge remote-tracking branch 'kvmarm/arm64-shared-6.12' into for-next/poeWill Deacon
Pull in the AT instruction conversion patch from the KVM arm64 tree, as this is a shared dependency between the POE series from Joey and the AT emulation series for Nested Virtualisation from Marc.
2024-09-04Merge branch 'linus' into smp/coreThomas Gleixner
Pull in upstream changes so further patches don't conflict.
2024-09-04timers: Annotate possible non critical data race of next_expiryAnna-Maria Behnsen
Global timers could be expired remotely when the target CPU is idle. After a remote timer expiry, the remote timer_base->next_expiry value is updated while holding the timer_base->lock. When the formerly idle CPU becomes active at the same time and checks whether timers need to expire, this check is done locklessly, as it is on the local CPU. This could lead to a data race, which was reported by syzbot: https://lore.kernel.org/r/000000000000916e55061f969e14@google.com

When the value is read locklessly but changed by the remote CPU, only two non-critical scenarios can happen: 1) The already updated value is read -> everything is perfect. 2) The old value is read -> a superfluous timer soft interrupt is raised.

The same situation can happen when a remote CPU enqueues a new first pinned timer, again with only non-critical scenarios: 1) The already updated value is read -> everything is perfect. 2) The old value is read -> when the CPU is idle, an IPI is executed nevertheless, and when the CPU isn't idle, the updated value will be visible on the next tick and the timer might be late by one jiffy.

As this is very unlikely to happen, the overhead of doing the check under the lock is far greater than that of a superfluous timer soft interrupt or a possible one-jiffy delay of the timer.

Document and annotate this non-critical behavior in the code by using a READ_ONCE()/WRITE_ONCE() pair when accessing timer_base->next_expiry.

Reported-by: syzbot+bf285fcc0a048e028118@syzkaller.appspotmail.com Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/all/20240829154305.19259-1-anna-maria@linutronix.de Closes: https://lore.kernel.org/lkml/000000000000916e55061f969e14@google.com
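The resulting annotation pattern, sketched:

    /* remote CPU: the update is still done under timer_base->lock */
    WRITE_ONCE(base->next_expiry, bucket_expiry);

    /* local CPU: lockless read; a stale value only causes a spurious
     * softirq or a one-jiffy-late timer, both harmless
     */
    if (time_after_eq(jiffies, READ_ONCE(base->next_expiry)))
            raise_softirq(TIMER_SOFTIRQ);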
2024-09-04printk: Use the BITS_PER_LONG macroJinjie Ruan
sizeof(unsigned long) * 8 is the number of bits in an unsigned long variable, replace it with BITS_PER_LONG macro to make it simpler. Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Reviewed-by: John Ogness <john.ogness@linutronix.de> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://lore.kernel.org/r/20240903035358.308482-1-ruanjinjie@huawei.com Signed-off-by: Petr Mladek <pmladek@suse.com>
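For example (a generic before/after, not the patched line itself):

    unsigned int max_bits;

    /* before */
    max_bits = sizeof(unsigned long) * 8;

    /* after: same value, states the intent directly */
    max_bits = BITS_PER_LONG;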
2024-09-04Merge tag 'ti-driver-soc-for-v6.12' of https://git.kernel.org/pub/scm/linux/kernel/git/ti/linux into soc/driversArnd Bergmann
TI SoC driver updates for v6.12

 - pm33xx/knav_qmss_queue/pruss: scope-based device_node cleanups
 - knav: Additional fixes around OF property handling
 - k3-ringacc: Optimizations for data structure

* tag 'ti-driver-soc-for-v6.12' of https://git.kernel.org/pub/scm/linux/kernel/git/ti/linux:
  soc: ti: pm33xx: do device_node auto cleanup
  soc: ti: knav_qmss_queue: do device_node auto cleanup
  soc: ti: pruss: do device_node auto cleanup
  soc: ti: pruss: factor out memories setup
  soc: ti: knav: Use of_property_read_variable_u32_array()
  soc: ti: knav: Drop unnecessary check for property presence
  soc: ti: k3-ringacc: Constify struct k3_ring_ops

Link: https://lore.kernel.org/r/20240903155632.525twuumykwnfkiz@subtitle Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-09-04iommu/amd: Do not set the D bit on AMD v2 table entriesJason Gunthorpe
The manual says that bit 6 is IGN for all Page-Table Base Address pointers, don't set it. Fixes: aaac38f61487 ("iommu/amd: Initial support for AMD IOMMU v2 page table") Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/14-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Correct the reported page sizes from the V1 tableJason Gunthorpe
The HW only has 52 bits of physical address support, so the supported page sizes should not have bits set beyond this. Further, the spec says that the 6th level does not support any "default page size for translation entries", meaning leaf entries in the 6th level are not allowed either. Rework the definition to use GENMASK to build the range of supported pages from the top of physical to 4k. Nothing ever uses such large pages, so this is a cosmetic/documentation improvement only. Reported-by: Joao Martins <joao.m.martins@oracle.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/13-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
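The reworked definition plausibly reduces to a single GENMASK (the macro name here is assumed):

    /* every page size from 4K up to the 52-bit PA limit; larger sizes
     * (which would need level-6 leaf entries) are excluded by construction
     */
    #define AMD_IOMMU_PGSIZES_V1    GENMASK_ULL(51, 12)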
2024-09-04iommu/amd: Remove the confusing dummy iommu_flush_ops tlb opsJason Gunthorpe
The iommu driver is supposed to provide these ops to its io_pgtable implementation so that it can hook the invalidations and do the right thing. They are called by wrapper functions like io_pgtable_tlb_add_page() etc, which the AMD code never calls. Instead it directly calls the AMD IOMMU invalidation functions by casting to the struct protection_domain. Remove it all. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/12-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Fix typo of , instead of ;Jason Gunthorpe
Generates the same code, but is not the expected C style. Fixes: aaac38f61487 ("iommu/amd: Initial support for AMD IOMMU v2 page table") Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/11-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Remove conditions from domain free pathsJason Gunthorpe
Don't use tlb as some flag to indicate if protection_domain_alloc() completed. Have protection_domain_alloc() unwind itself in the normal kernel style and require protection_domain_free() only be called on successful results of protection_domain_alloc(). Also, the amd_iommu_domain_free() op is never called by the core code with a NULL argument, so remove all the NULL tests as well. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/10-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
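That is, the usual kernel unwind style (a schematic sketch with an assumed helper):

    static struct protection_domain *protection_domain_alloc(void)
    {
            struct protection_domain *domain;

            domain = kzalloc(sizeof(*domain), GFP_KERNEL);
            if (!domain)
                    return NULL;

            if (setup_pgtable(domain))  /* assumed helper */
                    goto out_free;      /* unwind internally on failure... */

            return domain;

    out_free:
            kfree(domain);
            return NULL;                /* ...so free() is only for successes */
    }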
2024-09-04iommu/amd: Narrow the use of struct protection_domain to invalidationJason Gunthorpe
The AMD io_pgtable stuff doesn't implement the tlb ops callbacks, instead it invokes the invalidation ops directly on the struct protection_domain. Narrow the use of struct protection_domain to only those few code paths. Make everything else properly use struct amd_io_pgtable through the call chains, which is the correct modular type for an io-pgtable module. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/9-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Store the nid in io_pgtable_cfg instead of the domainJason Gunthorpe
We already have memory in the union here that is being wasted in AMD's case, use it to store the nid. Putting the nid here further isolates the io_pgtable code from the struct protection_domain. Fix up protection_domain_alloc() so that the NID from the device is provided; at this point, dev is never NULL for AMD, so this will now allocate the first table pointer on the correct NUMA node. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Link: https://lore.kernel.org/r/8-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Remove amd_io_pgtable::pgtbl_cfgJason Gunthorpe
This struct is already in iop.cfg, we don't need two. AMD is using this API sort of wrong, the cfg is supposed to be passed in and then the allocation function will allocate ops memory and copy the passed config into the new memory. Keep it kind of wrong and pass in the cfg memory that is already part of the pagetable struct. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/7-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Rename struct amd_io_pgtable iopt to pgtblJason Gunthorpe
There is struct protection_domain iopt and struct amd_io_pgtable iopt. Next patches are going to want to write domain.iopt.iopt.xx which is quite unnatural to read. Give one of them a different name, amd_io_pgtable has fewer references so call it pgtbl, to match pgtbl_cfg, instead. Suggested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/6-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Remove the amd_iommu_domain_set_pt_root() and relatedJason Gunthorpe
Looks like many refactorings here have left this confused. There is only one storage of the root/mode, it is in the iop struct. increase_address_space() calls amd_iommu_domain_set_pgtable() with values that it already stored in iop a few lines above. amd_iommu_domain_clr_pt_root() is zeroing memory we are about to free. It used to protect against a double free of root, but that is gone now. Remove amd_iommu_domain_set_pgtable(), amd_iommu_domain_set_pt_root(), amd_iommu_domain_clr_pt_root() as they are all pointless. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/5-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2024-09-04iommu/amd: Remove amd_iommu_domain_update() from page table freeingJason Gunthorpe
It is a serious bug if the domain is still mapped to any DTEs when it is freed as we immediately start freeing page table memory, so any remaining HW touch will UAF. If it is not mapped then dev_list is empty and amd_iommu_domain_update() does nothing. Remove it and add a WARN_ON() to catch this class of bug. Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/4-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
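The added check is essentially of this form:

    /* a domain still attached to devices must never reach free: the HW
     * would keep walking page tables we are about to hand back
     */
    WARN_ON(!list_empty(&domain->dev_list));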
2024-09-04iommu/amd: Set the pgsize_bitmap correctlyJason Gunthorpe
When using io_pgtable the correct pgsize_bitmap is stored in the cfg, both v1_alloc_pgtable() and v2_alloc_pgtable() set it correctly. This fixes a bug where the v2 pgtable had the wrong pgsize as protection_domain_init_v2() would set it and then do_iommu_domain_alloc() immediately resets it. Remove the confusing ops.pgsize_bitmap since that is not used if the driver sets domain.pgsize_bitmap. Fixes: 134288158a41 ("iommu/amd: Add domain_alloc_user based domain allocation") Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/3-v2-831cdc4d00f3+1a315-amd_iopgtbl_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
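Schematically, the allocator-provided value is the one that must survive (field paths and the macro name are approximate):

    /* set once by the io_pgtable allocator (v1/v2_alloc_pgtable())... */
    cfg->pgsize_bitmap = AMD_IOMMU_PGSIZES_V2;  /* macro name assumed */

    /* ...and propagated to the domain, never overwritten afterwards */
    domain->domain.pgsize_bitmap = cfg->pgsize_bitmap;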