summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-05-16x86/sgx: Ensure no data in PCMD page after truncateReinette Chatre
A PCMD (Paging Crypto MetaData) page contains the PCMD structures of enclave pages that have been encrypted and moved to the shmem backing store. When all enclave pages sharing a PCMD page are loaded in the enclave, there is no need for the PCMD page and it can be truncated from the backing store. A few issues appeared around the truncation of PCMD pages. The known issues have been addressed but the PCMD handling code could be made more robust by loudly complaining if any new issue appears in this area. Add a check that will complain with a warning if the PCMD page is not actually empty after it has been truncated. There should never be data in the PCMD page at this point since it is was just checked to be empty and truncated with enclave mutex held and is updated with the enclave mutex held. Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Haitao Huang <haitao.huang@intel.com> Link: https://lkml.kernel.org/r/6495120fed43fafc1496d09dd23df922b9a32709.1652389823.git.reinette.chatre@intel.com
2022-05-16x86/sgx: Fix race between reclaimer and page fault handlerReinette Chatre
Haitao reported encountering a WARN triggered by the ENCLS[ELDU] instruction faulting with a #GP. The WARN is encountered when the reclaimer evicts a range of pages from the enclave when the same pages are faulted back right away. Consider two enclave pages (ENCLAVE_A and ENCLAVE_B) sharing a PCMD page (PCMD_AB). ENCLAVE_A is in the enclave memory and ENCLAVE_B is in the backing store. PCMD_AB contains just one entry, that of ENCLAVE_B. Scenario proceeds where ENCLAVE_A is being evicted from the enclave while ENCLAVE_B is faulted in. sgx_reclaim_pages() { ... /* * Reclaim ENCLAVE_A */ mutex_lock(&encl->lock); /* * Get a reference to ENCLAVE_A's * shmem page where enclave page * encrypted data will be stored * as well as a reference to the * enclave page's PCMD data page, * PCMD_AB. * Release mutex before writing * any data to the shmem pages. */ sgx_encl_get_backing(...); encl_page->desc |= SGX_ENCL_PAGE_BEING_RECLAIMED; mutex_unlock(&encl->lock); /* * Fault ENCLAVE_B */ sgx_vma_fault() { mutex_lock(&encl->lock); /* * Get reference to * ENCLAVE_B's shmem page * as well as PCMD_AB. */ sgx_encl_get_backing(...) /* * Load page back into * enclave via ELDU. */ /* * Release reference to * ENCLAVE_B' shmem page and * PCMD_AB. */ sgx_encl_put_backing(...); /* * PCMD_AB is found empty so * it and ENCLAVE_B's shmem page * are truncated. */ /* Truncate ENCLAVE_B backing page */ sgx_encl_truncate_backing_page(); /* Truncate PCMD_AB */ sgx_encl_truncate_backing_page(); mutex_unlock(&encl->lock); ... } mutex_lock(&encl->lock); encl_page->desc &= ~SGX_ENCL_PAGE_BEING_RECLAIMED; /* * Write encrypted contents of * ENCLAVE_A to ENCLAVE_A shmem * page and its PCMD data to * PCMD_AB. */ sgx_encl_put_backing(...) /* * Reference to PCMD_AB is * dropped and it is truncated. * ENCLAVE_A's PCMD data is lost. */ mutex_unlock(&encl->lock); } What happens next depends on whether it is ENCLAVE_A being faulted in or ENCLAVE_B being evicted - but both end up with ENCLS[ELDU] faulting with a #GP. If ENCLAVE_A is faulted then at the time sgx_encl_get_backing() is called a new PCMD page is allocated and providing the empty PCMD data for ENCLAVE_A would cause ENCLS[ELDU] to #GP If ENCLAVE_B is evicted first then a new PCMD_AB would be allocated by the reclaimer but later when ENCLAVE_A is faulted the ENCLS[ELDU] instruction would #GP during its checks of the PCMD value and the WARN would be encountered. Noting that the reclaimer sets SGX_ENCL_PAGE_BEING_RECLAIMED at the time it obtains a reference to the backing store pages of an enclave page it is in the process of reclaiming, fix the race by only truncating the PCMD page after ensuring that no page sharing the PCMD page is in the process of being reclaimed. Cc: stable@vger.kernel.org Fixes: 08999b2489b4 ("x86/sgx: Free backing memory after faulting the enclave page") Reported-by: Haitao Huang <haitao.huang@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Haitao Huang <haitao.huang@intel.com> Link: https://lkml.kernel.org/r/ed20a5db516aa813873268e125680041ae11dfcf.1652389823.git.reinette.chatre@intel.com
2022-05-16x86/sgx: Obtain backing storage page with enclave mutex heldReinette Chatre
Haitao reported encountering a WARN triggered by the ENCLS[ELDU] instruction faulting with a #GP. The WARN is encountered when the reclaimer evicts a range of pages from the enclave when the same pages are faulted back right away. The SGX backing storage is accessed on two paths: when there are insufficient free pages in the EPC the reclaimer works to move enclave pages to the backing storage and as enclaves access pages that have been moved to the backing storage they are retrieved from there as part of page fault handling. An oversubscribed SGX system will often run the reclaimer and page fault handler concurrently and needs to ensure that the backing store is accessed safely between the reclaimer and the page fault handler. This is not the case because the reclaimer accesses the backing store without the enclave mutex while the page fault handler accesses the backing store with the enclave mutex. Consider the scenario where a page is faulted while a page sharing a PCMD page with the faulted page is being reclaimed. The consequence is a race between the reclaimer and page fault handler, the reclaimer attempting to access a PCMD at the same time it is truncated by the page fault handler. This could result in lost PCMD data. Data may still be lost if the reclaimer wins the race, this is addressed in the following patch. The reclaimer accesses pages from the backing storage without holding the enclave mutex and runs the risk of concurrently accessing the backing storage with the page fault handler that does access the backing storage with the enclave mutex held. In the scenario below a PCMD page is truncated from the backing store after all its pages have been loaded in to the enclave at the same time the PCMD page is loaded from the backing store when one of its pages are reclaimed: sgx_reclaim_pages() { sgx_vma_fault() { ... mutex_lock(&encl->lock); ... __sgx_encl_eldu() { ... if (pcmd_page_empty) { /* * EPC page being reclaimed /* * shares a PCMD page with an * PCMD page truncated * enclave page that is being * while requested from * faulted in. * reclaimer. */ */ sgx_encl_get_backing() <----------> sgx_encl_truncate_backing_page() } mutex_unlock(&encl->lock); } } In this scenario there is a race between the reclaimer and the page fault handler when the reclaimer attempts to get access to the same PCMD page that is being truncated. This could result in the reclaimer writing to the PCMD page that is then truncated, causing the PCMD data to be lost, or in a new PCMD page being allocated. The lost PCMD data may still occur after protecting the backing store access with the mutex - this is fixed in the next patch. By ensuring the backing store is accessed with the mutex held the enclave page state can be made accurate with the SGX_ENCL_PAGE_BEING_RECLAIMED flag accurately reflecting that a page is in the process of being reclaimed. Consistently protect the reclaimer's backing store access with the enclave's mutex to ensure that it can safely run concurrently with the page fault handler. Cc: stable@vger.kernel.org Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer") Reported-by: Haitao Huang <haitao.huang@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Haitao Huang <haitao.huang@intel.com> Link: https://lkml.kernel.org/r/fa2e04c561a8555bfe1f4e7adc37d60efc77387b.1652389823.git.reinette.chatre@intel.com
2022-05-16x86/sgx: Mark PCMD page as dirty when modifying contentsReinette Chatre
Recent commit 08999b2489b4 ("x86/sgx: Free backing memory after faulting the enclave page") expanded __sgx_encl_eldu() to clear an enclave page's PCMD (Paging Crypto MetaData) from the PCMD page in the backing store after the enclave page is restored to the enclave. Since the PCMD page in the backing store is modified the page should be marked as dirty to ensure the modified data is retained. Cc: stable@vger.kernel.org Fixes: 08999b2489b4 ("x86/sgx: Free backing memory after faulting the enclave page") Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Haitao Huang <haitao.huang@intel.com> Link: https://lkml.kernel.org/r/00cd2ac480db01058d112e347b32599c1a806bc4.1652389823.git.reinette.chatre@intel.com
2022-05-16x86/sgx: Disconnect backing page references from dirty statusReinette Chatre
SGX uses shmem backing storage to store encrypted enclave pages and their crypto metadata when enclave pages are moved out of enclave memory. Two shmem backing storage pages are associated with each enclave page - one backing page to contain the encrypted enclave page data and one backing page (shared by a few enclave pages) to contain the crypto metadata used by the processor to verify the enclave page when it is loaded back into the enclave. sgx_encl_put_backing() is used to release references to the backing storage and, optionally, mark both backing store pages as dirty. Managing references and dirty status together in this way results in both backing store pages marked as dirty, even if only one of the backing store pages are changed. Additionally, waiting until the page reference is dropped to set the page dirty risks a race with the page fault handler that may load outdated data into the enclave when a page is faulted right after it is reclaimed. Consider what happens if the reclaimer writes a page to the backing store and the page is immediately faulted back, before the reclaimer is able to set the dirty bit of the page: sgx_reclaim_pages() { sgx_vma_fault() { ... sgx_encl_get_backing(); ... ... sgx_reclaimer_write() { mutex_lock(&encl->lock); /* Write data to backing store */ mutex_unlock(&encl->lock); } mutex_lock(&encl->lock); __sgx_encl_eldu() { ... /* * Enclave backing store * page not released * nor marked dirty - * contents may not be * up to date. */ sgx_encl_get_backing(); ... /* * Enclave data restored * from backing store * and PCMD pages that * are not up to date. * ENCLS[ELDU] faults * because of MAC or PCMD * checking failure. */ sgx_encl_put_backing(); } ... /* set page dirty */ sgx_encl_put_backing(); ... mutex_unlock(&encl->lock); } } Remove the option to sgx_encl_put_backing() to set the backing pages as dirty and set the needed pages as dirty right after receiving important data while enclave mutex is held. This ensures that the page fault handler can get up to date data from a page and prepares the code for a following change where only one of the backing pages need to be marked as dirty. Cc: stable@vger.kernel.org Fixes: 1728ab54b4be ("x86/sgx: Add a page reclaimer") Suggested-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Tested-by: Haitao Huang <haitao.huang@intel.com> Link: https://lore.kernel.org/linux-sgx/8922e48f-6646-c7cc-6393-7c78dcf23d23@intel.com/ Link: https://lkml.kernel.org/r/fa9f98986923f43e72ef4c6702a50b2a0b3c42e3.1652389823.git.reinette.chatre@intel.com
2022-05-16integrity: Fix sparse warnings in keyring_handlerStefan Berger
Fix the following sparse warnings: CHECK security/integrity/platform_certs/keyring_handler.c security/integrity/platform_certs/keyring_handler.c:76:16: warning: Using plain integer as NULL pointer security/integrity/platform_certs/keyring_handler.c:91:16: warning: Using plain integer as NULL pointer security/integrity/platform_certs/keyring_handler.c:106:16: warning: Using plain integer as NULL pointer Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2022-05-16Revert "PCI: aardvark: Rewrite IRQ code to chained IRQ handler"Pali Rohár
This reverts commit 1571d67dc190e50c6c56e8f88cdc39f7cc53166e. This commit broke support for setting interrupt affinity. It looks like that it is related to the chained IRQ handler. Revert this commit until issue with setting interrupt affinity is fixed. Fixes: 1571d67dc190 ("PCI: aardvark: Rewrite IRQ code to chained IRQ handler") Link: https://lore.kernel.org/r/20220515125815.30157-1-pali@kernel.org Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-05-16ptp: ocp: have adjtime handle negative delta_ns correctlyJonathan Lemon
delta_ns is a s64, but it was being passed ptp_ocp_adjtime_coarse as an u64. Also, it turns out that timespec64_add_ns() only handles positive values, so perform the math with set_normalized_timespec(). Fixes: 90f8f4c0e3ce ("ptp: ocp: Add ptp_ocp_adjtime_coarse for large adjustments") Suggested-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com> Acked-by: Vadim Fedorenko <vfedorenko@novek.ru> Link: https://lore.kernel.org/r/20220513225231.1412-1-jonathan.lemon@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-16arm64: kdump: Do not allocate crash low memory if not neededZhen Lei
When "crashkernel=X,high" is specified, the specified "crashkernel=Y,low" memory is not required in the following corner cases: 1. If both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, it means that the devices can access any memory. 2. If the system memory is small, the crash high memory may be allocated from the DMA zones. If that happens, there's no need to allocate another crash low memory because there's already one. Add condition '(crash_base >= CRASH_ADDR_LOW_MAX)' to determine whether the 'high' memory is allocated above DMA zones. Note: when both CONFIG_ZONE_DMA and CONFIG_ZONE_DMA32 are disabled, the entire physical memory is DMA accessible, CRASH_ADDR_LOW_MAX equals 'PHYS_MASK + 1'. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Acked-by: Baoquan He <bhe@redhat.com> Link: https://lore.kernel.org/r/20220511032033.426-1-thunder.leizhen@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16arm64/sve: Generate ZCR definitionsMark Brown
Convert the various ZCR instances to automatic generation, no functional changes expected. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220510161208.631259-13-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16arm64/sme: Generate defintions for SVCRMark Brown
Convert SVCR to automatic generation, no functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220510161208.631259-12-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16arm64/sme: Generate SMPRI_EL1 definitionsMark Brown
Convert SMPRI_EL1 to be generated. No functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220510161208.631259-11-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16arm64/sme: Automatically generate SMPRIMAP_EL2 definitionsMark Brown
No functional change should be seen from converting SMPRIMAP_EL2 to be generated. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220510161208.631259-10-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16arm64/sme: Automatically generate SMIDR_EL1 definesMark Brown
Automatically generate the defines for SMIDR_EL1, no functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220510161208.631259-9-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16arm64/sme: Automatically generate defines for SMCRMark Brown
Convert SMCR to use the register definition code, no functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220510161208.631259-8-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16arm64/sysreg: Support generation of RAZ fieldsMark Brown
Add a statement for RAZ bitfields to the automatic register generation script. Nothing is emitted to the header for these fields. Signed-off-by: Mark Brown <broonie@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220510161208.631259-7-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16arm64/sme: Remove _EL0 from name of SVCR - FIXME sysreg.hMark Brown
The defines for SVCR call it SVCR_EL0 however the architecture calls the register SVCR with no _EL0 suffix. In preparation for generating the sysreg definitions rename to match the architecture, no functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220510161208.631259-6-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16arm64/sme: Standardise bitfield names for SVCRMark Brown
The bitfield definitions for SVCR have a SYS_ added to the names of the constant which will be a problem for automatic generation. Remove the prefixes, no functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220510161208.631259-5-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16arm64/sme: Drop SYS_ from SMIDR_EL1 definesMark Brown
We currently have a non-standard SYS_ prefix in the constants generated for SMIDR_EL1 bitfields. Drop this in preparation for automatic register definition generation, no functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220510161208.631259-4-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16arm64/fp: Rename SVE and SME LEN field name to _WIDTHMark Brown
The SVE and SVE length configuration field LEN have constants specifying their width called _SIZE rather than the more normal _WIDTH, in preparation for automatic generation rename to _WIDTH. No functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220510161208.631259-3-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16arm64/fp: Make SVE and SME length register definition match architectureMark Brown
Currently (as of DDI0487H.a) the architecture defines the vector length control field in ZCR and SMCR as being 4 bits wide with an additional 5 bits reserved above it marked as RAZ/WI for future expansion. The kernel currently attempts to anticipate such expansion by treating these extra bits as part of the LEN field but this will be inconvenient when we start generating the defines and would cause problems in the event that the architecture goes a different direction with these fields. Let's instead change the defines to reflect the currently defined architecture, we can update in future as needed. No change in behaviour should be seen in any system, even emulated systems using the maximum allowed vector length for the current architecture. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220510161208.631259-2-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16Merge branch 'for-next/sme' into for-next/sysreg-genCatalin Marinas
* for-next/sme: (29 commits) : Scalable Matrix Extensions support. arm64/sve: Make kernel FPU protection RT friendly arm64/sve: Delay freeing memory in fpsimd_flush_thread() arm64/sme: More sensibly define the size for the ZA register set arm64/sme: Fix NULL check after kzalloc arm64/sme: Add ID_AA64SMFR0_EL1 to __read_sysreg_by_encoding() arm64/sme: Provide Kconfig for SME KVM: arm64: Handle SME host state when running guests KVM: arm64: Trap SME usage in guest KVM: arm64: Hide SME system registers from guests arm64/sme: Save and restore streaming mode over EFI runtime calls arm64/sme: Disable streaming mode and ZA when flushing CPU state arm64/sme: Add ptrace support for ZA arm64/sme: Implement ptrace support for streaming mode SVE registers arm64/sme: Implement ZA signal handling arm64/sme: Implement streaming SVE signal handling arm64/sme: Disable ZA and streaming mode when handling signals arm64/sme: Implement traps and syscall handling for SME arm64/sme: Implement ZA context switching arm64/sme: Implement streaming SVE context switching arm64/sme: Implement SVCR context switching ...
2022-05-16kselftest/arm64: Explicitly build no BTI tests with BTI disabledMark Brown
In case a distribution enables branch protection by default do as we do for the main kernel and explicitly disable branch protection when building the test case for having BTI disabled to ensure it doesn't get turned on by the toolchain defaults. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220516182213.727589-1-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16spi: spi-mem: Convert Aspeed SMC driver to spi-memMark Brown
Merge series from Cédric Le Goater <clg@kaod.org>: This series adds a new SPI driver using the spi-mem interface for the Aspeed static memory controllers of the AST2600, AST2500 and AST2400 SoCs. * AST2600 Firmware SPI Memory Controller (FMC) * AST2600 SPI Flash Controller (SPI1 and SPI2) * AST2500 Firmware SPI Memory Controller (FMC) * AST2500 SPI Flash Controller (SPI1 and SPI2) * AST2400 New Static Memory Controller (also referred as FMC) * AST2400 SPI Flash Controller (SPI) It is based on the current OpenBMC kernel driver [1], using directly the MTD SPI-NOR interface and on a patchset [2] previously proposed adding support for the AST2600 only. This driver takes a slightly different approach to cover all 6 controllers. It does not make use of the controller register disabling Address and Data byte lanes because is not available on the AST2400 SoC. We could introduce a specific handler for new features available on recent SoCs if needed. As there is not much difference on performance, the driver chooses the common denominator: "User mode" which has been heavily tested in [1]. "User mode" is also used as a fall back method when flash device mapping window is too small. Problems to address with spi-mem were the configuration of the mapping windows and the calibration of the read timings. The driver handles them in the direct mapping handler when some knowledge on the size of the flash device is know. It is not perfect but not incorrect either. The algorithm is one from [1] because it doesn't require the DMA registers which are not available on all controllers. Direct mapping for writes is not supported (yet). I have seen some corruption with writes and I preferred to use the safer and proven method of the initial driver [1]. We can improve that later. The driver supports Quad SPI RX transfers on the AST2600 SoC but it didn't have the expected results. Therefore it is not activated yet. There are some issues on the pinctrl to investigate first. Tested on: * OpenPOWER Palmetto (AST2400) * Facebook Wedge 100 BMC (AST2400) by Tao Ren <rentao.bupt@gmail.com> * Evaluation board (AST2500) * Inspur FP5280G2 BMC (AST2500) by John Wang <wangzq.jn@gmail.com> * Facebook Backpack CMM BMC (AST2500) by Tao Ren <rentao.bupt@gmail.com> * OpenPOWER Witherspoon (AST2500) * Evaluation board (AST2600 A0 and A3) * Rainier board (AST2600) [1] https://github.com/openbmc/linux/blob/dev-5.15/drivers/mtd/spi-nor/controllers/aspeed-smc.c [2] https://patchwork.ozlabs.org/project/linux-aspeed/list/?series=212394
2022-05-16arm64/sve: Make kernel FPU protection RT friendlySebastian Andrzej Siewior
Non RT kernels need to protect FPU against preemption and bottom half processing. This is achieved by disabling bottom halves via local_bh_disable() which implictly disables preemption. On RT kernels this protection mechanism is not sufficient because local_bh_disable() does not disable preemption. It serializes bottom half related processing via a CPU local lock. As bottom halves are running always in thread context on RT kernels disabling preemption is the proper choice as it implicitly prevents bottom half processing. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220505163207.85751-3-bigeasy@linutronix.de Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16arm64/sve: Delay freeing memory in fpsimd_flush_thread()Sebastian Andrzej Siewior
fpsimd_flush_thread() invokes kfree() via sve_free()+sme_free() within a preempt disabled section which is not working on -RT. Delay freeing of memory until preemption is enabled again. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220505163207.85751-2-bigeasy@linutronix.de Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16arm64: mm: Make arch_faults_on_old_pte() check for migratabilityValentin Schneider
arch_faults_on_old_pte() relies on the calling context being non-preemptible. CONFIG_PREEMPT_RT turns the PTE lock into a sleepable spinlock, which doesn't disable preemption once acquired, triggering the warning in arch_faults_on_old_pte(). It does however disable migration, ensuring the task remains on the same CPU during the entirety of the critical section, making the read of cpu_has_hw_af() safe and stable. Make arch_faults_on_old_pte() check cant_migrate() instead of preemptible(). Cc: Valentin Schneider <vschneid@redhat.com> Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Valentin Schneider <valentin.schneider@arm.com> Link: https://lore.kernel.org/r/20220127192437.1192957-1-valentin.schneider@arm.com Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220505163207.85751-4-bigeasy@linutronix.de Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16arm64: mte: Clean up user tag accessorsRobin Murphy
Invoking user_ldst to explicitly add a post-increment of 0 is silly. Just use a normal USER() annotation and save the redundant instruction. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Tong Tiangen <tongtiangen@huawei.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20220420030418.3189040-6-tongtiangen@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16kselftest/arm64: bti: force static linkingAndre Przywara
The "bti" selftests are built with -nostdlib, which apparently automatically creates a statically linked binary, which is what we want and need for BTI (to avoid interactions with the dynamic linker). However this is not true when building a PIE binary, which some toolchains (Ubuntu) configure as the default. When compiling btitest with such a toolchain, it will create a dynamically linked binary, which will probably fail some tests, as the dynamic linker might not support BTI: =================== TAP version 13 1..18 not ok 1 nohint_func/call_using_br_x0 not ok 2 nohint_func/call_using_br_x16 not ok 3 nohint_func/call_using_blr .... =================== To make sure we create static binaries, add an explicit -static on the linker command line. This forces static linking even if the toolchain defaults to PIE builds, and fixes btitest runs on BTI enabled machines. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Fixes: 314bcbf09f14 ("kselftest: arm64: Add BTI tests") Link: https://lore.kernel.org/r/20220511172129.2078337-1-andre.przywara@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-16block, bfq: make bfq_has_work() more accurateYu Kuai
bfq_has_work() is using busy_queues currently, which is not accurate because bfq_queue is busy doesn't represent that it has requests. Since bfqd aready has a counter 'queued' to record how many requests are in bfq, use it instead of busy_queues. Noted that bfq_has_work() can be called with 'bfqd->lock' held, thus the lock can't be held in bfq_has_work() to protect 'bfqd->queued'. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20220513023507.2625717-3-yukuai3@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-16block, bfq: protect 'bfqd->queued' by 'bfqd->lock'Yu Kuai
If bfq_schedule_dispatch() is called from bfq_idle_slice_timer_body(), then 'bfqd->queued' is read without holding 'bfqd->lock'. This is wrong since it can be wrote concurrently. Fix the problem by holding 'bfqd->lock' in such case. Signed-off-by: Yu Kuai <yukuai3@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Link: https://lore.kernel.org/r/20220513023507.2625717-2-yukuai3@huawei.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-16block: cleanup the VM accounting in submit_bioChristoph Hellwig
submit_bio uses some extremely convoluted checks and confusing comments to only account REQ_OP_READ/REQ_OP_WRITE comments. Just switch to the plain obvious checks instead. Signed-off-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20220516063654.2782792-1-hch@lst.de [axboe: fixup WRITE -> REQ_OP_WRITE] Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-16spi: spi-au1550: replace ternary operator with min()Guo Zhengkui
Fix the following coccicheck warnings: drivers/spi/spi-au1550.c:408:21-22: WARNING opportunity for min() drivers/spi/spi-au1550.c:542:21-22: WARNING opportunity for min() min() macro is defined in include/linux/minmax.h. It avoids multiple evaluations of the arguments when non-constant and performs strict type-checking. Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com> Link: https://lore.kernel.org/r/20220513130333.58379-1-guozhengkui@vivo.com Signed-off-by: Mark Brown <broonie@kernel.org>
2022-05-16dt-bindings: mtd: partitions: Extend fixed-partitions bindingMikhail Zhilkin
Extend fixed-partitions binding for support of Sercomm partition parser (add "sercomm,sc-partitions" compatible). Signed-off-by: Mikhail Zhilkin <csharper2005@gmail.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220516151725.885427-1-csharper2005@gmail.com
2022-05-16dt-bindings: Add Sercomm (Suzhou) Corporation vendor prefixMikhail Zhilkin
Add "sercomm" vendor prefix for "Sercomm (Suzhou) Corporation". Company website: Link: https://www.sercomm.com/ Signed-off-by: Mikhail Zhilkin <csharper2005@gmail.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220516151637.885324-1-csharper2005@gmail.com
2022-05-16mtd: phram: Allow cached mappingsVincent Whitchurch
Currently phram always uses ioremap(), but this is unnecessary when normal memory is used. If the reserved-memory node does not specify the no-map property, indicating it should be mapped as system RAM and ioremap() cannot be used on it, use a cached mapping using memremap(MEMREMAP_WB) instead. On one of my systems this improves read performance by ~70%. (Note that this driver has always used normal memcpy/memset functions on memory obtained from ioremap(), which sparse doesn't like. There is no memremap() variant which maps exactly to ioremap() on all architectures, so that behaviour of the driver is not changed to avoid affecting existing users, but the sparse warnings are suppressed in the moved code with __force.) Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220510151822.1809278-1-vincent.whitchurch@axis.com
2022-05-16mtd: call of_platform_populate() for MTD partitionsRafał Miłecki
Until this change MTD subsystem supported handling partitions only with MTD partitions parsers. That's a specific / limited API designed around partitions. Some MTD partitions may however require different handling. They may contain specific data that needs to be parsed and somehow extracted. For that purpose MTD subsystem should allow binding of standard platform drivers. An example can be U-Boot (sub)partition with environment variables. There exist a "u-boot,env" DT binding for MTD (sub)partition that requires an NVMEM driver. Ref: 5db1c2dbc04c ("dt-bindings: nvmem: add U-Boot environment variables binding") Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220510131259.555-1-zajec5@gmail.com
2022-05-16mtd: rawnand: renesas: Use runtime PM instead of the raw clock APIMiquel Raynal
This NAND controller is part of a well defined power domain handled by the runtime PM core. Let's keep the harmony with the other RZ/N1 drivers and exclusively use the runtime PM API to enable/disable the clocks. We still need to retrieve the external clock rate in order to derive the NAND timings, but that is not a big deal, we can still do that in the probe and just save this value to reuse it later. Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://lore.kernel.org/linux-mtd/20220513104957.257721-3-miquel.raynal@bootlin.com
2022-05-16dt-bindings: mtd: renesas: Fix the NAND controller descriptionMiquel Raynal
Add the missing power-domain property which is needed on all the RZ/N1 SoC IPs. Suggested-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/linux-mtd/20220513104957.257721-2-miquel.raynal@bootlin.com
2022-05-16mtd: rawnand: mpc5121: Check before clk_disable_unprepare() not neededPhil Edworthy
All code in clk_disable_unprepare() already checks the clk ptr using IS_ERR_OR_NULL so there is no need to check it again before calling it. A lot of other drivers already rely on this behaviour, so it's safe to do so here. Signed-off-by: Phil Edworthy <phil.edworthy@renesas.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220512185033.46901-1-phil.edworthy@renesas.com
2022-05-16mtd: rawnand: rockchip: Check before clk_disable_unprepare() not neededPhil Edworthy
All code in clk_disable_unprepare() already checks the clk ptr using IS_ERR_OR_NULL so there is no need to check it again before calling it. A lot of other drivers already rely on this behaviour, so it's safe to do so here. Signed-off-by: Phil Edworthy <phil.edworthy@renesas.com> Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/linux-mtd/20220512184558.45966-1-phil.edworthy@renesas.com
2022-05-16mmc: sdhci-of-arasan: Add NULL check for data fieldSai Krishna Potthuri
Add NULL check for data field retrieved from of_device_get_match_data() before dereferencing the data. Addresses-coverity: CID 305057:Dereference null return value (NULL_RETURNS) Signed-off-by: Sai Krishna Potthuri <lakshmi.sai.krishna.potthuri@xilinx.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/1652339993-27280-1-git-send-email-lakshmi.sai.krishna.potthuri@xilinx.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-05-16nbd: Fix hung on disconnect request if socket is closed beforeXie Yongji
When userspace closes the socket before sending a disconnect request, the following I/O requests will be blocked in wait_for_reconnect() until dead timeout. This will cause the following disconnect request also hung on blk_mq_quiesce_queue(). That means we have no way to disconnect a nbd device if there are some I/O requests waiting for reconnecting until dead timeout. It's not expected. So let's wake up the thread waiting for reconnecting directly when a disconnect request is sent. Reported-by: Xu Jianhai <zero.xu@bytedance.com> Signed-off-by: Xie Yongji <xieyongji@bytedance.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Link: https://lore.kernel.org/r/20220322080639.142-1-xieyongji@bytedance.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-16evm: Clean up some variablesStefan Berger
Make hmac_tfm static since it's not used anywhere else besides the file it is in. Remove declaration of hash_tfm since it doesn't exist. Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2022-05-16evm: Return INTEGRITY_PASS for enum integrity_status value '0'Stefan Berger
Return INTEGRITY_PASS for the enum integrity_status rather than 0. Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2022-05-16mtd: spi-nor: aspeed: set the decoding size to at least 2MB for AST2600Potin Lai
In AST2600, the unit of SPI CEx decoding range register is 1MB, and end address offset is set to the acctual offset - 1MB. If the flash only has 1MB, the end address will has same value as start address, which will causing unexpected errors. This patch set the decoding size to at least 2MB to avoid decoding errors. Tested: root@bletchley:~# dmesg | grep "aspeed-smc 1e631000.spi: CE0 window" [ 59.328134] aspeed-smc 1e631000.spi: CE0 window resized to 2MB (AST2600 Decoding) [ 59.343001] aspeed-smc 1e631000.spi: CE0 window [ 0x50000000 - 0x50200000 ] 2MB root@bletchley:~# devmem 0x1e631030 0x00100000 Tested-by: Jae Hyun Yoo <quic_jaehyoo@quicinc.com> Signed-off-by: Potin Lai <potin.lai@quantatw.com> [ clg : Ported on new spi-mem driver ] Signed-off-by: Cédric Le Goater <clg@kaod.org> Link: https://lore.kernel.org/r/20220509175616.1089346-12-clg@kaod.org Signed-off-by: Mark Brown <broonie@kernel.org>
2022-05-16spi: aspeed: Calibrate read timingsCédric Le Goater
To accommodate the different response time of SPI transfers on different boards and different SPI NOR devices, the Aspeed controllers provide a set of Read Timing Compensation registers to tune the timing delays depending on the frequency being used. The AST2600 SoC has one of these registers per device. On the AST2500 and AST2400 SoCs, the timing register is shared by all devices which is problematic to get good results other than for one device. The algorithm first reads a golden buffer at low speed and then performs reads with different clocks and delay cycle settings to find a breaking point. This selects a default good frequency for the CEx control register. The current settings are a bit optimistic as we pick the first delay giving good results. A safer approach would be to determine an interval and choose the middle value. Calibration is performed when the direct mapping for reads is created. Since the underlying spi-nor object needs to be initialized to create the spi_mem operation for direct mapping, we should be fine. Having a specific API would clarify the requirements though. Cc: Pratyush Yadav <p.yadav@ti.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Tested-by: Joel Stanley <joel@jms.id.au> Tested-by: Tao Ren <rentao.bupt@gmail.com> Tested-by: Jae Hyun Yoo <quic_jaehyoo@quicinc.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Link: https://lore.kernel.org/r/20220509175616.1089346-9-clg@kaod.org Signed-off-by: Mark Brown <broonie@kernel.org>
2022-05-16spi: aspeed: Add support for the AST2400 SPI controllerCédric Le Goater
Extend the driver for the AST2400 SPI Flash Controller (SPI). This controller has a slightly different interface which requires adaptation of the 4B handling. Summary of features : . host Firmware . 1 chip select pin (CE0) . slightly different register set, between AST2500 and the legacy controller . no segment registers . single, dual mode. Reviewed-by: Joel Stanley <joel@jms.id.au> Tested-by: Joel Stanley <joel@jms.id.au> Tested-by: Tao Ren <rentao.bupt@gmail.com> Tested-by: Jae Hyun Yoo <quic_jaehyoo@quicinc.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Link: https://lore.kernel.org/r/20220509175616.1089346-8-clg@kaod.org Signed-off-by: Mark Brown <broonie@kernel.org>
2022-05-16spi: aspeed: Workaround AST2500 limitationsCédric Le Goater
It is not possible to configure a full 128MB window for a chip of the same size on the AST2500 SPI controller. For this case, the maximum window size is restricted to 120MB for CE0. Reviewed-by: Joel Stanley <joel@jms.id.au> Tested-by: Joel Stanley <joel@jms.id.au> Tested-by: Tao Ren <rentao.bupt@gmail.com> Tested-by: Jae Hyun Yoo <quic_jaehyoo@quicinc.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Link: https://lore.kernel.org/r/20220509175616.1089346-7-clg@kaod.org Signed-off-by: Mark Brown <broonie@kernel.org>
2022-05-16spi: aspeed: Adjust direct mapping to device sizeCédric Le Goater
The segment registers of the FMC/SPI controllers provide a way to configure the mapping window of the flash device contents on the AHB bus. Adjust this window to the size of the spi-mem mapping. Things get more complex with multiple devices. The driver needs to also adjust the window of the next device to make sure that there is no overlap, even if there is no available device. The proposal below is not perfect but it is covering all the cases we have seen on different boards with one and two devices on the same bus. Reviewed-by: Joel Stanley <joel@jms.id.au> Tested-by: Joel Stanley <joel@jms.id.au> Tested-by: Tao Ren <rentao.bupt@gmail.com> Tested-by: Jae Hyun Yoo <quic_jaehyoo@quicinc.com> Signed-off-by: Cédric Le Goater <clg@kaod.org> Link: https://lore.kernel.org/r/20220509175616.1089346-6-clg@kaod.org Signed-off-by: Mark Brown <broonie@kernel.org>