diff options
Diffstat (limited to 'Documentation/arch/powerpc')
-rw-r--r-- | Documentation/arch/powerpc/booting.rst | 4 | ||||
-rw-r--r-- | Documentation/arch/powerpc/cpu_families.rst | 18 | ||||
-rw-r--r-- | Documentation/arch/powerpc/cxl.rst | 469 | ||||
-rw-r--r-- | Documentation/arch/powerpc/cxlflash.rst | 433 | ||||
-rw-r--r-- | Documentation/arch/powerpc/dexcr.rst | 141 | ||||
-rw-r--r-- | Documentation/arch/powerpc/elf_hwcaps.rst | 1 | ||||
-rw-r--r-- | Documentation/arch/powerpc/firmware-assisted-dump.rst | 113 | ||||
-rw-r--r-- | Documentation/arch/powerpc/htm.rst | 104 | ||||
-rw-r--r-- | Documentation/arch/powerpc/index.rst | 2 | ||||
-rw-r--r-- | Documentation/arch/powerpc/kvm-nested.rst | 44 | ||||
-rw-r--r-- | Documentation/arch/powerpc/papr_hcalls.rst | 11 | ||||
-rw-r--r-- | Documentation/arch/powerpc/ultravisor.rst | 2 |
12 files changed, 355 insertions, 987 deletions
diff --git a/Documentation/arch/powerpc/booting.rst b/Documentation/arch/powerpc/booting.rst index 11aa440f98cc..472e97891aef 100644 --- a/Documentation/arch/powerpc/booting.rst +++ b/Documentation/arch/powerpc/booting.rst @@ -93,8 +93,8 @@ given platform based on the content of the device-tree. Thus, you should: a) add your platform support as a _boolean_ option in - arch/powerpc/Kconfig, following the example of PPC_PSERIES, - PPC_PMAC and PPC_MAPLE. The latter is probably a good + arch/powerpc/Kconfig, following the example of PPC_PSERIES + and PPC_PMAC. The latter is probably a good example of a board support to start from. b) create your main platform file as diff --git a/Documentation/arch/powerpc/cpu_families.rst b/Documentation/arch/powerpc/cpu_families.rst index eb7e60649b43..f55433c6b8f3 100644 --- a/Documentation/arch/powerpc/cpu_families.rst +++ b/Documentation/arch/powerpc/cpu_families.rst @@ -128,24 +128,6 @@ IBM BookE - All 32 bit:: +--------------+ - | 401 | - +--------------+ - | - | - v - +--------------+ - | 403 | - +--------------+ - | - | - v - +--------------+ - | 405 | - +--------------+ - | - | - v - +--------------+ | 440 | +--------------+ | diff --git a/Documentation/arch/powerpc/cxl.rst b/Documentation/arch/powerpc/cxl.rst deleted file mode 100644 index d2d77057610e..000000000000 --- a/Documentation/arch/powerpc/cxl.rst +++ /dev/null @@ -1,469 +0,0 @@ -==================================== -Coherent Accelerator Interface (CXL) -==================================== - -Introduction -============ - - The coherent accelerator interface is designed to allow the - coherent connection of accelerators (FPGAs and other devices) to a - POWER system. These devices need to adhere to the Coherent - Accelerator Interface Architecture (CAIA). - - IBM refers to this as the Coherent Accelerator Processor Interface - or CAPI. In the kernel it's referred to by the name CXL to avoid - confusion with the ISDN CAPI subsystem. - - Coherent in this context means that the accelerator and CPUs can - both access system memory directly and with the same effective - addresses. - - -Hardware overview -================= - - :: - - POWER8/9 FPGA - +----------+ +---------+ - | | | | - | CPU | | AFU | - | | | | - | | | | - | | | | - +----------+ +---------+ - | PHB | | | - | +------+ | PSL | - | | CAPP |<------>| | - +---+------+ PCIE +---------+ - - The POWER8/9 chip has a Coherently Attached Processor Proxy (CAPP) - unit which is part of the PCIe Host Bridge (PHB). This is managed - by Linux by calls into OPAL. Linux doesn't directly program the - CAPP. - - The FPGA (or coherently attached device) consists of two parts. - The POWER Service Layer (PSL) and the Accelerator Function Unit - (AFU). The AFU is used to implement specific functionality behind - the PSL. The PSL, among other things, provides memory address - translation services to allow each AFU direct access to userspace - memory. - - The AFU is the core part of the accelerator (eg. the compression, - crypto etc function). The kernel has no knowledge of the function - of the AFU. Only userspace interacts directly with the AFU. - - The PSL provides the translation and interrupt services that the - AFU needs. This is what the kernel interacts with. For example, if - the AFU needs to read a particular effective address, it sends - that address to the PSL, the PSL then translates it, fetches the - data from memory and returns it to the AFU. If the PSL has a - translation miss, it interrupts the kernel and the kernel services - the fault. The context to which this fault is serviced is based on - who owns that acceleration function. - - - POWER8 and PSL Version 8 are compliant to the CAIA Version 1.0. - - POWER9 and PSL Version 9 are compliant to the CAIA Version 2.0. - - This PSL Version 9 provides new features such as: - - * Interaction with the nest MMU on the P9 chip. - * Native DMA support. - * Supports sending ASB_Notify messages for host thread wakeup. - * Supports Atomic operations. - * etc. - - Cards with a PSL9 won't work on a POWER8 system and cards with a - PSL8 won't work on a POWER9 system. - -AFU Modes -========= - - There are two programming modes supported by the AFU. Dedicated - and AFU directed. AFU may support one or both modes. - - When using dedicated mode only one MMU context is supported. In - this mode, only one userspace process can use the accelerator at - time. - - When using AFU directed mode, up to 16K simultaneous contexts can - be supported. This means up to 16K simultaneous userspace - applications may use the accelerator (although specific AFUs may - support fewer). In this mode, the AFU sends a 16 bit context ID - with each of its requests. This tells the PSL which context is - associated with each operation. If the PSL can't translate an - operation, the ID can also be accessed by the kernel so it can - determine the userspace context associated with an operation. - - -MMIO space -========== - - A portion of the accelerator MMIO space can be directly mapped - from the AFU to userspace. Either the whole space can be mapped or - just a per context portion. The hardware is self describing, hence - the kernel can determine the offset and size of the per context - portion. - - -Interrupts -========== - - AFUs may generate interrupts that are destined for userspace. These - are received by the kernel as hardware interrupts and passed onto - userspace by a read syscall documented below. - - Data storage faults and error interrupts are handled by the kernel - driver. - - -Work Element Descriptor (WED) -============================= - - The WED is a 64-bit parameter passed to the AFU when a context is - started. Its format is up to the AFU hence the kernel has no - knowledge of what it represents. Typically it will be the - effective address of a work queue or status block where the AFU - and userspace can share control and status information. - - - - -User API -======== - -1. AFU character devices -^^^^^^^^^^^^^^^^^^^^^^^^ - - For AFUs operating in AFU directed mode, two character device - files will be created. /dev/cxl/afu0.0m will correspond to a - master context and /dev/cxl/afu0.0s will correspond to a slave - context. Master contexts have access to the full MMIO space an - AFU provides. Slave contexts have access to only the per process - MMIO space an AFU provides. - - For AFUs operating in dedicated process mode, the driver will - only create a single character device per AFU called - /dev/cxl/afu0.0d. This will have access to the entire MMIO space - that the AFU provides (like master contexts in AFU directed). - - The types described below are defined in include/uapi/misc/cxl.h - - The following file operations are supported on both slave and - master devices. - - A userspace library libcxl is available here: - - https://github.com/ibm-capi/libcxl - - This provides a C interface to this kernel API. - -open ----- - - Opens the device and allocates a file descriptor to be used with - the rest of the API. - - A dedicated mode AFU only has one context and only allows the - device to be opened once. - - An AFU directed mode AFU can have many contexts, the device can be - opened once for each context that is available. - - When all available contexts are allocated the open call will fail - and return -ENOSPC. - - Note: - IRQs need to be allocated for each context, which may limit - the number of contexts that can be created, and therefore - how many times the device can be opened. The POWER8 CAPP - supports 2040 IRQs and 3 are used by the kernel, so 2037 are - left. If 1 IRQ is needed per context, then only 2037 - contexts can be allocated. If 4 IRQs are needed per context, - then only 2037/4 = 509 contexts can be allocated. - - -ioctl ------ - - CXL_IOCTL_START_WORK: - Starts the AFU context and associates it with the current - process. Once this ioctl is successfully executed, all memory - mapped into this process is accessible to this AFU context - using the same effective addresses. No additional calls are - required to map/unmap memory. The AFU memory context will be - updated as userspace allocates and frees memory. This ioctl - returns once the AFU context is started. - - Takes a pointer to a struct cxl_ioctl_start_work - - :: - - struct cxl_ioctl_start_work { - __u64 flags; - __u64 work_element_descriptor; - __u64 amr; - __s16 num_interrupts; - __s16 reserved1; - __s32 reserved2; - __u64 reserved3; - __u64 reserved4; - __u64 reserved5; - __u64 reserved6; - }; - - flags: - Indicates which optional fields in the structure are - valid. - - work_element_descriptor: - The Work Element Descriptor (WED) is a 64-bit argument - defined by the AFU. Typically this is an effective - address pointing to an AFU specific structure - describing what work to perform. - - amr: - Authority Mask Register (AMR), same as the powerpc - AMR. This field is only used by the kernel when the - corresponding CXL_START_WORK_AMR value is specified in - flags. If not specified the kernel will use a default - value of 0. - - num_interrupts: - Number of userspace interrupts to request. This field - is only used by the kernel when the corresponding - CXL_START_WORK_NUM_IRQS value is specified in flags. - If not specified the minimum number required by the - AFU will be allocated. The min and max number can be - obtained from sysfs. - - reserved fields: - For ABI padding and future extensions - - CXL_IOCTL_GET_PROCESS_ELEMENT: - Get the current context id, also known as the process element. - The value is returned from the kernel as a __u32. - - -mmap ----- - - An AFU may have an MMIO space to facilitate communication with the - AFU. If it does, the MMIO space can be accessed via mmap. The size - and contents of this area are specific to the particular AFU. The - size can be discovered via sysfs. - - In AFU directed mode, master contexts are allowed to map all of - the MMIO space and slave contexts are allowed to only map the per - process MMIO space associated with the context. In dedicated - process mode the entire MMIO space can always be mapped. - - This mmap call must be done after the START_WORK ioctl. - - Care should be taken when accessing MMIO space. Only 32 and 64-bit - accesses are supported by POWER8. Also, the AFU will be designed - with a specific endianness, so all MMIO accesses should consider - endianness (recommend endian(3) variants like: le64toh(), - be64toh() etc). These endian issues equally apply to shared memory - queues the WED may describe. - - -read ----- - - Reads events from the AFU. Blocks if no events are pending - (unless O_NONBLOCK is supplied). Returns -EIO in the case of an - unrecoverable error or if the card is removed. - - read() will always return an integral number of events. - - The buffer passed to read() must be at least 4K bytes. - - The result of the read will be a buffer of one or more events, - each event is of type struct cxl_event, of varying size:: - - struct cxl_event { - struct cxl_event_header header; - union { - struct cxl_event_afu_interrupt irq; - struct cxl_event_data_storage fault; - struct cxl_event_afu_error afu_error; - }; - }; - - The struct cxl_event_header is defined as - - :: - - struct cxl_event_header { - __u16 type; - __u16 size; - __u16 process_element; - __u16 reserved1; - }; - - type: - This defines the type of event. The type determines how - the rest of the event is structured. These types are - described below and defined by enum cxl_event_type. - - size: - This is the size of the event in bytes including the - struct cxl_event_header. The start of the next event can - be found at this offset from the start of the current - event. - - process_element: - Context ID of the event. - - reserved field: - For future extensions and padding. - - If the event type is CXL_EVENT_AFU_INTERRUPT then the event - structure is defined as - - :: - - struct cxl_event_afu_interrupt { - __u16 flags; - __u16 irq; /* Raised AFU interrupt number */ - __u32 reserved1; - }; - - flags: - These flags indicate which optional fields are present - in this struct. Currently all fields are mandatory. - - irq: - The IRQ number sent by the AFU. - - reserved field: - For future extensions and padding. - - If the event type is CXL_EVENT_DATA_STORAGE then the event - structure is defined as - - :: - - struct cxl_event_data_storage { - __u16 flags; - __u16 reserved1; - __u32 reserved2; - __u64 addr; - __u64 dsisr; - __u64 reserved3; - }; - - flags: - These flags indicate which optional fields are present in - this struct. Currently all fields are mandatory. - - address: - The address that the AFU unsuccessfully attempted to - access. Valid accesses will be handled transparently by the - kernel but invalid accesses will generate this event. - - dsisr: - This field gives information on the type of fault. It is a - copy of the DSISR from the PSL hardware when the address - fault occurred. The form of the DSISR is as defined in the - CAIA. - - reserved fields: - For future extensions - - If the event type is CXL_EVENT_AFU_ERROR then the event structure - is defined as - - :: - - struct cxl_event_afu_error { - __u16 flags; - __u16 reserved1; - __u32 reserved2; - __u64 error; - }; - - flags: - These flags indicate which optional fields are present in - this struct. Currently all fields are Mandatory. - - error: - Error status from the AFU. Defined by the AFU. - - reserved fields: - For future extensions and padding - - -2. Card character device (powerVM guest only) -^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ - - In a powerVM guest, an extra character device is created for the - card. The device is only used to write (flash) a new image on the - FPGA accelerator. Once the image is written and verified, the - device tree is updated and the card is reset to reload the updated - image. - -open ----- - - Opens the device and allocates a file descriptor to be used with - the rest of the API. The device can only be opened once. - -ioctl ------ - -CXL_IOCTL_DOWNLOAD_IMAGE / CXL_IOCTL_VALIDATE_IMAGE: - Starts and controls flashing a new FPGA image. Partial - reconfiguration is not supported (yet), so the image must contain - a copy of the PSL and AFU(s). Since an image can be quite large, - the caller may have to iterate, splitting the image in smaller - chunks. - - Takes a pointer to a struct cxl_adapter_image:: - - struct cxl_adapter_image { - __u64 flags; - __u64 data; - __u64 len_data; - __u64 len_image; - __u64 reserved1; - __u64 reserved2; - __u64 reserved3; - __u64 reserved4; - }; - - flags: - These flags indicate which optional fields are present in - this struct. Currently all fields are mandatory. - - data: - Pointer to a buffer with part of the image to write to the - card. - - len_data: - Size of the buffer pointed to by data. - - len_image: - Full size of the image. - - -Sysfs Class -=========== - - A cxl sysfs class is added under /sys/class/cxl to facilitate - enumeration and tuning of the accelerators. Its layout is - described in Documentation/ABI/testing/sysfs-class-cxl - - -Udev rules -========== - - The following udev rules could be used to create a symlink to the - most logical chardev to use in any programming mode (afuX.Yd for - dedicated, afuX.Ys for afu directed), since the API is virtually - identical for each:: - - SUBSYSTEM=="cxl", ATTRS{mode}=="dedicated_process", SYMLINK="cxl/%b" - SUBSYSTEM=="cxl", ATTRS{mode}=="afu_directed", \ - KERNEL=="afu[0-9]*.[0-9]*s", SYMLINK="cxl/%b" diff --git a/Documentation/arch/powerpc/cxlflash.rst b/Documentation/arch/powerpc/cxlflash.rst deleted file mode 100644 index e8f488acfa41..000000000000 --- a/Documentation/arch/powerpc/cxlflash.rst +++ /dev/null @@ -1,433 +0,0 @@ -================================ -Coherent Accelerator (CXL) Flash -================================ - -Introduction -============ - - The IBM Power architecture provides support for CAPI (Coherent - Accelerator Power Interface), which is available to certain PCIe slots - on Power 8 systems. CAPI can be thought of as a special tunneling - protocol through PCIe that allow PCIe adapters to look like special - purpose co-processors which can read or write an application's - memory and generate page faults. As a result, the host interface to - an adapter running in CAPI mode does not require the data buffers to - be mapped to the device's memory (IOMMU bypass) nor does it require - memory to be pinned. - - On Linux, Coherent Accelerator (CXL) kernel services present CAPI - devices as a PCI device by implementing a virtual PCI host bridge. - This abstraction simplifies the infrastructure and programming - model, allowing for drivers to look similar to other native PCI - device drivers. - - CXL provides a mechanism by which user space applications can - directly talk to a device (network or storage) bypassing the typical - kernel/device driver stack. The CXL Flash Adapter Driver enables a - user space application direct access to Flash storage. - - The CXL Flash Adapter Driver is a kernel module that sits in the - SCSI stack as a low level device driver (below the SCSI disk and - protocol drivers) for the IBM CXL Flash Adapter. This driver is - responsible for the initialization of the adapter, setting up the - special path for user space access, and performing error recovery. It - communicates directly the Flash Accelerator Functional Unit (AFU) - as described in Documentation/arch/powerpc/cxl.rst. - - The cxlflash driver supports two, mutually exclusive, modes of - operation at the device (LUN) level: - - - Any flash device (LUN) can be configured to be accessed as a - regular disk device (i.e.: /dev/sdc). This is the default mode. - - - Any flash device (LUN) can be configured to be accessed from - user space with a special block library. This mode further - specifies the means of accessing the device and provides for - either raw access to the entire LUN (referred to as direct - or physical LUN access) or access to a kernel/AFU-mediated - partition of the LUN (referred to as virtual LUN access). The - segmentation of a disk device into virtual LUNs is assisted - by special translation services provided by the Flash AFU. - -Overview -======== - - The Coherent Accelerator Interface Architecture (CAIA) introduces a - concept of a master context. A master typically has special privileges - granted to it by the kernel or hypervisor allowing it to perform AFU - wide management and control. The master may or may not be involved - directly in each user I/O, but at the minimum is involved in the - initial setup before the user application is allowed to send requests - directly to the AFU. - - The CXL Flash Adapter Driver establishes a master context with the - AFU. It uses memory mapped I/O (MMIO) for this control and setup. The - Adapter Problem Space Memory Map looks like this:: - - +-------------------------------+ - | 512 * 64 KB User MMIO | - | (per context) | - | User Accessible | - +-------------------------------+ - | 512 * 128 B per context | - | Provisioning and Control | - | Trusted Process accessible | - +-------------------------------+ - | 64 KB Global | - | Trusted Process accessible | - +-------------------------------+ - - This driver configures itself into the SCSI software stack as an - adapter driver. The driver is the only entity that is considered a - Trusted Process to program the Provisioning and Control and Global - areas in the MMIO Space shown above. The master context driver - discovers all LUNs attached to the CXL Flash adapter and instantiates - scsi block devices (/dev/sdb, /dev/sdc etc.) for each unique LUN - seen from each path. - - Once these scsi block devices are instantiated, an application - written to a specification provided by the block library may get - access to the Flash from user space (without requiring a system call). - - This master context driver also provides a series of ioctls for this - block library to enable this user space access. The driver supports - two modes for accessing the block device. - - The first mode is called a virtual mode. In this mode a single scsi - block device (/dev/sdb) may be carved up into any number of distinct - virtual LUNs. The virtual LUNs may be resized as long as the sum of - the sizes of all the virtual LUNs, along with the meta-data associated - with it does not exceed the physical capacity. - - The second mode is called the physical mode. In this mode a single - block device (/dev/sdb) may be opened directly by the block library - and the entire space for the LUN is available to the application. - - Only the physical mode provides persistence of the data. i.e. The - data written to the block device will survive application exit and - restart and also reboot. The virtual LUNs do not persist (i.e. do - not survive after the application terminates or the system reboots). - - -Block library API -================= - - Applications intending to get access to the CXL Flash from user - space should use the block library, as it abstracts the details of - interfacing directly with the cxlflash driver that are necessary for - performing administrative actions (i.e.: setup, tear down, resize). - The block library can be thought of as a 'user' of services, - implemented as IOCTLs, that are provided by the cxlflash driver - specifically for devices (LUNs) operating in user space access - mode. While it is not a requirement that applications understand - the interface between the block library and the cxlflash driver, - a high-level overview of each supported service (IOCTL) is provided - below. - - The block library can be found on GitHub: - http://github.com/open-power/capiflash - - -CXL Flash Driver LUN IOCTLs -=========================== - - Users, such as the block library, that wish to interface with a flash - device (LUN) via user space access need to use the services provided - by the cxlflash driver. As these services are implemented as ioctls, - a file descriptor handle must first be obtained in order to establish - the communication channel between a user and the kernel. This file - descriptor is obtained by opening the device special file associated - with the scsi disk device (/dev/sdb) that was created during LUN - discovery. As per the location of the cxlflash driver within the - SCSI protocol stack, this open is actually not seen by the cxlflash - driver. Upon successful open, the user receives a file descriptor - (herein referred to as fd1) that should be used for issuing the - subsequent ioctls listed below. - - The structure definitions for these IOCTLs are available in: - uapi/scsi/cxlflash_ioctl.h - -DK_CXLFLASH_ATTACH ------------------- - - This ioctl obtains, initializes, and starts a context using the CXL - kernel services. These services specify a context id (u16) by which - to uniquely identify the context and its allocated resources. The - services additionally provide a second file descriptor (herein - referred to as fd2) that is used by the block library to initiate - memory mapped I/O (via mmap()) to the CXL flash device and poll for - completion events. This file descriptor is intentionally installed by - this driver and not the CXL kernel services to allow for intermediary - notification and access in the event of a non-user-initiated close(), - such as a killed process. This design point is described in further - detail in the description for the DK_CXLFLASH_DETACH ioctl. - - There are a few important aspects regarding the "tokens" (context id - and fd2) that are provided back to the user: - - - These tokens are only valid for the process under which they - were created. The child of a forked process cannot continue - to use the context id or file descriptor created by its parent - (see DK_CXLFLASH_VLUN_CLONE for further details). - - - These tokens are only valid for the lifetime of the context and - the process under which they were created. Once either is - destroyed, the tokens are to be considered stale and subsequent - usage will result in errors. - - - A valid adapter file descriptor (fd2 >= 0) is only returned on - the initial attach for a context. Subsequent attaches to an - existing context (DK_CXLFLASH_ATTACH_REUSE_CONTEXT flag present) - do not provide the adapter file descriptor as it was previously - made known to the application. - - - When a context is no longer needed, the user shall detach from - the context via the DK_CXLFLASH_DETACH ioctl. When this ioctl - returns with a valid adapter file descriptor and the return flag - DK_CXLFLASH_APP_CLOSE_ADAP_FD is present, the application _must_ - close the adapter file descriptor following a successful detach. - - - When this ioctl returns with a valid fd2 and the return flag - DK_CXLFLASH_APP_CLOSE_ADAP_FD is present, the application _must_ - close fd2 in the following circumstances: - - + Following a successful detach of the last user of the context - + Following a successful recovery on the context's original fd2 - + In the child process of a fork(), following a clone ioctl, - on the fd2 associated with the source context - - - At any time, a close on fd2 will invalidate the tokens. Applications - should exercise caution to only close fd2 when appropriate (outlined - in the previous bullet) to avoid premature loss of I/O. - -DK_CXLFLASH_USER_DIRECT ------------------------ - This ioctl is responsible for transitioning the LUN to direct - (physical) mode access and configuring the AFU for direct access from - user space on a per-context basis. Additionally, the block size and - last logical block address (LBA) are returned to the user. - - As mentioned previously, when operating in user space access mode, - LUNs may be accessed in whole or in part. Only one mode is allowed - at a time and if one mode is active (outstanding references exist), - requests to use the LUN in a different mode are denied. - - The AFU is configured for direct access from user space by adding an - entry to the AFU's resource handle table. The index of the entry is - treated as a resource handle that is returned to the user. The user - is then able to use the handle to reference the LUN during I/O. - -DK_CXLFLASH_USER_VIRTUAL ------------------------- - This ioctl is responsible for transitioning the LUN to virtual mode - of access and configuring the AFU for virtual access from user space - on a per-context basis. Additionally, the block size and last logical - block address (LBA) are returned to the user. - - As mentioned previously, when operating in user space access mode, - LUNs may be accessed in whole or in part. Only one mode is allowed - at a time and if one mode is active (outstanding references exist), - requests to use the LUN in a different mode are denied. - - The AFU is configured for virtual access from user space by adding - an entry to the AFU's resource handle table. The index of the entry - is treated as a resource handle that is returned to the user. The - user is then able to use the handle to reference the LUN during I/O. - - By default, the virtual LUN is created with a size of 0. The user - would need to use the DK_CXLFLASH_VLUN_RESIZE ioctl to adjust the grow - the virtual LUN to a desired size. To avoid having to perform this - resize for the initial creation of the virtual LUN, the user has the - option of specifying a size as part of the DK_CXLFLASH_USER_VIRTUAL - ioctl, such that when success is returned to the user, the - resource handle that is provided is already referencing provisioned - storage. This is reflected by the last LBA being a non-zero value. - - When a LUN is accessible from more than one port, this ioctl will - return with the DK_CXLFLASH_ALL_PORTS_ACTIVE return flag set. This - provides the user with a hint that I/O can be retried in the event - of an I/O error as the LUN can be reached over multiple paths. - -DK_CXLFLASH_VLUN_RESIZE ------------------------ - This ioctl is responsible for resizing a previously created virtual - LUN and will fail if invoked upon a LUN that is not in virtual - mode. Upon success, an updated last LBA is returned to the user - indicating the new size of the virtual LUN associated with the - resource handle. - - The partitioning of virtual LUNs is jointly mediated by the cxlflash - driver and the AFU. An allocation table is kept for each LUN that is - operating in the virtual mode and used to program a LUN translation - table that the AFU references when provided with a resource handle. - - This ioctl can return -EAGAIN if an AFU sync operation takes too long. - In addition to returning a failure to user, cxlflash will also schedule - an asynchronous AFU reset. Should the user choose to retry the operation, - it is expected to succeed. If this ioctl fails with -EAGAIN, the user - can either retry the operation or treat it as a failure. - -DK_CXLFLASH_RELEASE -------------------- - This ioctl is responsible for releasing a previously obtained - reference to either a physical or virtual LUN. This can be - thought of as the inverse of the DK_CXLFLASH_USER_DIRECT or - DK_CXLFLASH_USER_VIRTUAL ioctls. Upon success, the resource handle - is no longer valid and the entry in the resource handle table is - made available to be used again. - - As part of the release process for virtual LUNs, the virtual LUN - is first resized to 0 to clear out and free the translation tables - associated with the virtual LUN reference. - -DK_CXLFLASH_DETACH ------------------- - This ioctl is responsible for unregistering a context with the - cxlflash driver and release outstanding resources that were - not explicitly released via the DK_CXLFLASH_RELEASE ioctl. Upon - success, all "tokens" which had been provided to the user from the - DK_CXLFLASH_ATTACH onward are no longer valid. - - When the DK_CXLFLASH_APP_CLOSE_ADAP_FD flag was returned on a successful - attach, the application _must_ close the fd2 associated with the context - following the detach of the final user of the context. - -DK_CXLFLASH_VLUN_CLONE ----------------------- - This ioctl is responsible for cloning a previously created - context to a more recently created context. It exists solely to - support maintaining user space access to storage after a process - forks. Upon success, the child process (which invoked the ioctl) - will have access to the same LUNs via the same resource handle(s) - as the parent, but under a different context. - - Context sharing across processes is not supported with CXL and - therefore each fork must be met with establishing a new context - for the child process. This ioctl simplifies the state management - and playback required by a user in such a scenario. When a process - forks, child process can clone the parents context by first creating - a context (via DK_CXLFLASH_ATTACH) and then using this ioctl to - perform the clone from the parent to the child. - - The clone itself is fairly simple. The resource handle and lun - translation tables are copied from the parent context to the child's - and then synced with the AFU. - - When the DK_CXLFLASH_APP_CLOSE_ADAP_FD flag was returned on a successful - attach, the application _must_ close the fd2 associated with the source - context (still resident/accessible in the parent process) following the - clone. This is to avoid a stale entry in the file descriptor table of the - child process. - - This ioctl can return -EAGAIN if an AFU sync operation takes too long. - In addition to returning a failure to user, cxlflash will also schedule - an asynchronous AFU reset. Should the user choose to retry the operation, - it is expected to succeed. If this ioctl fails with -EAGAIN, the user - can either retry the operation or treat it as a failure. - -DK_CXLFLASH_VERIFY ------------------- - This ioctl is used to detect various changes such as the capacity of - the disk changing, the number of LUNs visible changing, etc. In cases - where the changes affect the application (such as a LUN resize), the - cxlflash driver will report the changed state to the application. - - The user calls in when they want to validate that a LUN hasn't been - changed in response to a check condition. As the user is operating out - of band from the kernel, they will see these types of events without - the kernel's knowledge. When encountered, the user's architected - behavior is to call in to this ioctl, indicating what they want to - verify and passing along any appropriate information. For now, only - verifying a LUN change (ie: size different) with sense data is - supported. - -DK_CXLFLASH_RECOVER_AFU ------------------------ - This ioctl is used to drive recovery (if such an action is warranted) - of a specified user context. Any state associated with the user context - is re-established upon successful recovery. - - User contexts are put into an error condition when the device needs to - be reset or is terminating. Users are notified of this error condition - by seeing all 0xF's on an MMIO read. Upon encountering this, the - architected behavior for a user is to call into this ioctl to recover - their context. A user may also call into this ioctl at any time to - check if the device is operating normally. If a failure is returned - from this ioctl, the user is expected to gracefully clean up their - context via release/detach ioctls. Until they do, the context they - hold is not relinquished. The user may also optionally exit the process - at which time the context/resources they held will be freed as part of - the release fop. - - When the DK_CXLFLASH_APP_CLOSE_ADAP_FD flag was returned on a successful - attach, the application _must_ unmap and close the fd2 associated with the - original context following this ioctl returning success and indicating that - the context was recovered (DK_CXLFLASH_RECOVER_AFU_CONTEXT_RESET). - -DK_CXLFLASH_MANAGE_LUN ----------------------- - This ioctl is used to switch a LUN from a mode where it is available - for file-system access (legacy), to a mode where it is set aside for - exclusive user space access (superpipe). In case a LUN is visible - across multiple ports and adapters, this ioctl is used to uniquely - identify each LUN by its World Wide Node Name (WWNN). - - -CXL Flash Driver Host IOCTLs -============================ - - Each host adapter instance that is supported by the cxlflash driver - has a special character device associated with it to enable a set of - host management function. These character devices are hosted in a - class dedicated for cxlflash and can be accessed via `/dev/cxlflash/*`. - - Applications can be written to perform various functions using the - host ioctl APIs below. - - The structure definitions for these IOCTLs are available in: - uapi/scsi/cxlflash_ioctl.h - -HT_CXLFLASH_LUN_PROVISION -------------------------- - This ioctl is used to create and delete persistent LUNs on cxlflash - devices that lack an external LUN management interface. It is only - valid when used with AFUs that support the LUN provision capability. - - When sufficient space is available, LUNs can be created by specifying - the target port to host the LUN and a desired size in 4K blocks. Upon - success, the LUN ID and WWID of the created LUN will be returned and - the SCSI bus can be scanned to detect the change in LUN topology. Note - that partial allocations are not supported. Should a creation fail due - to a space issue, the target port can be queried for its current LUN - geometry. - - To remove a LUN, the device must first be disassociated from the Linux - SCSI subsystem. The LUN deletion can then be initiated by specifying a - target port and LUN ID. Upon success, the LUN geometry associated with - the port will be updated to reflect new number of provisioned LUNs and - available capacity. - - To query the LUN geometry of a port, the target port is specified and - upon success, the following information is presented: - - - Maximum number of provisioned LUNs allowed for the port - - Current number of provisioned LUNs for the port - - Maximum total capacity of provisioned LUNs for the port (4K blocks) - - Current total capacity of provisioned LUNs for the port (4K blocks) - - With this information, the number of available LUNs and capacity can be - can be calculated. - -HT_CXLFLASH_AFU_DEBUG ---------------------- - This ioctl is used to debug AFUs by supporting a command pass-through - interface. It is only valid when used with AFUs that support the AFU - debug capability. - - With exception of buffer management, AFU debug commands are opaque to - cxlflash and treated as pass-through. For debug commands that do require - data transfer, the user supplies an adequately sized data buffer and must - specify the data transfer direction with respect to the host. There is a - maximum transfer size of 256K imposed. Note that partial read completions - are not supported - when errors are experienced with a host read data - transfer, the data buffer is not copied back to the user. diff --git a/Documentation/arch/powerpc/dexcr.rst b/Documentation/arch/powerpc/dexcr.rst index 615a631f51fa..ab0724212fcd 100644 --- a/Documentation/arch/powerpc/dexcr.rst +++ b/Documentation/arch/powerpc/dexcr.rst @@ -36,8 +36,145 @@ state for a process. Configuration ============= -The DEXCR is currently unconfigurable. All threads are run with the -NPHIE aspect enabled. +prctl +----- + +A process can control its own userspace DEXCR value using the +``PR_PPC_GET_DEXCR`` and ``PR_PPC_SET_DEXCR`` pair of +:manpage:`prctl(2)` commands. These calls have the form:: + + prctl(PR_PPC_GET_DEXCR, unsigned long which, 0, 0, 0); + prctl(PR_PPC_SET_DEXCR, unsigned long which, unsigned long ctrl, 0, 0); + +The possible 'which' and 'ctrl' values are as follows. Note there is no relation +between the 'which' value and the DEXCR aspect's index. + +.. flat-table:: + :header-rows: 1 + :widths: 2 7 1 + + * - ``prctl()`` which + - Aspect name + - Aspect index + + * - ``PR_PPC_DEXCR_SBHE`` + - Speculative Branch Hint Enable (SBHE) + - 0 + + * - ``PR_PPC_DEXCR_IBRTPD`` + - Indirect Branch Recurrent Target Prediction Disable (IBRTPD) + - 3 + + * - ``PR_PPC_DEXCR_SRAPD`` + - Subroutine Return Address Prediction Disable (SRAPD) + - 4 + + * - ``PR_PPC_DEXCR_NPHIE`` + - Non-Privileged Hash Instruction Enable (NPHIE) + - 5 + +.. flat-table:: + :header-rows: 1 + :widths: 2 8 + + * - ``prctl()`` ctrl + - Meaning + + * - ``PR_PPC_DEXCR_CTRL_EDITABLE`` + - This aspect can be configured with PR_PPC_SET_DEXCR (get only) + + * - ``PR_PPC_DEXCR_CTRL_SET`` + - This aspect is set / set this aspect + + * - ``PR_PPC_DEXCR_CTRL_CLEAR`` + - This aspect is clear / clear this aspect + + * - ``PR_PPC_DEXCR_CTRL_SET_ONEXEC`` + - This aspect will be set after exec / set this aspect after exec + + * - ``PR_PPC_DEXCR_CTRL_CLEAR_ONEXEC`` + - This aspect will be clear after exec / clear this aspect after exec + +Note that + +* which is a plain value, not a bitmask. Aspects must be worked with individually. + +* ctrl is a bitmask. ``PR_PPC_GET_DEXCR`` returns both the current and onexec + configuration. For example, ``PR_PPC_GET_DEXCR`` may return + ``PR_PPC_DEXCR_CTRL_EDITABLE | PR_PPC_DEXCR_CTRL_SET | + PR_PPC_DEXCR_CTRL_CLEAR_ONEXEC``. This would indicate the aspect is currently + set, it will be cleared when you run exec, and you can change this with the + ``PR_PPC_SET_DEXCR`` prctl. + +* The set/clear terminology refers to setting/clearing the bit in the DEXCR. + For example:: + + prctl(PR_PPC_SET_DEXCR, PR_PPC_DEXCR_IBRTPD, PR_PPC_DEXCR_CTRL_SET, 0, 0); + + will set the IBRTPD aspect bit in the DEXCR, causing indirect branch prediction + to be disabled. + +* The status returned by ``PR_PPC_GET_DEXCR`` represents what value the process + would like applied. It does not include any alternative overrides, such as if + the hypervisor is enforcing the aspect be set. To see the true DEXCR state + software should read the appropriate SPRs directly. + +* The aspect state when starting a process is copied from the parent's state on + :manpage:`fork(2)`. The state is reset to a fixed value on + :manpage:`execve(2)`. The PR_PPC_SET_DEXCR prctl() can control both of these + values. + +* The ``*_ONEXEC`` controls do not change the current process's DEXCR. + +Use ``PR_PPC_SET_DEXCR`` with one of ``PR_PPC_DEXCR_CTRL_SET`` or +``PR_PPC_DEXCR_CTRL_CLEAR`` to edit a given aspect. + +Common error codes for both getting and setting the DEXCR are as follows: + +.. flat-table:: + :header-rows: 1 + :widths: 2 8 + + * - Error + - Meaning + + * - ``EINVAL`` + - The DEXCR is not supported by the kernel. + + * - ``ENODEV`` + - The aspect is not recognised by the kernel or not supported by the + hardware. + +``PR_PPC_SET_DEXCR`` may also report the following error codes: + +.. flat-table:: + :header-rows: 1 + :widths: 2 8 + + * - Error + - Meaning + + * - ``EINVAL`` + - The ctrl value contains unrecognised flags. + + * - ``EINVAL`` + - The ctrl value contains mutually conflicting flags (e.g., + ``PR_PPC_DEXCR_CTRL_SET | PR_PPC_DEXCR_CTRL_CLEAR``) + + * - ``EPERM`` + - This aspect cannot be modified with prctl() (check for the + PR_PPC_DEXCR_CTRL_EDITABLE flag with PR_PPC_GET_DEXCR). + + * - ``EPERM`` + - The process does not have sufficient privilege to perform the operation. + For example, clearing NPHIE on exec is a privileged operation (a process + can still clear its own NPHIE aspect without privileges). + +This interface allows a process to control its own DEXCR aspects, and also set +the initial DEXCR value for any children in its process tree (up to the next +child to use an ``*_ONEXEC`` control). This allows fine-grained control over the +default value of the DEXCR, for example allowing containers to run with different +default values. coredump and ptrace diff --git a/Documentation/arch/powerpc/elf_hwcaps.rst b/Documentation/arch/powerpc/elf_hwcaps.rst index 4c896cf077c2..fce7489877b5 100644 --- a/Documentation/arch/powerpc/elf_hwcaps.rst +++ b/Documentation/arch/powerpc/elf_hwcaps.rst @@ -91,6 +91,7 @@ PPC_FEATURE_HAS_MMU PPC_FEATURE_HAS_4xxMAC The processor is 40x or 44x family. + Unused in the kernel since 732b32daef80 ("powerpc: Remove core support for 40x") PPC_FEATURE_UNIFIED_CACHE The processor has a unified L1 cache for instructions and data, as diff --git a/Documentation/arch/powerpc/firmware-assisted-dump.rst b/Documentation/arch/powerpc/firmware-assisted-dump.rst index e363fc48529a..7e266e749cd5 100644 --- a/Documentation/arch/powerpc/firmware-assisted-dump.rst +++ b/Documentation/arch/powerpc/firmware-assisted-dump.rst @@ -120,6 +120,28 @@ to ensure that crash data is preserved to process later. e.g. # echo 1 > /sys/firmware/opal/mpipl/release_core +-- Support for Additional Kernel Arguments in Fadump + Fadump has a feature that allows passing additional kernel arguments + to the fadump kernel. This feature was primarily designed to disable + kernel functionalities that are not required for the fadump kernel + and to reduce its memory footprint while collecting the dump. + + Command to Add Additional Kernel Parameters to Fadump: + e.g. + # echo "nr_cpus=16" > /sys/kernel/fadump/bootargs_append + + The above command is sufficient to add additional arguments to fadump. + An explicit service restart is not required. + + Command to Retrieve the Additional Fadump Arguments: + e.g. + # cat /sys/kernel/fadump/bootargs_append + +Note: Additional kernel arguments for fadump with HASH MMU is only + supported if the RMA size is greater than 768 MB. If the RMA + size is less than 768 MB, the kernel does not export the + /sys/kernel/fadump/bootargs_append sysfs node. + Implementation details: ----------------------- @@ -134,12 +156,12 @@ that are run. If there is dump data, then the memory is held. If there is no waiting dump data, then only the memory required to -hold CPU state, HPTE region, boot memory dump, FADump header and -elfcore header, is usually reserved at an offset greater than boot -memory size (see Fig. 1). This area is *not* released: this region -will be kept permanently reserved, so that it can act as a receptacle -for a copy of the boot memory content in addition to CPU state and -HPTE region, in the case a crash does occur. +hold CPU state, HPTE region, boot memory dump, and FADump header is +usually reserved at an offset greater than boot memory size (see Fig. 1). +This area is *not* released: this region will be kept permanently +reserved, so that it can act as a receptacle for a copy of the boot +memory content in addition to CPU state and HPTE region, in the case +a crash does occur. Since this reserved memory area is used only after the system crash, there is no point in blocking this significant chunk of memory from @@ -153,22 +175,22 @@ that were present in CMA region:: o Memory Reservation during first kernel - Low memory Top of memory - 0 boot memory size |<--- Reserved dump area --->| | - | | | Permanent Reservation | | - V V | | V - +-----------+-----/ /---+---+----+-------+-----+-----+----+--+ - | | |///|////| DUMP | HDR | ELF |////| | - +-----------+-----/ /---+---+----+-------+-----+-----+----+--+ - | ^ ^ ^ ^ ^ - | | | | | | - \ CPU HPTE / | | - ------------------------------ | | - Boot memory content gets transferred | | - to reserved area by firmware at the | | - time of crash. | | - FADump Header | - (meta area) | + Low memory Top of memory + 0 boot memory size |<------ Reserved dump area ----->| | + | | | Permanent Reservation | | + V V | | V + +-----------+-----/ /---+---+----+-----------+-------+----+-----+ + | | |///|////| DUMP | HDR |////| | + +-----------+-----/ /---+---+----+-----------+-------+----+-----+ + | ^ ^ ^ ^ ^ + | | | | | | + \ CPU HPTE / | | + -------------------------------- | | + Boot memory content gets transferred | | + to reserved area by firmware at the | | + time of crash. | | + FADump Header | + (meta area) | | | Metadata: This area holds a metadata structure whose @@ -186,13 +208,20 @@ that were present in CMA region:: 0 boot memory size | | |<------------ Crash preserved area ------------>| V V |<--- Reserved dump area --->| | - +-----------+-----/ /---+---+----+-------+-----+-----+----+--+ - | | |///|////| DUMP | HDR | ELF |////| | - +-----------+-----/ /---+---+----+-------+-----+-----+----+--+ - | | - V V - Used by second /proc/vmcore - kernel to boot + +----+---+--+-----/ /---+---+----+-------+-----+-----+-------+ + | |ELF| | |///|////| DUMP | HDR |/////| | + +----+---+--+-----/ /---+---+----+-------+-----+-----+-------+ + | | | | | | + ----- ------------------------------ --------------- + \ | | + \ | | + \ | | + \ | ---------------------------- + \ | / + \ | / + \ | / + /proc/vmcore + +---+ |///| -> Regions (CPU, HPTE & Metadata) marked like this in the above @@ -200,6 +229,12 @@ that were present in CMA region:: does not have CPU & HPTE regions while Metadata region is not supported on pSeries currently. + +---+ + |ELF| -> elfcorehdr, it is created in second kernel after crash. + +---+ + + Note: Memory from 0 to the boot memory size is used by second kernel + Fig. 2 @@ -353,26 +388,6 @@ TODO: - Need to come up with the better approach to find out more accurate boot memory size that is required for a kernel to boot successfully when booted with restricted memory. - - The FADump implementation introduces a FADump crash info structure - in the scratch area before the ELF core header. The idea of introducing - this structure is to pass some important crash info data to the second - kernel which will help second kernel to populate ELF core header with - correct data before it gets exported through /proc/vmcore. The current - design implementation does not address a possibility of introducing - additional fields (in future) to this structure without affecting - compatibility. Need to come up with the better approach to address this. - - The possible approaches are: - - 1. Introduce version field for version tracking, bump up the version - whenever a new field is added to the structure in future. The version - field can be used to find out what fields are valid for the current - version of the structure. - 2. Reserve the area of predefined size (say PAGE_SIZE) for this - structure and have unused area as reserved (initialized to zero) - for future field additions. - - The advantage of approach 1 over 2 is we don't need to reserve extra space. Author: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com> diff --git a/Documentation/arch/powerpc/htm.rst b/Documentation/arch/powerpc/htm.rst new file mode 100644 index 000000000000..fcb4eb6306b1 --- /dev/null +++ b/Documentation/arch/powerpc/htm.rst @@ -0,0 +1,104 @@ +.. SPDX-License-Identifier: GPL-2.0 +.. _htm: + +=================================== +HTM (Hardware Trace Macro) +=================================== + +Athira Rajeev, 2 Mar 2025 + +.. contents:: + :depth: 3 + + +Basic overview +============== + +H_HTM is used as an interface for executing Hardware Trace Macro (HTM) +functions, including setup, configuration, control and dumping of the HTM data. +For using HTM, it is required to setup HTM buffers and HTM operations can +be controlled using the H_HTM hcall. The hcall can be invoked for any core/chip +of the system from within a partition itself. To use this feature, a debugfs +folder called "htmdump" is present under /sys/kernel/debug/powerpc. + + +HTM debugfs example usage +========================= + +.. code-block:: sh + + # ls /sys/kernel/debug/powerpc/htmdump/ + coreindexonchip htmcaps htmconfigure htmflags htminfo htmsetup + htmstart htmstatus htmtype nodalchipindex nodeindex trace + +Details on each file: + +* nodeindex, nodalchipindex, coreindexonchip specifies which partition to configure the HTM for. +* htmtype: specifies the type of HTM. Supported target is hardwareTarget. +* trace: is to read the HTM data. +* htmconfigure: Configure/Deconfigure the HTM. Writing 1 to the file will configure the trace, writing 0 to the file will do deconfigure. +* htmstart: start/Stop the HTM. Writing 1 to the file will start the tracing, writing 0 to the file will stop the tracing. +* htmstatus: get the status of HTM. This is needed to understand the HTM state after each operation. +* htmsetup: set the HTM buffer size. Size of HTM buffer is in power of 2 +* htminfo: provides the system processor configuration details. This is needed to understand the appropriate values for nodeindex, nodalchipindex, coreindexonchip. +* htmcaps : provides the HTM capabilities like minimum/maximum buffer size, what kind of tracing the HTM supports etc. +* htmflags : allows to pass flags to hcall. Currently supports controlling the wrapping of HTM buffer. + +To see the system processor configuration details: + +.. code-block:: sh + + # cat /sys/kernel/debug/powerpc/htmdump/htminfo > htminfo_file + +The result can be interpreted using hexdump. + +To collect HTM traces for a partition represented by nodeindex as +zero, nodalchipindex as 1 and coreindexonchip as 12 + +.. code-block:: sh + + # cd /sys/kernel/debug/powerpc/htmdump/ + # echo 2 > htmtype + # echo 33 > htmsetup ( sets 8GB memory for HTM buffer, number is size in power of 2 ) + +This requires a CEC reboot to get the HTM buffers allocated. + +.. code-block:: sh + + # cd /sys/kernel/debug/powerpc/htmdump/ + # echo 2 > htmtype + # echo 0 > nodeindex + # echo 1 > nodalchipindex + # echo 12 > coreindexonchip + # echo 1 > htmflags # to set noWrap for HTM buffers + # echo 1 > htmconfigure # Configure the HTM + # echo 1 > htmstart # Start the HTM + # echo 0 > htmstart # Stop the HTM + # echo 0 > htmconfigure # Deconfigure the HTM + # cat htmstatus # Dump the status of HTM entries as data + +Above will set the htmtype and core details, followed by executing respective HTM operation. + +Read the HTM trace data +======================== + +After starting the trace collection, run the workload +of interest. Stop the trace collection after required period +of time, and read the trace file. + +.. code-block:: sh + + # cat /sys/kernel/debug/powerpc/htmdump/trace > trace_file + +This trace file will contain the relevant instruction traces +collected during the workload execution. And can be used as +input file for trace decoders to understand data. + +Benefits of using HTM debugfs interface +======================================= + +It is now possible to collect traces for a particular core/chip +from within any partition of the system and decode it. Through +this enablement, a small partition can be dedicated to collect the +trace data and analyze to provide important information for Performance +analysis, Software tuning, or Hardware debug. diff --git a/Documentation/arch/powerpc/index.rst b/Documentation/arch/powerpc/index.rst index 9749f6dc258f..0560cbae5fa1 100644 --- a/Documentation/arch/powerpc/index.rst +++ b/Documentation/arch/powerpc/index.rst @@ -12,8 +12,6 @@ powerpc bootwrapper cpu_families cpu_features - cxl - cxlflash dawr-power9 dexcr dscr diff --git a/Documentation/arch/powerpc/kvm-nested.rst b/Documentation/arch/powerpc/kvm-nested.rst index 630602a8aa00..574592505604 100644 --- a/Documentation/arch/powerpc/kvm-nested.rst +++ b/Documentation/arch/powerpc/kvm-nested.rst @@ -208,13 +208,9 @@ associated values for each ID in the GSB:: flags: Bit 0: getGuestWideState: Request state of the Guest instead of an individual VCPU. - Bit 1: takeOwnershipOfVcpuState Indicate the L1 is taking - over ownership of the VCPU state and that the L0 can free - the storage holding the state. The VCPU state will need to - be returned to the Hypervisor via H_GUEST_SET_STATE prior - to H_GUEST_RUN_VCPU being called for this VCPU. The data - returned in the dataBuffer is in a Hypervisor internal - format. + Bit 1: getHostWideState: Request stats of the Host. This causes + the guestId and vcpuId parameters to be ignored and attempting + to get the VCPU/Guest state will cause an error. Bits 2-63: Reserved guestId: ID obtained from H_GUEST_CREATE vcpuId: ID of the vCPU pass to H_GUEST_CREATE_VCPU @@ -406,9 +402,10 @@ the partition like the timebase offset and partition scoped page table information. +--------+-------+----+--------+----------------------------------+ -| ID | Size | RW | Thread | Details | -| | Bytes | | Guest | | -| | | | Scope | | +| ID | Size | RW |(H)ost | Details | +| | Bytes | |(G)uest | | +| | | |(T)hread| | +| | | |Scope | | +========+=======+====+========+==================================+ | 0x0000 | | RW | TG | NOP element | +--------+-------+----+--------+----------------------------------+ @@ -434,6 +431,29 @@ table information. | | | | |- 0x8 Table size. | +--------+-------+----+--------+----------------------------------+ | 0x0007-| | | | Reserved | +| 0x07FF | | | | | ++--------+-------+----+--------+----------------------------------+ +| 0x0800 | 0x08 | R | H | Current usage in bytes of the | +| | | | | L0's Guest Management Space | +| | | | | for an L1-Lpar. | ++--------+-------+----+--------+----------------------------------+ +| 0x0801 | 0x08 | R | H | Max bytes available in the | +| | | | | L0's Guest Management Space for | +| | | | | an L1-Lpar | ++--------+-------+----+--------+----------------------------------+ +| 0x0802 | 0x08 | R | H | Current usage in bytes of the | +| | | | | L0's Guest Page Table Management | +| | | | | Space for an L1-Lpar | ++--------+-------+----+--------+----------------------------------+ +| 0x0803 | 0x08 | R | H | Max bytes available in the L0's | +| | | | | Guest Page Table Management | +| | | | | Space for an L1-Lpar | ++--------+-------+----+--------+----------------------------------+ +| 0x0804 | 0x08 | R | H | Cumulative Reclaimed bytes from | +| | | | | L0 Guest's Page Table Management | +| | | | | Space due to overcommit | ++--------+-------+----+--------+----------------------------------+ +| 0x0805-| | | | Reserved | | 0x0BFF | | | | | +--------+-------+----+--------+----------------------------------+ | 0x0C00 | 0x10 | RW | T |Run vCPU Input Buffer: | @@ -546,7 +566,9 @@ table information. +--------+-------+----+--------+----------------------------------+ | 0x1052 | 0x08 | RW | T | CTRL | +--------+-------+----+--------+----------------------------------+ -| 0x1053-| | | | Reserved | +| 0x1053 | 0x08 | RW | T | DPDES | ++--------+-------+----+--------+----------------------------------+ +| 0x1054-| | | | Reserved | | 0x1FFF | | | | | +--------+-------+----+--------+----------------------------------+ | 0x2000 | 0x04 | RW | T | CR | diff --git a/Documentation/arch/powerpc/papr_hcalls.rst b/Documentation/arch/powerpc/papr_hcalls.rst index 80d2c0aadab5..805e1cb9bab9 100644 --- a/Documentation/arch/powerpc/papr_hcalls.rst +++ b/Documentation/arch/powerpc/papr_hcalls.rst @@ -289,6 +289,17 @@ to be issued multiple times in order to be completely serviced. The subsequent hcalls to the hypervisor until the hcall is completely serviced at which point H_SUCCESS or other error is returned by the hypervisor. +**H_HTM** + +| Input: flags, target, operation (op), op-param1, op-param2, op-param3 +| Out: *dumphtmbufferdata* +| Return Value: *H_Success,H_Busy,H_LongBusyOrder,H_Partial,H_Parameter, + H_P2,H_P3,H_P4,H_P5,H_P6,H_State,H_Not_Available,H_Authority* + +H_HTM supports setup, configuration, control and dumping of Hardware Trace +Macro (HTM) function and its data. HTM buffer stores tracing data for functions +like core instruction, core LLAT and nest. + References ========== .. [1] "Power Architecture Platform Reference" diff --git a/Documentation/arch/powerpc/ultravisor.rst b/Documentation/arch/powerpc/ultravisor.rst index ba6b1bf1cc44..6d0407b2f5a1 100644 --- a/Documentation/arch/powerpc/ultravisor.rst +++ b/Documentation/arch/powerpc/ultravisor.rst @@ -134,7 +134,7 @@ Hardware * PTCR and partition table entries (partition table is in secure memory). An attempt to write to PTCR will cause a Hypervisor - Emulation Assitance interrupt. + Emulation Assistance interrupt. * LDBAR (LD Base Address Register) and IMC (In-Memory Collection) non-architected registers. An attempt to write to them will cause a |