
PSL / AFU Interface February 18, 2016 Version 1.0

OpenPOWER Foundation Workgroup Specification

Standard Track

PSL / AFU Interface: Accelerator Work Group Specification

Version 1.0 (2016-02-18)

Copyright © 2015 OpenPOWER Foundation

All capitalized terms in the following text have the meanings assigned to them in the OpenPOWER Intellectual Property Rights Policy (the "OpenPOWER IPR Policy"). The full Policy may be found at the OpenPOWER website or is available upon request.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published, and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this section are included on all such copies and derivative works. However, this document itself may not be modified in any way, including by removing the copyright notice or references to OpenPOWER, except as needed for the purpose of developing any document or deliverable produced by an OpenPOWER Work Group (in which case the rules applicable to copyrights, as set forth in the OpenPOWER IPR Policy, must be followed) or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by OpenPOWER or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis AND TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, THE OpenPOWER Foundation AS WELL AS THE AUTHORS AND DEVELOPERS OF THIS STANDARDS FINAL DELIVERABLE OR OTHER DOCUMENT HEREBY DISCLAIM ALL OTHER WARRANTIES AND CONDITIONS, EITHER EXPRESS, IMPLIED OR STATUTORY, INCLUDING BUT NOT LIMITED TO, ANY IMPLIED WARRANTIES, DUTIES OR CONDITIONS OF MERCHANTABILITY, OF FITNESS FOR A PARTICULAR PURPOSE, OF ACCURACY OR COMPLETENESS OF RESPONSES, OF RESULTS, OF WORKMANLIKE EFFORT, OF LACK OF VIRUSES, OF LACK OF NEGLIGENCE OR NON-INFRINGEMENT.

OpenPOWER, the OpenPOWER logo, and openpowerfoundation.org are trademarks or registered trademarks of OpenPOWER Foundation, Inc., registered in many jurisdictions worldwide. Other company, product, and service names may be trademarks or service marks of others.

This document is the work product of the OpenPOWER Foundation Accelerator Workgroup. Acknowledgement to members of the workgroup for their contributions.


Table of Contents

Preface
  1. Conventions
  2. Document change history
1. PSL AFU Interface
  1.1. AFU Command Interface
  1.2. AFU Buffer Interface
  1.3. PSL Response Interface
  1.4. AFU MMIO Interface
  1.5. AFU Control Interface
2. Timing Diagram Examples
3. Conformance to this Specification
  3.1. AFU Command Interface
  3.2. AFU Buffer Interface
  3.3. PSL Response Interface
  3.4. AFU MMIO Interface
  3.5. AFU Control Interface
Glossary
A. OpenPOWER Foundation overview
  A.1. Foundation documentation
  A.2. Technical resources
  A.3. Contact the foundation


List of Figures

1.1. PSL Command/Response Flow
1.2. PSL AFU Control Interface Flow in Non-Shared Mode
2.1. Control Interface, Reset
2.2. Control Interface, Start
2.3. Command Interface, Read_cl_na
2.4. Buffer Interface, Write of buffer from Read_cl_na
2.5. Response Interface, Read_cl_na complete


List of Tables

1.1. AFU Command Interface
1.2. PSL Command Opcodes Directed at the PSL Cache
1.3. PSL Command Opcodes That Do Not Allocate in the PSL Cache
1.4. PSL Command Opcodes Reserved for Scratch Pad
1.5. PSL Command Opcodes Reserved for LPC Services
1.6. PSL Command Opcodes for Management
1.7. ah_cabt Translation Ordering Behavior
1.8. AFU Buffer Interface
1.9. PSL Response Interface
1.10. PSL Response Codes
1.11. AFU MMIO Interface
1.12. AFU Control Interface
1.13. PSL Control Commands on ha_jcom
1.14. Process Element Entry Format


Preface

1. Conventions

The OpenPOWER Foundation documentation uses several typesetting conventions.

Notices

Notices take these forms:

Note

A handy tip or reminder.

Important

Something you must be aware of before proceeding.

Warning

Critical information about the risk of data loss or security issues.

Command prompts

$ prompt: Any user, including the root user, can run commands that are prefixed with the $ prompt.

# prompt: The root user must run commands that are prefixed with the # prompt. You can also prefix these commands with the sudo command, if available, to run them.

2. Document change history

This version of the guide replaces and obsoletes all earlier versions.

The following table describes the most recent changes:

| Revision Date | Summary of Changes |
|---|---|
| October 20, 2015 | v1.0 WorkGroup Specification |
| April 14, 2015 | Creation based on IBM CAPI Workbook documentation. Updates from Accelerator WG review of original submission. Non-material formatting and typographic edits. |


1. PSL AFU Interface

The POWER Service Layer (PSL) to Accelerator Functional Unit (AFU) interface communicates with the acceleration logic running on the FPGA. Through this interface, the PSL offers services to the AFU. The services offered are cache-line oriented and allow the AFU to make buffering versus throughput trade-offs. The interface to the AFU is composed of five independent interfaces:

• AFU Command Interface is the interface through which the AFU sends service requests to the PSL.

• AFU Buffer Interface is the interface through which the PSL moves data to and from the AFU.

• PSL Response Interface is the interface through which the PSL reports status about service requests.

• AFU MMIO Interface is the interface through which software reads and writes registers within the AFU.

• AFU Control Interface allows the PSL job management functions to control the state of the AFU.

Together these interfaces allow software to control the AFU state and allow the AFU to access data in the system.

1.1. AFU Command Interface

The AFU command interface provides the AFU logic with the ability to send commands to the PSL. The interface is credit-based; the bus ha_croom informs the AFU of the number of commands the PSL can accept from the AFU. The number of commands allocated to the AFU might change based on job management policies. The interface is synchronous; ah_cvalid must be asserted for only one cycle per command, and the other command descriptor signals must also be valid during that cycle. Each command is assigned a tag by the AFU. This tag is used by the PSL during subsequent phases of the transaction to identify the command. Table 1.1 (AFU Command Interface) lists the signals used to send commands to the PSL.

Note

There are references to PSL internal register mnemonics within this section. These registers are mentioned to provide additional content clarity. These registers are set by system software during initialization or library calls to the AFU. However, the format of these registers is not information required by an AFU designer.
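The credit scheme described above can be modelled in a few lines. The following is a minimal behavioural sketch in Python, not RTL and not part of the specification. The class name `CommandCredits` and its methods are hypothetical, and the way credits are returned (one or more per completed command) is an assumption, since this section does not define that mechanism.

```python
class CommandCredits:
    """Tracks how many commands an AFU may still issue to the PSL."""

    def __init__(self) -> None:
        self.available = 0

    def capture_room(self, ha_croom: int) -> None:
        # ha_croom is an 8-bit bus captured when the AFU is enabled.
        if not 0 <= ha_croom <= 0xFF:
            raise ValueError("ha_croom is an 8-bit value")
        self.available = ha_croom

    def try_issue(self) -> bool:
        # One credit is consumed per command (one cycle of ah_cvalid).
        if self.available == 0:
            return False
        self.available -= 1
        return True

    def return_credits(self, count: int = 1) -> None:
        # Assumption: credits come back as earlier commands complete.
        self.available += count


credits = CommandCredits()
credits.capture_room(2)
issued = [credits.try_issue() for _ in range(3)]  # third attempt is refused
credits.return_credits()
```

With two credits captured, the first two issue attempts succeed and the third is held back until a credit is returned.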

Table 1.1. AFU Command Interface

| Signal Name | Bits | Source | Description |
|---|---|---|---|
| ah_cvalid | 1 | AFU | A valid command is present on the interface. This signal is asserted for a single cycle for each command that is to be accepted. It can be driven for multiple consecutive cycles; that is, different commands can be driven back-to-back, as long as there is an adequate number of credits outstanding. Design recommendation: make this a registered interface to the PSL. |
| ah_ctag | 8 | AFU | AFU-generated ID for the request. This is used as an array address on the AFU Buffer interface and for status notification. |
| ah_ctagpar | 1 | AFU | Odd parity for ah_ctag, ah_paren = ‘1’. |
| ah_com | 13 | AFU | Indicates which command the PSL will execute. Opcodes are defined in Table 1.2 (PSL Command Opcodes Directed at the PSL Cache) and the opcode tables that follow it. |
| ah_compar | 1 | AFU | Odd parity for ah_com, ah_paren = ‘1’. |
| ah_cabt | 3 | AFU | PSL translation ordering behavior. See Table 1.7 (ah_cabt Translation Ordering Behavior). |
| ah_cea | 64 | AFU | Effective byte address for the command. Addresses for “cl” commands must be sent as 128-byte aligned addresses. Addresses for write_ commands must be naturally aligned according to the given ah_csize. |
| ah_ceapar | 1 | AFU | Odd parity for ah_cea, ah_paren = ‘1’. |
| ah_cch | 16 | AFU | Context handle used to augment ah_cea in AFU-directed context mode. Drive to ‘0’ in other modes. |
| ah_csize | 12 | AFU | Number of bytes for partial line commands. Read/write commands require the size to be a power of 2 (1, 2, 4, 8, 16, 32, 64, 128). The ah_csize is binary encoded. |
| ha_croom | 8 | PSL | Number of commands that the PSL is prepared to accept; it must be captured by the AFU when it is enabled on the AFU Control interface. This signal is not meant to be a dynamic count from the PSL to the AFU; it represents the maximum number of commands the PSL can accept from the AFU. |
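Table 1.1 specifies odd parity for ah_ctag, ah_com, and ah_cea. As an illustration only, the parity bit can be computed as below: with odd parity, the bit is chosen so that the data bits plus the parity bit contain an odd number of ones. The function name `odd_parity` and the sample values are hypothetical.

```python
def odd_parity(value: int, width: int) -> int:
    """Return the parity bit that makes the total count of 1 bits odd."""
    if value < 0 or value >> width:
        raise ValueError(f"value does not fit in {width} bits")
    ones = bin(value).count("1")
    # If the data already holds an odd number of ones, the parity bit is 0.
    return 0 if ones % 2 else 1


ah_ctagpar = odd_parity(0x3C, 8)     # tag with four 1 bits
ah_compar = odd_parity(0x0A00, 13)   # Read_cl_na opcode, two 1 bits
ah_ceapar = odd_parity(0x1000, 64)   # address with a single 1 bit
```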

Table 1.2. PSL Command Opcodes Directed at the PSL Cache

| Mnemonic | Opcode | Description |
|---|---|---|
| Read_cl_s | x‘0A50’ | Read a cache line and allocate the cache line in the precise cache in the shared state. This command must be used when there is an expectation of temporal locality. Ah_csize must be 128 bytes, and ah_cea must be 128-byte line aligned. |
| Read_cl_m | x‘0A60’ | Read a cache line and allocate the cache line in the precise cache in the modified state. This command must be used when there is an expectation that data within the line will be written in the near future. Ah_csize must be 128 bytes, and ah_cea must be 128-byte line aligned. |
| Read_cl_lck | x‘0A6B’ | Read a cache line and allocate the cache line in the precise cache in the locked and modified state. This command must be used as part of an atomic read-modify-write sequence. Ah_csize must be 128 bytes, and ah_cea must be 128-byte line aligned. |
| Read_cl_res | x‘0A67’ | Read a cache line and allocate the cache line in the precise cache and acquire a reservation. Ah_csize must be 128 bytes, and ah_cea must be 128-byte line aligned. |
| Read_pe | x‘0A52’ | Read the process element cache line from the context indicated by ah_cch. This command is supported only when the PSL is configured in AFU-directed mode and the PSL_SPAP_An Register is initialized. The format for the process element is specified in Table 1.14 (Process Element Entry Format). |
| touch_i | x‘0240’ | Bring a cache line into the precise cache in the IHPC state without reading data in preparation for a cache line write. Ah_csize must be 128 bytes, and ah_cea must be 128-byte line aligned. IHPC: the owner of the line is the highest point of coherency, but it is holding the line in an I state. |
| touch_s | x‘0250’ | Bring a cache line into the precise cache in the shared state. Ah_csize must be 128 bytes, and ah_cea must be 128-byte line aligned. |
| touch_m | x‘0260’ | Bring a cache line into the precise cache in modified state. Ah_csize must be 128 bytes, and ah_cea must be 128-byte line aligned. |
| Write_mi | x‘0D60’ | Write all or part of a cache line and allocate the cache line in the precise cache in modified state. The line goes invalid if a snoop read hits it. This command must be used when there is an expectation of temporal locality, followed by a use by another processor. Ah_csize must be a power of 2, and ah_cea must be naturally aligned according to size. |
| Write_ms | x‘0D70’ | Write all or part of a cache line and allocate the cache line in the precise cache in modified state. The line goes to a shared state if a snoop read hits it. This command must be used when there is an expectation of temporal locality in a producer-consumer model. Ah_csize must be a power of 2, and ah_cea must be naturally aligned according to size. |
| Write_unlock | x‘0D6B’ | If a lock is present, write all or part of a cache line and clear the line’s lock status back to a modified state. It will fail if the lock is not present. Ah_csize must be a power of 2, and ah_cea must be naturally aligned according to size. |
| Write_c | x‘0D67’ | If a reservation is present, write all or part of a cache line and clear the reservation status. If a reservation is not present, it will fail. Ah_csize must be a power of 2, and ah_cea must be naturally aligned according to size. |
| push_i | x‘0140’ | Attempt to accelerate the subsequent writing of a line, previously written by the AFU or by another processor. Ah_csize must be 128 bytes, and ah_cea must be 128-byte line aligned. This command is a no-op if the line is not modified. |
| push_s | x‘0150’ | Attempt to accelerate the subsequent reading of a line, previously written by the AFU or by another processor. Ah_csize must be 128 bytes, and ah_cea must be 128-byte line aligned. This command is a no-op if the line is not modified. |
| evict_i | x‘1140’ | Force a line out of the precise cache. Modified lines are cast out to system memory. Ah_csize must be 128 bytes, and ah_cea must be 128-byte line aligned. |
| reserved | x‘1260’ | Reserved for future use. |
| lock | x‘016B’ | Request that a cache line be present in the precise cache in a locked and modified state. This command must be used as part of an atomic read-modify-write sequence. Ah_csize must be 128 bytes, and ah_cea must be 128-byte line aligned. |
| unlock | x‘017B’ | Clear the lock state associated with a line. Ah_csize must be 128 bytes, and ah_cea must be 128-byte line aligned. |

Table 1.3. PSL Command Opcodes That Do Not Allocate in the PSL Cache

| Mnemonic | Opcode | Description |
|---|---|---|
| Read_cl_na | 0x0A00 | Read a cache line, but do not allocate the cache line into a cache. This command must be used during streaming operations when there is no expectation that the data will be re-used before it is cast out of the cache. Ah_csize must be 128 bytes, and ah_cea must be 128-byte line aligned. |
| Read_pna | 0x0E00 | Read all or part of a line without allocation. This command must be used for MMIO. Ah_csize must be a power of 2, and ah_cea must be naturally aligned according to size. |
| Write_na | 0x0D00 | Write all or part of a cache line, but do not allocate the cache line into a cache. This command must be used during streaming operations when there is no expectation that the data will be re-used before it is cast out of the cache. Ah_csize must be a power of 2, and ah_cea must be naturally aligned according to size. |
| Write_inj | 0x0D10 | Write all or part of a cache line. Do not allocate the cache line into a cache; attempt to inject the data into the highest point of coherency (HPC). Ah_csize must be a power of 2, and ah_cea must be naturally aligned according to size. |

Table 1.4. PSL Command Opcodes Reserved for Scratch Pad

| Mnemonic | Opcode | Description |
|---|---|---|
| reserved | 0x0A10 | Reserved for future scratchpad support. |
| reserved | 0x0D30 | Reserved for future scratchpad support. |

Table 1.5. PSL Command Opcodes Reserved for LPC Services

| Mnemonic | Opcode | Description |
|---|---|---|
| reserved | 0x0A02 | Reserved for future LPC services support. |
| reserved | 0x0D02 | Reserved for future LPC services support. |


Table 1.6. PSL Command Opcodes for Management

| Mnemonic | Opcode | Description |
|---|---|---|
| reserved | 0x0100 | Reserved |
| reserved | 0x0102 | Reserved for future LPC services support. |
| intreq | 0x0000 | Request interrupt service. See Section 1.1.4, Request for Interrupt Service. |
| restart | 0x0001 | Stop flushing commands after error. Ah_cea is ignored. |

Note

PSL Implementation: New requests that hit in the same ERAT page entry as the request with the translation error response must not continue to be issued until the restart command has received a DONE response.
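The size and alignment rules repeated throughout the opcode tables can be collected into one check. The sketch below is illustrative only: it covers a subset of the opcodes from Tables 1.2, 1.3, and 1.6, and the names `OPCODES`, `PARTIAL_LINE`, `NO_ADDRESS_CHECK`, and `check_command` are hypothetical helpers, not part of the interface.

```python
from typing import List

CACHE_LINE_BYTES = 128

# Subset of opcodes copied from Tables 1.2, 1.3, and 1.6.
OPCODES = {
    "Read_cl_s": 0x0A50, "Read_cl_m": 0x0A60, "Read_cl_na": 0x0A00,
    "Read_pna": 0x0E00, "Write_mi": 0x0D60, "Write_ms": 0x0D70,
    "Write_na": 0x0D00, "Write_inj": 0x0D10, "touch_i": 0x0240,
    "intreq": 0x0000, "restart": 0x0001,
}

# Commands that may cover part of a line: power-of-2 size, natural alignment.
PARTIAL_LINE = {"Read_pna", "Write_mi", "Write_ms", "Write_na", "Write_inj"}
# Commands with no size or alignment rule stated in the tables.
NO_ADDRESS_CHECK = {"intreq", "restart"}


def check_command(mnemonic: str, ah_cea: int, ah_csize: int) -> List[str]:
    """Return a list of rule violations (empty when none are found)."""
    if mnemonic not in OPCODES:
        return [f"unknown mnemonic {mnemonic!r}"]
    if mnemonic in NO_ADDRESS_CHECK:
        return []
    problems = []
    if mnemonic in PARTIAL_LINE:
        power_of_two = ah_csize > 0 and (ah_csize & (ah_csize - 1)) == 0
        if not power_of_two or ah_csize > CACHE_LINE_BYTES:
            problems.append("ah_csize must be a power of 2 from 1 to 128")
        elif ah_cea % ah_csize:
            problems.append("ah_cea must be naturally aligned to ah_csize")
    else:
        if ah_csize != CACHE_LINE_BYTES:
            problems.append("ah_csize must be 128 bytes")
        if ah_cea % CACHE_LINE_BYTES:
            problems.append("ah_cea must be 128-byte line aligned")
    return problems
```

For example, `check_command("Write_na", 0x1004, 8)` reports a natural-alignment problem, while `check_command("Write_na", 0x1008, 8)` returns an empty list.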

1.1.1. Command Ordering

In general, the PSL processes commands in a high-performance order. If a particular ordering is required between two commands, the AFU must submit the first command and wait for its completion before submitting the second command. For example, the AFU might want to write results and then write a flag, indicating to other threads the data is ready. It must submit the result write commands, wait for all of the completion responses, and then submit the flag write. This way, when the other threads read the flag value, they can subsequently correctly read the results.
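The results-then-flag pattern above can be sketched as a small piece of bookkeeping. This is a behavioural illustration, assuming every response passed to `on_response` is a successful completion; the class `OrderedFlagWriter`, its `submit` callback, and the tag values are hypothetical.

```python
class OrderedFlagWriter:
    """Holds back a flag write until every result write has completed."""

    def __init__(self, result_tags, flag_tag, submit):
        self.pending = set(result_tags)
        self.flag_tag = flag_tag
        self.submit = submit
        self.flag_sent = False
        for tag in result_tags:
            submit(tag)               # result writes may complete in any order
        self._maybe_send_flag()       # covers the case of no result writes

    def on_response(self, tag):
        # Assumption: the response for this tag reported completion.
        self.pending.discard(tag)
        self._maybe_send_flag()

    def _maybe_send_flag(self):
        if not self.pending and not self.flag_sent:
            self.submit(self.flag_tag)
            self.flag_sent = True


submitted = []
writer = OrderedFlagWriter([1, 2, 3], flag_tag=9, submit=submitted.append)
writer.on_response(2)
writer.on_response(3)
before_last = list(submitted)   # flag not yet submitted
writer.on_response(1)           # last result completes, flag goes out
```

The flag write (tag 9) is submitted only after the responses for tags 1, 2, and 3 have all arrived, regardless of their arrival order.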

The PSL has multiple stages of execution, each of which can have an impact on the order in which commands are completed.

1.1.1.1. Translation Ordering

Translation ordering is affected by the state of the ah_cabt input to the PSL. This input is an important way to control the behavior and performance of the PSL.

Table 1.7 (ah_cabt Translation Ordering Behavior) lists the translation ordering behavior.

Table 1.7. ah_cabt Translation Ordering Behavior

ah_cabt Mnemonic Description

000 Strict

Translation proceeds in order relative to other ah_cabt = Strict operations. Strict means that effective-to-real address translation (ERAT) misses and protection violations stall subsequent ah_cabt = Strict operations before translation efforts.

This ensures that the order of translation interrupts is the same as the order of command submission, and that loads and stores that follow a translation event have not been executed if the state needs to be saved and restored during the handling of a translation interrupt.

• If translation for the command results in a protection violation or the table walk process fails the command, an interrupt is sent. If the translation interrupt response is CONTINUE, the command receives the PAGED response and all subsequent commands get FLUSHED responses until a restart command is received.

• If the translation interrupt response is Address Error, the command receives the AERROR response and all subsequent commands get FLUSHED responses until a restart command is received.

• If the translation detects an internal error or data error, the command receives the DERROR response and all subsequent commands get FLUSHED responses until a restart command is received.

• If the translation detects an error in the Context, the command will receive the CONTEXT response and all subsequent commands with the same context are captured in a PSL queue until the queue is emptied. The queued commands receive a FLUSHED response. After the queue is emptied, new commands with the same context will be evaluated. A restart command is not required.

PSL Implementation Note: When a protection violation occurs, subsequent commands that hit the same 16 MB page before the translation interrupt response is received are held in a queue and marked as protection violations. Once the translation response is received, the queued commands are processed and provide a PSL response according to their individual CABT mode. Requests with CABT = Abort, Pref, or Spec that are received after the translation response are processed immediately and provide a PSL response according to their individual CABT mode. Requests received with CABT = Strict or Page are added to the queue until the queue is emptied. When the queue is emptied, any Restart command from the AFU is honored. Continuing to send requests with CABT = Strict or Page before the queue is emptied will delay the honoring of the Restart command for that ERAT entry. It is recommended that new requests that hit the ERAT entry are halted until a response is received for the Restart command.

001 Abort

Accesses to different pages proceed in high-performance order. If translation for the command results in a protection violation or the table walk process fails, the command receives the FAULT response and an interrupt is sent. Only this command is terminated.

• If the translation for the command results in a DERROR, only this command is terminated with a FAULT response.

No FLUSHED response is generated.

010 Page

Translation is in order for addresses in the same effective page that maps into a 4 KB, 16 KB, and 16 MB ERAT. Accesses to different pages exit translation in a high-performance order.

If translation for the command results in a protection violation or the table walk process fails the command, an interrupt is sent. If the interrupt response is CONTINUE, the command receives a PAGED response and all subsequent commands that hit this page receive a FLUSHED response until a restart command for an address in the same effective page is received. Commands outside of this effective page are not affected.

• If the translation interrupt response is Address Error, the command receives the AERROR response and all subsequent commands that hit this page get FLUSHED responses until a restart command is received. Commands outside of this effective page are not affected.

• If the translation detects an internal error or Data Error, the command receives the DERROR response and all subsequent commands that hit this page get FLUSHED responses until a restart command is received. Commands outside of this effective page are not affected.

• If the translation detects an error in the Context, the command will receive the CONTEXT response and all subsequent commands with the same context are captured in a PSL queue until the queue is emptied. The queued commands receive a FLUSHED response. After the queue is emptied, new commands with the same context will be evaluated. A restart command is not required.

PSL Implementation Note: When a protection violation occurs, subsequent commands that hit the same 16 MB page before the translation interrupt response is received are held in a queue and marked as protection violations. Once the translation response is received, the queued commands are processed and provide a PSL response according to their individual CABT mode. Requests with CABT = Abort, Pref, or Spec that are received after the translation response are processed immediately and provide a PSL response according to their individual CABT mode. Requests received with CABT = Strict or Page are added to the queue until the queue is emptied. When the queue is emptied, any Restart command from the AFU is honored. Continuing to send requests with CABT = Strict or Page before the queue is emptied will delay the honoring of the Restart command for that ERAT entry. It is recommended that new requests that hit the 16 MB page are halted until a response is received for the Restart command.

011 Pref

Checks if the translation for the address is already available in the ERAT or can be determined with a read of the PTE and/or STE from system memory. If the translation can complete without software assistance, the command completes.

• If translation for the command results in a protection violation or the table walk process fails, the command receives the FAULT response. Only this command will be terminated. No interrupt is generated.

• If the translation for the command results in a DERROR or CONTEXT response, only this command is terminated with a FAULT response.

No FLUSHED response is generated.

111 Spec

Checks if the translation for the address is already available in the ERAT. If it is in the ERAT, the command completes.

• If translation for the command results in a protection violation or an ERAT miss, the command will receive the FAULT response. No new translation is performed. Only this command will be terminated. No interrupt is generated.

• If the translation for the command results in a DERROR or CONTEXT response, only this command is terminated with a FAULT response.

No FLUSHED response is generated.
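The per-mode behavior in Table 1.7 can be summarized as a lookup from (ah_cabt mode, translation outcome) to the response the failing command receives and the scope of any follow-on FLUSHED responses. The sketch below is a reading aid, not a PSL model. The outcome strings, the scope constants, and the function `translation_response` are hypothetical names, and combinations that the table does not describe raise an error rather than guess.

```python
from enum import Enum


class Cabt(Enum):
    STRICT = 0b000
    ABORT = 0b001
    PAGE = 0b010
    PREF = 0b011
    SPEC = 0b111


FLUSH_ALL = "all later commands until restart"
FLUSH_PAGE = "later commands to the same page until restart"
FLUSH_CONTEXT = "queued commands with the same context"

# Outcomes for the ordered modes (Strict and Page). The two "fault_*"
# outcomes are a protection violation or table walk failure, split by
# the translation interrupt response (CONTINUE or Address Error).
_ORDERED = {
    "fault_continue": "PAGED",
    "fault_address_error": "AERROR",
    "data_error": "DERROR",
}


def translation_response(mode: Cabt, outcome: str):
    """Return (response, flush_scope) for a failed translation."""
    if mode in (Cabt.STRICT, Cabt.PAGE):
        if outcome in _ORDERED:
            scope = FLUSH_ALL if mode is Cabt.STRICT else FLUSH_PAGE
            return _ORDERED[outcome], scope
        if outcome == "context_error":
            return "CONTEXT", FLUSH_CONTEXT
    else:
        # Abort, Pref, Spec: only the failing command is terminated.
        allowed = {"translation_fault", "data_error"}
        if mode in (Cabt.PREF, Cabt.SPEC):
            allowed.add("context_error")
        if outcome in allowed:
            return "FAULT", None
    raise ValueError(f"Table 1.7 does not describe {mode.name} with {outcome!r}")
```

For instance, `translation_response(Cabt.PAGE, "fault_continue")` yields the PAGED response with flushing limited to the same page, whereas any described failure in Spec mode yields FAULT with no flushing.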

1.1.1.2. Strict Address Ordering Pages

AFU designs might need to delay accesses until prior accesses are completed, if they need to interoperate with POWER applications with pages in strict address ordering (SAO) mode. PSL operation ordering is affected by accesses to pages with WIMG = SAO.

1.1.1.3. Execution Ordering

After commands have proceeded past address translation, the PSL orders only on a cache-line address basis. Commands to an address are performed after earlier commands to that address and before later commands to that address. Order between commands involving different addresses is unpredictable.
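A short sketch of what "cache-line address basis" implies for an AFU designer, assuming the 128-byte line size used elsewhere in this section. The helper names `line_address` and `order_is_guaranteed` are hypothetical.

```python
CACHE_LINE_BYTES = 128


def line_address(ea: int) -> int:
    """Clear the byte-offset bits to get the 128-byte line address."""
    return ea & ~(CACHE_LINE_BYTES - 1)


def order_is_guaranteed(ea_a: int, ea_b: int) -> bool:
    # After translation, only commands to the same line stay in order.
    return line_address(ea_a) == line_address(ea_b)
```

Under this reading, commands to 0x1000 and 0x107F fall in the same line and keep their submission order, while commands to 0x1000 and 0x1080 may complete in either order.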

1.1.2. Reservation

The operations read_cl_res and write_c manipulate the reservation. There is one reservation for the AFU. This reservation can be active on an address or inactive. Read_cl_res reads an address and acquires the reservation, after which the reservation is active on the address of the read. While the reservation is active, the PSL snoops for writes performed to the address. If the PSL detects a write to the address by another processor, it deactivates the reservation. Write_c inspects the state of the reservation during execution. If the reservation is active on the write_c line address, write_c will write data to the line, deactivate the reservation, and return DONE. If the reservation is active on a different address, write_c deactivates the reservation and returns NRES. If the reservation is not active, write_c returns NRES.

Note

While it is not an error to submit multiple read_cl_res and write_c commands to different line addresses, the order they execute in is not defined and therefore, the state of the reservation is unpredictable.
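The reservation rules above form a small state machine with a single reservation per AFU. The following sketch models those rules only; the class `Reservation` and its method names are hypothetical, and data movement is left out.

```python
class Reservation:
    """Single AFU reservation: inactive, or active on one line address."""

    def __init__(self) -> None:
        self.address = None          # None means the reservation is inactive

    def read_cl_res(self, line: int) -> None:
        # Reading with reservation makes it active on that line.
        self.address = line

    def snoop_write(self, line: int) -> None:
        # A write by another processor to the reserved line deactivates it.
        if self.address == line:
            self.address = None

    def write_c(self, line: int) -> str:
        # write_c always leaves the reservation inactive.
        hit = self.address == line
        self.address = None
        return "DONE" if hit else "NRES"
```

A `read_cl_res` followed by `write_c` to the same line returns DONE; if another processor writes the line in between, or the `write_c` targets a different line, the result is NRES.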

1.1.3. Locks

Cache lines can be locked, and while they are locked no other read or write access is permitted by any other processor in the system. This capability allows an AFU to implement complex atomic operations on shared memory.


Lock requests are made with either the read_cl_lck or the lock command. If the PSL grants the lock, it responds with DONE. If the PSL declines the lock request, it responds with NLOCK. The PSL can decline a lock request based on configuration, available resources, and cache state. After the lock is in effect, it remains in effect until a subsequent write_unlock or unlock request.

Locks cannot be held indefinitely. The PSL automatically unlocks lines after a certain amount of time to allow the system to make forward progress. A write_unlock or unlock command returns NLOCK if it is attempted when an address is not locked.

An AFU holding a lock is required to release its lock and wait for the write_unlock or unlock command to complete before it can proceed with commands to other addresses. While a lock is active, commands to other addresses can be terminated with the response NLOCK.

1.1.4. Request for Interrupt Service

The intreq command is used to generate an interrupt request to the system. Address bits [53:63] indicate the source of the interrupt. Only values 1 - 2043 are supported. A second interrupt request using the same source must not be generated to the system until the first request has been serviced. The PSL generates a PSL response DONE when the interrupt request has been presented to the upstream logic. The response provides no indication of interrupt service. The PSL generates a PSL response FAILED, if an invalid source number is used as defined in PSL_IVTE_LIMIT_An.

Note

PSL_IVTE_LIMIT_An is a PSL internal register mnemonic. Refer to the CAIA specification (actual URL Link needed) for more information on this and other registers.
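
As a rough illustration of the source field described for intreq, the sketch below extracts the interrupt source from an address and checks it against the stated range. It assumes the big-endian bit numbering used elsewhere in this document (bit 0 is the most-significant bit of a 64-bit value, so bits [53:63] are the 11 low-order bits). The function names are hypothetical, and the check does not model the configured limit in PSL_IVTE_LIMIT_An.

```python
SOURCE_BITS = 11                       # bits 53 through 63 inclusive
SOURCE_MASK = (1 << SOURCE_BITS) - 1   # assumption: bit 63 is the least-significant bit


def interrupt_source(address: int) -> int:
    """Return the interrupt source carried in address bits [53:63]."""
    return address & SOURCE_MASK


def is_supported_source(source: int) -> bool:
    """Only values 1 through 2043 are stated as supported."""
    return 1 <= source <= 2043
```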

1.1.5. Parity Handling for the Command Interface

Parity inputs are provided for important fields in the command interface. The command, tag, and address are protected by odd parity. Bad parity on any of these buses causes the PSL to return the error status for the command. All parity signals on the command interface are valid in the same cycle as ah_cvalid.
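
For readers unfamiliar with odd parity, the following minimal sketch shows how a parity bit for a bus value could be generated and checked. The function names are illustrative only; with odd parity, the data bits plus the parity bit always contain an odd number of ones.

```python
def odd_parity(value: int) -> int:
    """Return the parity bit that makes the total count of 1 bits odd."""
    ones = bin(value).count("1")
    return 0 if ones % 2 == 1 else 1


def parity_ok(value: int, parity_bit: int) -> bool:
    """Check a received value against its accompanying odd-parity bit."""
    return odd_parity(value) == parity_bit
```

A value of zero therefore carries a parity bit of 1, so an all-zero bus with a zero parity bit is detected as an error.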

1.2. AFU Buffer Interface

Data is moved between the PSL and the AFU through the buffer interfaces. When a command is given to the PSL, it assumes that it can read or write data to the AFU with the ah_ctag contained in the command. Data is read or written before the command is completed, and it can be read or written more than once before the command is completed. There are two buffer interfaces present, one for reading during a write operation and one for writing during a read operation. Each read/write moves half of a line of data (64 bytes). Requests can arrive at any time on either interface. Each interface is synchronous, pipelined, and non-blocking. Read requests are serviced, after a small fixed delay given by ah_brlat, in a pipelined fashion in the order that they are received, so that data can be directly sent to the PCIe write stream without PSL buffering.

Table 1.8. AFU Buffer Interface

Signal Name Bits Source Description

ha_brvalid 1 PSL This signal is asserted for a single cycle, when a valid read data transfer is present on the interface. The ha_br* signals are valid during the cycle ha_brvalid is asserted.


The buffer read interface is used for AFU write requests, and the buffer write interface is used for AFU read requests.

• This signal can be on for multiple cycles, indicating that data is being returned on back-to-back cycles.

ha_brtag 8 PSL AFU generated ID for the AFU write request.

ha_brtagpar 1 PSL Odd parity for ha_brtag valid with ha_brvalid.

ha_brad 6 PSL Half-line index of read data within the transaction.

Cache lines are 128 bytes so that only the LSB is modulated.

ah_brlat 4 AFU Read buffer latency. This bus is a static indicator of the access latency of the read buffer. It must not change while there are commands that have been submitted on the command interface that have not been acknowledged on the response interface.

It is sampled continuously. However, after a reset, the PSL assumes this is a constant and that it is static for any particular AFU.

1 Data is ready the second cycle after ha_brvalid is asserted.

3 Data is ready the fourth cycle after ha_brvalid is asserted.

ah_brdata 512 AFU Read data.

ah_brpar 8 AFU Odd parity for each 64-bit doubleword of read data. ah_brpar must be provided on the same cycle as ah_brdata. A parity check fail results in a DERROR response and SUE data written.

ha_bwvalid 1 PSL This signal is asserted for a single cycle when a valid write data transfer is present on the interface. The ha_bw* signals (except for ha_bwpar) are valid during the cycle that ha_bwvalid is asserted.

• This signal can be on for multiple cycles indicating that data is being driven on back-to-back cycles.

ha_bwtag 8 PSL AFU generated ID for the read request.

ha_bwtagpar 1 PSL Odd parity for ha_bwtag valid with ha_bwvalid.

ha_bwad 6 PSL Half-line index of write data within the transaction.

Cache lines are 128 bytes, so that only the LSB is modulated.

ha_bwdata 512 PSL Data to be written.

ha_bwpar 8 PSL Odd parity for each 64-bit doubleword of ha_bwdata. ha_bwpar is presented to the AFU one PSL cycle after ha_bwdata.
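
The read-buffer timing and half-line addressing in Table 1.8 can be sketched as follows. This is a hedged illustration: the function names are hypothetical, cycles are counted as plain integers, and only the two ah_brlat values described in the table (1 and 3) are accepted, because the timing for other values is not given here.

```python
HALF_LINE_BYTES = 64  # each buffer transfer moves half of a 128-byte line


def half_line_offset(brad: int) -> int:
    """Byte offset within the cache line selected by the LSB of ha_brad."""
    return (brad & 1) * HALF_LINE_BYTES


def read_data_cycle(brvalid_cycle: int, brlat: int) -> int:
    """Cycle on which read data is expected for a request seen on brvalid_cycle."""
    if brlat not in (1, 3):
        raise ValueError("only ah_brlat values 1 and 3 are described in Table 1.8")
    # ah_brlat = 1 means the second cycle after ha_brvalid; 3 means the fourth.
    return brvalid_cycle + brlat + 1
```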

1.3. PSL Response Interface

The PSL uses the response interface to indicate the completion status of each command and to manage the command flow control credits. Each command completion can return credits back to the AFU, so that further commands can be sent.

Table 1.9. PSL Response Interface

Signal Name Bits Source Description

ha_rvalid 1 PSL This signal is asserted for a single cycle when a valid response is present on the interface. The ha_r* signals are valid during the cycle that ha_rvalid is asserted.

• This signal can be on for multiple cycles indicating that the responses are being returned back to back.

ha_rtag 8 PSL AFU generated ID for the request.

ha_rtagpar 1 PSL Odd parity for ha_rtag valid with ha_rvalid.

ha_response 8 PSL Response code. See PSL Response Codes.


ha_rcredits 9 PSL Two's complement number of credits returned.

ha_rcachestate 2 PSL Reserved.

ha_rcachepos 13 PSL Reserved.
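
Because ha_rcredits is a 9-bit two's complement value, an AFU model has to sign-extend it before adjusting its credit count. The sketch below shows one way to do that; the function name is an assumption made for illustration.

```python
CREDIT_BITS = 9  # width of ha_rcredits in Table 1.9


def decode_rcredits(raw: int) -> int:
    """Sign-extend the 9-bit two's complement ha_rcredits value."""
    raw &= (1 << CREDIT_BITS) - 1
    if raw & (1 << (CREDIT_BITS - 1)):   # sign bit set: negative value
        return raw - (1 << CREDIT_BITS)
    return raw
```

An AFU-side credit counter would then simply add the decoded value to its available credits each time ha_rvalid is asserted.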

Table 1.10. PSL Response Codes

Mnemonic Code Description

DONE 0x00 Command is complete. Any and all data requests have been made for the request to/from the buffer interface. Data movement between the AFU and the PSL for these requests is complete.

AERROR 0x01 Command has resulted in an address translation error. All further commands are flushed until a restart command is accepted on the command interface.

DERROR 0x03 Command has resulted in a data error. All further commands are flushed until a restart command is accepted on the command interface.

NLOCK 0x04 Command requires a lock status that is not present. Command issued is unrelated to an outstanding lock.

NRES 0x05 Command requires a reservation that is not present.

FLUSHED 0x06 Command follows a command that failed and is flushed. See ah_cabt Translation Ordering Behavior for additional information.

FAULT 0x07 Command address could not be quickly translated. Interrupt has been sent to the operating system or hypervisor for ah_cabt mode ABORT. The command has been terminated.

FAILED 0x08 Command could not be completed because:

• An interrupt service request that receives this response contained an invalid source number.

• Parity error detected on command request; therefore, the command was ignored.

• Command issued that is not supported in the configured PSL_SCNTL_An[PSL Model Type].

PAGED 0x0A Command address could not be translated. The operating system has requested that the AFU continue. The command has been terminated. All further commands are flushed until a restart command is accepted on the command interface.

Context 0x0B The process element addressed by the command context handle is not valid.
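
The response codes in Table 1.10 map naturally onto an enumeration. The sketch below is illustrative only: the names `PslResponse`, `RESTART_REQUIRED`, and `needs_restart` are hypothetical, and the set of codes that require a restart is taken from the descriptions that say further commands are flushed until a restart command is accepted.

```python
from enum import IntEnum


class PslResponse(IntEnum):
    """Response codes as listed in Table 1.10."""
    DONE = 0x00
    AERROR = 0x01
    DERROR = 0x03
    NLOCK = 0x04
    NRES = 0x05
    FLUSHED = 0x06
    FAULT = 0x07
    FAILED = 0x08
    PAGED = 0x0A
    CONTEXT = 0x0B


# Codes whose description states that further commands are flushed
# until a restart command is accepted on the command interface.
RESTART_REQUIRED = frozenset(
    {PslResponse.AERROR, PslResponse.DERROR, PslResponse.PAGED}
)


def needs_restart(code: int) -> bool:
    """True if the AFU must issue a restart before new commands proceed."""
    return PslResponse(code) in RESTART_REQUIRED
```

Passing a value that is not in the table, such as 0x02, raises ValueError from the enumeration constructor, which makes unexpected codes visible in a simulation model.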

1.3.1. Command/Response Flow

Figure 1.1, PSL Command/Response Flow, illustrates the PSL command and response flow.


Figure 1.1. PSL Command/Response Flow

1.4. AFU MMIO Interface

The MMIO interface can be used to read and write MMIO registers and AFU descriptor space registers inside the AFU. The PSL is the command master. It performs a single read or write and waits for an acknowledgment before beginning another MMIO. MMIO requests that are not acknowledged cause an application hang to be detected and an error condition to be reported.


Note

MMIO interface requests to valid registers in the AFU must complete with no dependencies on the completion of any other command.

An MMIO request is sent to the AFU only when the AFU is enabled as indicated by the AFU_CNTL_An[ES] field. Otherwise, an error condition is reported. Note that the MMIO address contains a word (4-byte) address; therefore, the last 2 bits of the true address are dropped at the interface. For an address of 0x300_1080, ha_mmad equals 0xC0_0420.
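
The byte-to-word address conversion can be checked with a short calculation. In this sketch the function names are hypothetical; the conversion simply drops the two low-order bits, so the byte address 0x300_1080 becomes the word address 0xC0_0420. The second helper reflects the rule that doubleword accesses use an even word address.

```python
def mmio_word_address(byte_address: int) -> int:
    """Convert a byte address to the word (4-byte) address carried on ha_mmad."""
    return byte_address >> 2


def is_doubleword_aligned(word_address: int) -> bool:
    """Doubleword (64-bit) accesses use an even word address."""
    return word_address % 2 == 0
```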

1.4.1. AFU Descriptor

An AFU is required to have an AFU descriptor for System Software to recognize it. The descriptor format is described in the CAIA Specification.

Table 1.11. AFU MMIO Interface

Signal Name Bits Source Description

ha_mmval 1 PSL This signal is asserted for a single cycle when an MMIO transfer is present on the interface. The ha_mm* signals are valid during the cycle that ha_mmval is asserted.

ha_mmcfg 1 PSL The MMIO represents an AFU descriptor space access.

ha_mmrnw 1 PSL 0 Write

1 Read

ha_mmdw 1 PSL 0 Word (32 bits)

1 Doubleword (64 bits)

ha_mmad 24 PSL MMIO word address. For doubleword access, the address is even.

ha_mmadpar 1 PSL Odd parity for ha_mmad valid with ha_mmval.

ha_mmdata 64 PSL Write data. For word writes, data is replicated onto both halves of the bus.

ha_mmdatapar 1 PSL Odd parity for ha_mmdata valid with ha_mmval and ha_mmrnw equal to ‘0’.

Not valid during an MMIO read (ha_mmrnw = 1).

ah_mmack 1 AFU This signal must be asserted for a single cycle to acknowledge that the write is complete or the read data is valid.

ah_mmdata 64 AFU Read data. For word reads, data must be supplied on both halves of the bus.

ah_mmdatapar 1 AFU Odd parity for ah_mmdata, valid with ah_mmack.
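
Table 1.11 states that word data appears on both halves of the 64-bit data bus. A minimal sketch of that replication, using a hypothetical helper name, is:

```python
def replicate_word(word: int) -> int:
    """Place a 32-bit word on both halves of a 64-bit MMIO data bus."""
    word &= 0xFFFF_FFFF
    return (word << 32) | word
```

An AFU model could use the same helper when it supplies ah_mmdata for a word read, since the table requires the data on both halves of the bus in that case as well.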

1.5. AFU Control Interface

The AFU control interface is used to control the state of the AFU and sense change in the state of the AFU as execution ends on the process element. This interface is also used for timebase requests and responses. When an AFU-directed context mode is enabled, the interface indicates updates to the process elements in the scheduled process area. The interface is a synchronous interface. Ha_jval is valid for only one cycle per command, and the other command descriptor signals are also valid during that cycle. Table 1.12, AFU Control Interface, shows the signals used for the AFU control interface.


Table 1.12. AFU Control Interface

Signal Name Bits Source Description

ha_jval 1 PSL This signal is asserted for a single cycle when a valid job control command is present. The ha_j* signals are valid during this cycle.

ha_jcom 8 PSL Job control command opcode. See PSL Control Commands on ha_jcom.

ha_jcompar 1 PSL Odd parity for ha_jcom valid with ha_jval.

ha_jea 64 PSL This is the WED or timebase information or llcmd information.

• Timebase is currently not supported.

ha_jeapar 1 PSL Odd parity for ha_jea valid with ha_jval.

ah_jrunning 1 AFU AFU is running. This signal should transition to a ‘1’ after a start command is recognized. It must be negated when the job is complete, in error, or a reset command is recognized.

ah_jdone 1 AFU Assert for a single cycle to acknowledge a reset command or when the AFU is finished. The ah_jerror signal is valid when ah_jdone is asserted.

ah_jcack 1 AFU Assert for a single cycle to acknowledge completion of processes associated with an LLCMD notification.

In dedicated-process mode, drive to ‘0’.

ah_jerror 64 AFU AFU error code. A ‘0’ means success. If nonzero, the information is captured in the AFU_ERR_An Register and PSL_DSISR_An[AE] is set, causing an interrupt.

ah_jyield 1 AFU Reserved, drive to ‘0’.

ah_tbreq 1 AFU Single cycle pulse to request that the PSL send a timebase control command with the current timebase value.

ah_paren 1 AFU If asserted, the AFU supports parity generation on various interface buses. The parity is checked by the PSL.

ha_pclock 1 PSL All AFU interfaces are synchronous to the rising edge of this 250 MHz clock.

Table 1.13. PSL Control Commands on ha_jcom

Mnemonic Code Description

Start 0x90 Job execution in all modes. Begin running a new context. ha_jea contains the work element descriptor in dedicated-process mode and shared mode.

Reset 0x80 Job execution in all modes. Force into a clean state, erasing all of the state from the previous context.

reserved 0x16 Reserved for future virtualized time-slice mode.

reserved 0x0C Reserved for future virtualized time-slice mode.

reserved 0x60 Reserved for future virtualized time-slice mode.

reserved 0x41 Reserved for future LPC operations.

Timebase 0x42 Send requested 64-bit timebase value to the AFU on the ha_jea bus.

LLCMD 0x45 Job execution in AFU-directed mode. See AFU Control Interface for LLCMD Operations.
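
The following sketch combines the command codes of Table 1.13 with the ah_jrunning and ah_jdone behavior described in Table 1.12. It is a simplified model under stated assumptions: the class and method names are hypothetical, reserved codes are simply ignored (the specification does not say how an AFU should treat them), and Timebase and LLCMD handling is left out.

```python
from enum import IntEnum


class JobCommand(IntEnum):
    """Non-reserved command codes from Table 1.13."""
    START = 0x90
    RESET = 0x80
    TIMEBASE = 0x42
    LLCMD = 0x45


RESERVED_CODES = frozenset({0x16, 0x0C, 0x60, 0x41})


class JobState:
    """Hypothetical AFU-side model of ah_jrunning and ah_jdone."""

    def __init__(self) -> None:
        self.running = False  # models ah_jrunning

    def on_command(self, jcom: int) -> bool:
        """Apply a command seen with ha_jval; return True if ah_jdone pulses."""
        if jcom in RESERVED_CODES:
            return False  # assumption: reserved codes are ignored in this model
        command = JobCommand(jcom)  # raises ValueError for codes not in the table
        if command is JobCommand.START:
            self.running = True
        elif command is JobCommand.RESET:
            self.running = False
            return True  # ah_jdone acknowledges the reset
        return False

    def finish(self) -> bool:
        """The job completes: negate ah_jrunning and pulse ah_jdone."""
        self.running = False
        return True
```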

1.5.1. AFU Control Interface in the Non-Shared Mode

In a non-shared mode, the hypervisor must always reset and enable the AFU through the AFU_CNTL_An Register as shown in Figure 1.2, PSL AFU Control Interface Flow in Non-Shared Mode. While the AFU is enabled, the following functions are possible:

• Requests can be submitted to the PSL through the command interface.

• MMIO requests can be passed from the PSL to the AFU and must be acknowledged.


• Timebase values can be passed to the AFU.

• Process element update commands (LLCMD) can be passed to the AFU (if operating in AFU-directed mode).

When a PSL slice is initialized for dedicated-process mode, the PSL fetches the process element from system memory if the address specified in PSL_SPAP_An is valid when AFU_CNTL_An[Enable] is set to ‘1’. If the PSL_SPAP_An address is not valid, the PSL assumes that the process element registers have been initialized by software already, so the start command is immediately sent to the AFU. The 64-bit ha_jea indicates the value of the work element descriptor.

When a PSL slice is initialized for AFU-directed mode, the PSL assumes the process element registers are read from system memory as needed. The PSL sends the ‘start’ command to the AFU when AFU_CNTL_An[Enable] is set to ‘1’. The value in the JEA is undefined for the ‘start’ command in this mode.

Figure 1.2. PSL AFU Control Interface Flow in Non-Shared Mode


1.5.2. AFU Control Interface for Timebase

The AFU requests the latest timebase information by asserting ah_tbreq on the AFU control interface for one cycle. Only one request can be issued at a time. The PSL returns the timebase information by asserting ha_jval = ‘1’, ha_jcom = Timebase, and ha_jea = timebase value (0:63).

1.5.3. AFU Control Interface for LLCMD Operations

The ha_jcom = LLCMD command notifies the AFU of a change to a process element in the AFU-directed mode. The ha_jea bus carries the notification information. The format for ha_jea can be found in PSL Linked List Command Register (PSL_LLCMD_An). Only one notification is presented at a time. The AFU must acknowledge this notification by asserting ah_jcack for one cycle when it has completed processing the notification. Only after an acknowledgment will the PSL be able to deliver another notification.

1.5.3.1. PSL Linked List Command Register (PSL_LLCMD_An)

The PSL Linked List Command Register (PSL_LLCMD_An) is for the management of the process elements in the scheduled processes area.

When the AFU is operating in a shared or AFU-directed programming model, the PSL fetches process elements from the linked list pointed to by this register, if valid. This register is used by the hypervisor to manage the linked list.

If multiple PSLs are using the same scheduled processes area, privileged software should only issue commands to the first PSL. All other PSLs will receive commands through shared memory.

This facility is optional for CAIA-compliant devices that do not support virtualization. System software can detect if this feature is supported by writing x‘000000000000FFFF’ to this register and reading back the contents. If a value of zero is returned, this feature is not supported. Any nonzero value indicates that the feature is supported.

There is one register for each PSL slice. Access to these registers should be privileged and must be written using a single 64-bit store operation.

Access Type Read/Write

Base Address Offset (P1_Base | P1(n)) + x'90'; where n is an AFU number.

Register layout:

Bits 0:15 Command, bits 16:31 Reserved

Bits 32:47 Reserved, bits 48:63 PE_Handle

Bits Field Name Description

0:15 Command Command.

x‘0000’ No command.

x‘0001’ terminate_element: Terminate process element at the link provided.

x‘0002’ remove_element: Remove the process element at the link provided.

x‘0003’ suspend_element: Stop executing the process element at the link provided.

x‘0004’ resume_element: Resume executing the process element at the link provided.


x‘0005’ add_element: Software is adding a process element at the link provided.

x‘0006’ update_element: Software is updating the process element state at the link provided.

All other values are reserved.

16:47 Reserved Reserved.

48:63 PE_Handle Process element handle.

The process element handle, shifted right by 7 bits, is the offset from the SPA_Base of the process element to operate on.
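
A 64-bit LLCMD notification received on ha_jea could be split into its fields as sketched below. The sketch assumes the big-endian bit numbering of the register description (bit 0 is the most-significant bit), so Command occupies the top 16 bits and PE_Handle the bottom 16 bits. The dictionary and function names are hypothetical, and the offset calculation from SPA_Base is deliberately not modeled.

```python
LLCMD_NAMES = {
    0x0000: "no_command",
    0x0001: "terminate_element",
    0x0002: "remove_element",
    0x0003: "suspend_element",
    0x0004: "resume_element",
    0x0005: "add_element",
    0x0006: "update_element",
}


def decode_llcmd(jea: int):
    """Split a 64-bit PSL_LLCMD_An value into (command name, PE handle)."""
    command = (jea >> 48) & 0xFFFF  # bits 0:15, assuming bit 0 is the MSB
    pe_handle = jea & 0xFFFF        # bits 48:63
    return LLCMD_NAMES.get(command, "reserved"), pe_handle
```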

1.5.4. Process Element Entry

Each process element entry is 128 bytes in length. Table 1.14, Process Element Entry Format, shows the format of each process element. The shaded fields in Process Element Entry Format correspond to privileged 1 registers, and the fields not shaded correspond to privileged 2 registers. The Software State field is an exception and does not have corresponding privileged 1 or privileged 2 registers.

Table 1.14. Process Element Entry Format

Process Element Entry

Word

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31

0 State Register (0:31)

1 State Register (32:63)

2 E P SPOffset (most significant bits)

3 SPOffset (least significant bits) Reserved SPSIZE

4 R HTABORG (most significant bits)

5 HTABORG (least significant bits) Reserved HTABSIZE

6 R HAURP Physical Address (most significant bits)

7 HAURP Physical Address (least significant bits) Reserved V

8 Reserved Idle_Time Reserved Context_Time

9 IVTE_Offset_0 IVTE_Offset_1

10 IVTE_Offset_2 IVTE_Offset_3

11 IVTE_Range_0 IVTE_Range_1

12 IVTE_Range_2 IVTE_Range_3

13 LPID

14 TID

15 PID

16 CSRP Effective Address (most significant bits)

17 CSRP Effective Address (least significant bits) Limit

18 B Ks Kp N L C Rsv LP Reserved

19 Reserved AURP Virtual Address (most significant bits)

20 AURP Virtual Address

21 AURP Virtual Address (least significant bits) Reserved V

22 B Ks Kp N L C Rsv LP Reserved SegTableSize

23 Reserved SSTP Virtual Address (most significant bits)

24 SSTP Virtual Address


25 SSTP Virtual address (least significant bits) Reserved V

26 Authority Mask (most significant bits)

27 Authority Mask (least significant bits)

28 Reserved

29 Work Element Descriptor (WED word 0)

30 Work Element Descriptor (WED word 1)

31 Software State
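
Since each entry is 32 words of 32 bits, a field can be located by its word number in Table 1.14. The sketch below pulls out the work element descriptor (words 29 and 30) and the Software State word (word 31) from a 128-byte entry. It assumes that the entry is stored big-endian and that WED word 0 holds the most-significant half; both are assumptions made for illustration, and the function names are hypothetical.

```python
import struct

PE_ENTRY_BYTES = 128  # each process element entry is 128 bytes (32 words)


def unpack_words(entry: bytes) -> tuple:
    """Return the 32 words of an entry, assuming big-endian storage."""
    if len(entry) != PE_ENTRY_BYTES:
        raise ValueError("a process element entry must be exactly 128 bytes")
    return struct.unpack(">32I", entry)


def work_element_descriptor(entry: bytes) -> int:
    """Combine WED word 0 (word 29) and WED word 1 (word 30) into 64 bits."""
    words = unpack_words(entry)
    return (words[29] << 32) | words[30]  # assumption: word 29 is the upper half


def software_state(entry: bytes) -> int:
    """Return the Software State field held in word 31."""
    return unpack_words(entry)[31]
```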


2. Timing Diagram Examples

Figure 2.1. Control Interface, Reset

Figure 2.2. Control Interface, Start

Figure 2.3. Command Interface, Read_cl_na

Figure 2.4. Buffer Interface, Write of buffer from Read_cl_na


Figure 2.5. Response Interface, Read_cl_na complete


3. Conformance to this Specification

The following lists a set of numbered conformance clauses to which any implementation of this specification must adhere in order to claim conformance to this specification (or any optional portion thereof): All interface signals between the PSL and AFU are required to be implemented even if they are driven to a constant value. This document is a first attempt at identifying which items are required to be supported by the PSL and which items are required by the AFU.

3.1. AFU Command Interface

3.1.1. Interface Signals

3.1.1.1. PSL

PSL supports all signals driven functionally as defined.

3.1.1.2. AFU

For the AFU, driving the parity signals to correct parity is optional. AFU support of parity is signaled to the PSL via the control signal ah_paren.

ah_cabt - translation modes - it is optional which translation modes the AFU supports.

ah_cch - Only required in AFU-directed context mode, drive to 0's in other modes.

3.1.2. Command Opcodes

3.1.2.1. PSL

PSL supports all command opcodes.

3.1.2.2. AFU

All opcodes are optional for the AFU.

3.2. AFU Buffer Interface

3.2.1. Interface Signals

3.2.1.1. PSL

PSL supports all signals driven functionally as defined.

ah_brlat: PSL required to support values of 1 and 3, all other values optional.

3.2.1.2. AFU

For the AFU, driving the parity signals to correct parity is optional. AFU support of parity is signaled to the PSL via the control signal ah_paren.


ah_brlat: PSL only required to support values of 1 and 3. AFU required to support one of these values.

3.3. PSL Response Interface

3.3.1. Interface Signals

3.3.1.1. PSL

PSL supports all signals driven functionally except ha_rcachestate and ha_rcachepos (which are marked reserved).

3.3.1.2. AFU

ha_rcachestate and ha_rcachepos (marked reserved from the PSL) can be terminated.

3.3.2. Response Codes

3.3.2.1. PSL

The PSL is required to support sending all response codes.

3.3.2.2. AFU

The AFU is required to support receiving all response codes except the Context response. Support of receiving a response of Context is only required if the AFU supports the AFU-directed programming model. Responses of NLOCK and NRES are only possible if the AFU sends Lock or Reservation types of commands.

3.4. AFU MMIO Interface

3.4.1. Interface Signals

3.4.1.1. PSL

The PSL is required to support all signals.

3.4.1.2. AFU

The AFU is required to have an AFU descriptor space accessed via ha_mmval along with ha_mmcfg. It is optional for an AFU to have MMIO space (which is reported in the AFU descriptor), but it is suggested so that errors and status can be logged for the application. For the AFU, driving the parity signals to correct parity is optional. AFU support of parity is signaled to the PSL via the control signal ah_paren.


3.5. AFU Control Interface

3.5.1. Interface Signals

3.5.1.1. PSL

The PSL is required to support all signals.

3.5.1.2. AFU

The optional signals are:

ah_tbreq: Can be driven to '0' if timebase is not required.

ah_jyield: Reserved, drive to '0'.

ah_jcack: Only used if the AFU supports AFU-directed mode. Driven to '0' in dedicated-process mode.

3.5.2. Control Commands

3.5.2.1. PSL

The PSL is required to support all signals.

3.5.2.2. AFU

The AFU will not see a Timebase command if it doesn't support sending a timebase request by asserting ah_tbreq.

The AFU will not see an LLCMD command code unless it supports AFU-directed mode.


Glossary

ACK
Acknowledgment. A transmission that is sent as an affirmative response to a data transmission.

AFU
Accelerator functional unit.

ALUT
Adaptive lookup table.

AMOR
Authority Mask Override Register.

AMR
Authority Mask Register.

architecture
A detailed specification of requirements for a processor or computer system. It does not specify details of how the processor or computer system must be implemented; instead it provides a template for a family of compatible implementations.

AURP
Accelerator Utilization Record Pointer.

Big endian
A byte-ordering method in memory where the address n of a word corresponds to the most-significant byte. In an addressed memory word, the bytes are ordered (left to right) 0, 1, 2, 3, with 0 being the most-significant byte. See little endian.

Cache
High-speed memory close to a processor. A cache usually contains recently accessed data or instructions, but certain cache-control instructions can lock, evict, or otherwise modify the caching of data or instructions.

Caching inhibited
A memory update policy in which the cache is bypassed, and the load or store is performed to or from system memory. A page of storage is considered caching inhibited when the "I" bit has a value of "1" in the page table. Data located in caching inhibited pages cannot be cached at any memory hierarchy that is not visible to all processors and devices in the system. Stores must update the memory hierarchy to a level that is visible to all processors and devices in the system.

CAIA
Coherent Accelerator Interface Architecture. Defines an architecture for loosely coupled coherent accelerators. The Coherent Accelerator Interface Architecture provides a basis for the development of accelerators coherently connected to a POWER processor.

CAPI
Coherent Accelerator Processor Interface.

CAPP
Coherent Attached Processor Proxy.

Coherence
Refers to memory and cache coherence. The correct ordering of stores to a memory address, and the enforcement of any required cache write-backs during accesses to that memory address. Cache coherence is implemented by a hardware snoop (or inquire) method, which compares the memory addresses of a load request with all cached copies of the data at that address. If a cache contains a modified copy of the requested data, the modified data is written back to memory before the pending load request is serviced.

CSRP
Context Save/Restore Area Pointer.

DLL
Delay locked loop.

DMA
Direct memory access. A technique for using a special-purpose controller to generate the source and destination addresses for a memory or I/O transfer.

DSISR
Data Storage Interrupt Status Register.

DSP
Digital signal processor.

EAH
PSL effective address high.

EAL
PSL effective address low.

EA
Effective address. An address generated or used by a program to reference memory. A memory-management unit translates an effective address to a virtual address, which it then translates to a real address (RA) that accesses real (physical) memory. The maximum size of the effective-address space is 2^64 bytes.

ELF
Executable and linkable format.

ERAT
Effective-to-real-address translation, or a buffer or table that contains such translations, or a table entry that contains such a translation.

Exception
An error, unusual condition, or external signal that can alter a status bit and causes a corresponding interrupt, if the interrupt is enabled. See interrupt.

Fetch
Retrieving instructions from either the cache or system memory and placing them into the instruction queue.

FPGA
Field-programmable gate array.

HAURP
Hypervisor Accelerator Utilization Record Pointer.

hcall
Hypervisor call.

HPC
Highest point of coherency.

Hypervisor
A control (or virtualization) layer between hardware and the operating system. It allocates resources, reserves resources, and protects resources among (for example) sets of AFUs that may be running under different operating systems.

IHPC
The owner of the line is the highest point of coherency but it is holding the line in an "I" state.

Implementation
A particular processor that conforms to the architecture but might differ from other architecture-compliant implementations. For example, in design this could be the feature set and implementation of optional features.

INT
Interrupt. A change in machine state in response to an exception. See exception.

Interrupt packet
Used to signal an interrupt, typically to a processor or to another interruptible device.

ISA
Instruction set architecture.

JEA
Job effective address.

KB
Kilobyte.

LA
A local storage (LS) address of a PSL list. It is used as a parameter in a PSL command.

Least-significant bit
The bit of least value in an address, register, data element, or instruction encoding.

Little endian
A byte-ordering method in memory where the address n of a word corresponds to the least-significant byte. In an addressed memory word, the bytes are ordered (left to right) 3, 2, 1, 0, with 3 being the most-significant byte. See big endian.
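
As an illustration of little-endian ordering, the following Python sketch packs one 32-bit word in both byte orders using the standard struct module. The value 0x0A0B0C0D is an arbitrary example chosen for readability; it does not come from this specification.

```python
import struct

# Arbitrary 32-bit example value (an assumption, not from the specification).
word = 0x0A0B0C0D

# "<I" packs an unsigned 32-bit integer in little-endian order,
# ">I" packs the same integer in big-endian order.
little = struct.pack("<I", word)
big = struct.pack(">I", word)

# In little-endian order, the lowest address (index 0) holds the
# least-significant byte; in big-endian order it holds the most-significant byte.
print(little.hex())  # 0d0c0b0a
print(big.hex())     # 0a0b0c0d
```

Reading index 0 of each result shows the difference directly: the little-endian buffer starts with 0x0D and the big-endian buffer starts with 0x0A.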

LISN
Logical interrupt service number.

Logical partitioning
A function of an operating system that enables the creation of logical partitions.

LPAR
Logical partitioning.

LPC
Lowest point of coherency.

LPID
Logical-partition identity.

LSb
Least-significant bit.

LSB
Least-significant byte.

Main storage
The effective-address space. It consists physically of real memory (whatever is external to the memory-interface controller), Local Storage, memory-mapped registers and arrays, memory-mapped I/O devices, and pages of virtual memory that reside on disk. It does not include caches or execution-unit register files.

Mask
A pattern of bits used to accept or reject bit patterns in another set of data. Hardware interrupts are enabled and disabled by setting or clearing a string of bits, with each interrupt assigned a bit position in a mask register.
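
The mask idea can be sketched in a few lines of Python. This is only an illustration of setting, clearing, and testing bits in an integer used as a mask; the helper names and the bit positions are hypothetical and do not correspond to any register defined in this specification.

```python
def enable(mask: int, bit: int) -> int:
    """Return the mask with the given bit position set (interrupt enabled)."""
    return mask | (1 << bit)


def disable(mask: int, bit: int) -> int:
    """Return the mask with the given bit position cleared (interrupt disabled)."""
    return mask & ~(1 << bit)


def is_enabled(mask: int, bit: int) -> bool:
    """Report whether the given bit position is set in the mask."""
    return bool(mask & (1 << bit))


# Hypothetical bit positions, chosen only for the example.
mask = 0
mask = enable(mask, 3)
mask = enable(mask, 0)
print(bin(mask))            # 0b1001
mask = disable(mask, 3)
print(is_enabled(mask, 3))  # False
print(is_enabled(mask, 0))  # True
```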

MB
Megabyte.

Memory coherency
An aspect of caching in which it is ensured that an accurate view of memory is provided to all devices that share system memory.

Memory mapped
Mapped into the Coherent Attached Accelerator's addressable-memory space. Registers, local storage (LS), I/O devices, and other readable or writable storage can be memory-mapped. Privileged software does the mapping.

MMIO
Memory-mapped I/O.

PID
Process ID.

PSL
POWER service layer. It is the interface logic for a coherently attached accelerator and provides two main functions: moves data between accelerator function units (AFUs) and main storage, and synchronizes the transfers with the rest of the processing units in the system.

MMIO
Memory-mapped input/output. See memory mapped.

MMU
Memory management unit. A functional unit that translates between effective addresses (EAs) used by programs and real addresses (RAs) used by physical memory. The MMU also provides protection mechanisms and other functions.

Most-significant bit
The highest-order bit in an address, register, data element, or instruction encoding.

MRU
See most recently used.

MSb
Most-significant bit.

Page
A region in memory. The Power ISA defines a page as a 4 KB area of memory, aligned on a 4 KB boundary or a large-page size which is implementation dependent.

Page table
A table that maps virtual addresses (VAs) to real addresses (RAs) and contains related protection parameters and other information about memory locations.

PCIe
Peripheral Component Interconnect Express.

PLL
Phase locked loop.

POWER
Of or relating to the Power ISA or the microprocessors that implement this architecture.

Power ISA
A computer architecture that is based on the third generation of reduced instruction set computer (RISC) processors. The Power ISA was developed by IBM.

Privileged mode
Also known as supervisor mode. The permission level of operating system instructions. The instructions are described in PowerPC Architecture, Book III and are required of software that accesses system-critical resources.

Privileged software
Software that has access to the privileged modes of the architecture.

Problem state
The permission level of user instructions. The instructions are described in Power ISA, Books I and II and are required of software that implements application programs.

PSL
POWER service layer.

PTE
Page table entry. See page table.

RAM
Random access memory.

RA
Real address. An address for physical storage, which includes physical memory, local storage (LS), and memory mapped I/O registers. The maximum size of the real-address space is 2^50 bytes.

SAO
Strict address ordering.

SLB
Segment lookaside buffer. It is used to map an effective address to a virtual address.

SPA
Scheduled processes area.

SSTP
Storage segment table pointer.

Storage model
A CAPI User's Manual-compliant accelerator implements a storage model consistent with the Power ISA. For more information about storage models, see the Coherent Accelerator Interface Architecture document.

SUE
Special uncorrectable error.

TAG
PSL command tag.

Tag group
A group of PSL commands. Each PSL command is tagged with an n-bit tag group identifier. An AFU can use this identifier to check or wait on the completion of all queued commands in one or more tag groups.

TG
Tag parameter.

TID
Thread ID.

TLB
Translation lookaside buffer. An on-chip cache that translates virtual addresses (VAs) to real addresses (RAs). A TLB caches page-table entries for the most recently accessed pages, thereby eliminating the necessity to access the page table from memory during load-store operations.

UAMOR
User Authority Mask Override.

VA
Virtual address. An address to the virtual-memory space, which is typically much larger than the real address space and includes pages stored on disk. It is translated from an effective address by a segmentation mechanism and used by the paging mechanism to obtain the real address (RA). The maximum size of the virtual-address space is 2^68 bytes.

VHDL
VHSIC Hardware Description Language.

WIMG
Memory/cache attributes for PowerPC Power Architecture. Each letter of WIMG represents a one-bit access attribute, specifically: Write-Through Access (W), Cache-Inhibited Access (I), Memory Coherence (M), and Guarded (G).
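
A minimal Python sketch of treating WIMG as four one-bit attributes follows. The packing shown here (a 4-bit value with W as the most-significant bit and G as the least-significant bit) is an assumption made for the example; the glossary entry above names the four attributes but does not define a bit layout, and the function name is hypothetical.

```python
# Assumed layout for illustration only: W I M G from most- to
# least-significant bit of a 4-bit field.
WIMG_BITS = {
    "W": 0b1000,  # Write-Through Access
    "I": 0b0100,  # Cache-Inhibited Access
    "M": 0b0010,  # Memory Coherence
    "G": 0b0001,  # Guarded
}


def decode_wimg(value: int) -> list:
    """Return the names of the attributes whose bits are set in value."""
    return [name for name, bit in WIMG_BITS.items() if value & bit]


print(decode_wimg(0b0010))  # ['M']
print(decode_wimg(0b0101))  # ['I', 'G']
```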

WED
Work element descriptor.

Appendix A. OpenPOWER Foundation overview

The OpenPOWER Foundation was founded in 2013 as an open technical membership organization that will enable data centers to rethink their approach to technology. Member companies are enabled to customize POWER CPU processors and system platforms for optimization and innovation for their business needs. These innovations include custom systems for large or warehouse scale data centers, workload acceleration through GPU, FPGA or advanced I/O, platform optimization for SW appliances, or advanced hardware technology exploitation. OpenPOWER members are actively pursuing all of these innovations and more, and welcome all parties to join in moving the state of the art of OpenPOWER systems design forward.

To learn more about the OpenPOWER Foundation, visit the organization website at openpowerfoundation.org.

A.1. Foundation documentation

Key foundation documents include:

• Bylaws of OpenPOWER Foundation

• OpenPOWER Foundation Intellectual Property Rights (IPR) Policy

• OpenPOWER Foundation Membership Agreement

• OpenPOWER Anti-Trust Guidelines

More information about the foundation governance can be found at openpowerfoundation.org/about-us/governance.

A.2. Technical resources

Development resources fall into the following general categories:

• Foundation work groups

• Remote development environments (VMs)

• Development systems

• Technical specifications

• Software

• Developer tools

The complete list of technical resources is maintained on the foundation Technical Resources web page.

A.3. Contact the foundation

To learn more about the OpenPOWER Foundation, please use the following contact points:

• General information -- <[email protected]>

• Membership -- <[email protected]>

• Technical Work Groups and projects -- <[email protected]>

• Events and other activities -- <[email protected]>

• Press/Analysts -- <[email protected]>

More contact information can be found at openpowerfoundation.org/get-involved/contact-us.