ramp common interface krste asanovic derek chiou joel emer
Post on 21-Dec-2015
![Page 1: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/1.jpg)
RAMP Common Interface
Krste Asanovic
Derek Chiou
Joel Emer
![Page 2: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/2.jpg)
General Requirements
• Provide a language agnostic environment that facilitates sharing of modules
• Provide a modeling standard to facilitate the representation of time in the model target system that is independent of the host cycle time
• Provide a reusable set of ‘unmodel’ services that can be used by different projects
• Provide an underlying communication standard that can be used to specify standard interfaces
• Facilitate the creation of a specific set of modules that can be shared and that communicate via standard interfaces
![Page 3: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/3.jpg)
Key infrastructure components
• Modeling core architecture
• Modeling time
• Implementing inter-module data communication
• Simulation control and support infrastructure (unModel)
o simulation control
  communication to front-end or control processor
o simulation support
  stats, events, assertions, knobs...
• Virtual Platform
o Local memory access
o Shared memory access
o Host to FPGA communication channel
![Page 4: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/4.jpg)
Target and Host RTL
Target RTL
Model RTL
Unmodel RTL
Host RTL
Platform RTL
![Page 5: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/5.jpg)
Translation from Target RTL to Model RTL
• Start (conceptually) with final RTL
• Partition design into units and channels
o All inter-unit communication goes over channels
o Channels have fixed latency: they are a systolic pipeline, with latency set by what was mapped into the channel
Representation as a bipartite graph
[Figure: bipartite graph with Unit nodes and Channel nodes]
![Page 6: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/6.jpg)
Translation from Target to Model (2)
• Change representation of time from edges to tokens
o Encapsulate data sent on an edge into a timing token
  data on the timing channel is a 1-1 mapping of the original data signals
o Replace each channel with a timing token channel
  timing channel is a FIFO that transports timing tokens, e.g., A-ports
o Convert unit to sink and source tokens by abiding by the following:
  Unit waits for tokens on all inputs and reads them
  Performs same computation as it did
  Dequeues all input tokens
  Sends a token on all outputs
o Note: channel must be initialized
• Proof of equivalence to be provided
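The conversion steps above can be illustrated with a minimal Python sketch (Python is used here only for illustration; the slides target RTL). It compares a cycle-by-cycle reference of a two-unit target with a tokenized version whose units fire in an arbitrary host order. The function names, the counter and accumulator units, and the choice to initialize the channel with `latency` reset-value tokens are assumptions made for this example, not something the slides specify.

```python
import random
from collections import deque


def simulate_target(cycles, latency):
    """Reference model: unit A emits a counter, the channel is a shift
    register of `latency` stages (latency >= 1), unit B accumulates."""
    regs = [0] * latency              # pipeline registers, reset to 0
    count, total, trace = 0, 0, []
    for _ in range(cycles):
        total += regs[-1]             # B consumes the oldest register
        trace.append(total)
        regs = [count] + regs[:-1]    # clock edge: shift in A's output
        count += 1
    return trace


def simulate_tokens(cycles, latency, seed=0):
    """Tokenized model: the channel is a FIFO of timing tokens and the
    units fire in an arbitrary (here random) host order."""
    rng = random.Random(seed)
    # Assumption: the channel is initialized with `latency` tokens that
    # carry the reset value of the registers it replaces.
    channel = deque([0] * latency)
    capacity = latency + 1            # hypothetical buffer size, >= latency
    count, total, trace = 0, 0, []
    while len(trace) < cycles:
        if rng.choice("AB") == "A":
            if len(channel) < capacity:   # A fires only if its output has space
                channel.append(count)
                count += 1
        elif channel:                     # B fires only if an input token is present
            total += channel.popleft()
            trace.append(total)
    return trace
```

Whatever order the host picks, unit B observes the same sequence of values as in the cycle-accurate reference, which is the property the promised equivalence proof would establish in general.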
![Page 7: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/7.jpg)
Distributed Timing Example
[Figure: Target: Unit A and Unit B joined by a channel of latency L carrying data D. Host: Unit A and Unit B, each with Start and Done signals, joined by a FIFO with ENQ, DEQ and RDY signals.]
Pipeline target channel implemented as distributed FIFO with at least L buffers
![Page 8: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/8.jpg)
Retiming to simplify host model
• A shift register in the RTL can be converted into a timing token channel with the same latency.
• A perfectly systolic computation in the RTL can be converted into a timing token channel with the same latency, but the functionality of the pipeline must then be moved into the 'unit'. In general, any retiming that exposes a series of shift registers allows one to convert the shift registers into a timing token channel.
[Figure: Tokenized Target and Retimed Tokenized Target, each containing a Multiply; latency labels 1, 1 and 2]
![Page 9: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/9.jpg)
Definition: firing
• A token-machine unit firing corresponds to the modeling of a single target machine cycle in that unit.
• A token-machine unit firing comprises:
o Reading one token from each input channel
o Computing based on tokens and internal state
o Writing one token to each output channel
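A firing as defined above might be sketched generically as follows. The `Channel` and `Unit` classes, the `compute(state, in_tokens)` callback signature, and the requirement that output channels have free space before firing are assumptions of this sketch rather than part of the definition on the slide.

```python
from collections import deque


class Channel:
    """Bounded FIFO of timing tokens (hypothetical helper)."""

    def __init__(self, capacity, initial=()):
        self.capacity = capacity
        self.tokens = deque(initial)

    def not_empty(self):
        return len(self.tokens) > 0

    def not_full(self):
        return len(self.tokens) < self.capacity


class Unit:
    """Token-machine unit: one firing models one target cycle."""

    def __init__(self, inputs, outputs, compute, state=None):
        self.inputs, self.outputs = inputs, outputs
        self.compute = compute        # (state, in_tokens) -> (new_state, out_tokens)
        self.state = state
        self.target_cycle = 0

    def can_fire(self):
        return (all(c.not_empty() for c in self.inputs)
                and all(c.not_full() for c in self.outputs))

    def fire(self):
        if not self.can_fire():
            return False
        # Read exactly one token from each input channel.
        in_tokens = [c.tokens.popleft() for c in self.inputs]
        # Compute from the tokens and the internal state.
        self.state, out_tokens = self.compute(self.state, in_tokens)
        assert len(out_tokens) == len(self.outputs)
        # Write exactly one token to each output channel.
        for channel, token in zip(self.outputs, out_tokens):
            channel.tokens.append(token)
        self.target_cycle += 1
        return True


def accumulate(state, in_tokens):
    """Example computation: running sum of everything received."""
    new_state = state + sum(in_tokens)
    return new_state, [new_state]
```

In this sketch `target_cycle` advances only when a firing completes, so a unit that stalls on a missing token does not advance target time.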
![Page 10: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/10.jpg)
Multi-cycle host units
• The reads of all input tokens and writes of all output tokens can each be in different host cycles (while still reading each input and writing each output once per modeled cycle)
[Figure: Tokenized Target compared with a multi-cycle Host; label 2]
• A firing can be implemented by reading all token inputs, computing and writing all token outputs using multiple host cycles
o This is an example of a 'multi-cycle firing' and is what allows target cycle accounting to be independent of host cycles.
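A multi-cycle firing could look like the following sketch, in which a hypothetical two-input adder unit performs at most one channel operation per host cycle. The class name, the three-phase schedule and the use of plain `deque` objects as channels are illustrative assumptions.

```python
from collections import deque


class MultiCycleAdder:
    """Models one target cycle (a + b) using several host cycles."""

    def __init__(self, in_a, in_b, out, out_capacity):
        self.in_a, self.in_b, self.out = in_a, in_b, out
        self.out_capacity = out_capacity
        self.phase = "READ_A"
        self.a = self.b = None
        self.host_cycles = 0
        self.target_cycles = 0

    def host_step(self):
        """Advance by one host cycle; stall in place if a channel is not ready."""
        self.host_cycles += 1
        if self.phase == "READ_A":
            if self.in_a:
                self.a = self.in_a.popleft()
                self.phase = "READ_B"
        elif self.phase == "READ_B":
            if self.in_b:
                self.b = self.in_b.popleft()
                self.phase = "WRITE"
        elif self.phase == "WRITE":
            if len(self.out) < self.out_capacity:
                self.out.append(self.a + self.b)
                self.target_cycles += 1   # the firing completes here
                self.phase = "READ_A"
```

With inputs always available, this unit spends three host cycles per target cycle, yet the token stream it produces is the same as that of a single-cycle implementation.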
![Page 11: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/11.jpg)
Pipelined Host Units
• Multiple firings of a single token-machine unit can be overlapped (e.g., pipelined) so long as:
o the token firing rules are maintained and
o any inter-firing data dependencies internal to the token-machine unit are also maintained.
• Consequence is that multiple target cycles are in flight in a host unit at the same time.
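The overlap of firings can be sketched with a hypothetical three-stage host pipeline (read token, compute, write token). The squaring computation is stateless, so there are no inter-firing dependencies to preserve; a stateful unit would need forwarding or stalls that this sketch does not model. All names are assumptions made for illustration.

```python
from collections import deque


def run_pipelined(inputs, host_cycles):
    """Run a read/compute/write host pipeline for a number of host cycles.

    Returns the output tokens and the largest number of target cycles
    that were in flight inside the unit at the same time."""
    in_channel = deque(inputs)
    out_channel = deque()
    read_reg = None       # token read in the previous host cycle
    compute_reg = None    # result computed in the previous host cycle
    max_in_flight = 0
    for _ in range(host_cycles):
        # Update the stages back to front so each firing moves one stage per cycle.
        if compute_reg is not None:
            out_channel.append(compute_reg)
        compute_reg = read_reg * read_reg if read_reg is not None else None
        read_reg = in_channel.popleft() if in_channel else None
        in_flight = (read_reg is not None) + (compute_reg is not None)
        max_in_flight = max(max_in_flight, in_flight)
    return list(out_channel), max_in_flight
```

Tokens leave in the order they entered, so the firing rules are kept even though two target cycles occupy the unit at once.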
![Page 12: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/12.jpg)
Multiplexed host units
• Firings from distinct target units can be multiplexed on a single host unit
o The multiplexed unit has a distinct copy of state for each target unit being modeled
o The multiplexed unit must read tokens from channels associated with the proper target unit.
o This might be accomplished by multiplexing the channels themselves. Probably simple if all communication in each target unit is to the same token machine unit port
[Figure: Tokenized Target with Unit 1 and Unit 2 on a Channel, mapped onto a single Host unit]
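A multiplexed host unit might be organized as in the sketch below: one piece of host logic, a separate state copy per target unit, and channels selected by target index. The round-robin schedule and the accumulator behavior are assumptions chosen to keep the example short.

```python
from collections import deque


class MultiplexedAccumulator:
    """One host unit that models several identical target units."""

    def __init__(self, num_targets):
        self.state = [0] * num_targets                        # distinct state per target unit
        self.inputs = [deque() for _ in range(num_targets)]   # channels per target unit
        self.outputs = [deque() for _ in range(num_targets)]
        self.next_target = 0

    def host_step(self):
        """Try to fire one target unit, then move on in round-robin order.

        Returns the index of the target unit that fired, or None."""
        t = self.next_target
        self.next_target = (t + 1) % len(self.state)
        if not self.inputs[t]:
            return None                    # no token for this target unit yet
        token = self.inputs[t].popleft()   # read from the proper target's channel
        self.state[t] += token
        self.outputs[t].append(self.state[t])
        return t
```

Each target unit still sees its own token stream and its own state, so the multiplexing is invisible at the target level.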
![Page 13: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/13.jpg)
Basic channel interface
A FIFO interface…
o Send:
  out notFull;
  in [n:0] enq_data;
  in enq_en;
o Recv:
  out notEmpty;
  out [n:0] first;
  in deq;
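The signal list above can be mimicked in software as a sketch. The `clock` method stands for one host clock edge; the policy of ignoring an enqueue while full or a dequeue while empty, and of sampling `notFull` before the dequeue takes effect, are assumptions of this example and not stated on the slide.

```python
from collections import deque


class FifoChannel:
    """Software stand-in for the send/recv FIFO interface."""

    def __init__(self, depth):
        self.depth = depth
        self.items = deque()

    @property
    def not_full(self):          # Send side: out notFull
        return len(self.items) < self.depth

    @property
    def not_empty(self):         # Recv side: out notEmpty
        return len(self.items) > 0

    @property
    def first(self):             # Recv side: out first (None when empty)
        return self.items[0] if self.items else None

    def clock(self, enq_en=False, enq_data=None, deq=False):
        """Apply the inputs of one host cycle (enq_en, enq_data, deq)."""
        do_deq = deq and self.not_empty
        do_enq = enq_en and self.not_full   # sampled before the dequeue
        if do_deq:
            self.items.popleft()
        if do_enq:
            self.items.append(enq_data)
```

A sender is expected to assert `enq_en` only when `not_full` is true, and a receiver to assert `deq` only when `not_empty` is true.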
![Page 14: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/14.jpg)
Channel Interface Variants
Parallel channels (same source and dest and same latency) can be combined into a single timing channel - this reduces flow control overhead
Communication on wide channels might be fragmented or packetized across multiple host cycles and internally reassembled into one token. Unit sees flow control at fragment level, but channel guarantees delivery at the token level.
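Fragmentation of a wide token could be sketched as below. The least-significant-first fragment order, the function and class names, and the bit widths in the usage are assumptions; the slides leave the packet format open.

```python
def fragment(token, token_bits, frag_bits):
    """Split a token into fragments, least significant fragment first."""
    count = -(-token_bits // frag_bits)          # ceiling division
    mask = (1 << frag_bits) - 1
    return [(token >> (i * frag_bits)) & mask for i in range(count)]


class Reassembler:
    """Receiving end: collects fragments and releases whole tokens only."""

    def __init__(self, token_bits, frag_bits):
        self.count = -(-token_bits // frag_bits)
        self.frag_bits = frag_bits
        self.parts = []

    def push(self, frag):
        """Accept one fragment; return a token once complete, else None."""
        self.parts.append(frag)
        if len(self.parts) < self.count:
            return None
        token = 0
        for i, part in enumerate(self.parts):
            token |= part << (i * self.frag_bits)
        self.parts = []
        return token
```

For example, a 32-bit token sent over an 8-bit link takes four host transfers, and the receiving unit sees a token only after the fourth fragment arrives.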
![Page 15: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/15.jpg)
Multiple clock domains
Simple cross clock domain communication can be handled with rate matchers at fast end of channel.
[Figure: Unit A (100 MHz) and Unit B (66 MHz) connected by a Channel]
![Page 16: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/16.jpg)
Channel No Message
• Often as part of the process of abstracting a design into a model there is a situation where a communication is viewed as not happening…
• For example, [Figure: a link with data and enable signals]
• To accommodate this situation a channel may include explicit transmission of a 'no message' token
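One way to picture a 'no message' token is the sketch below, which assumes a target link made of a data signal qualified by an enable signal. The sentinel `NO_MESSAGE` and both function names are hypothetical.

```python
NO_MESSAGE = object()   # hypothetical sentinel: nothing was sent this target cycle


def encode(enable, data):
    """Turn one target cycle of (enable, data) wires into a timing token."""
    return data if enable else NO_MESSAGE


def receive(tokens):
    """Consume one token per target cycle; only real messages carry data."""
    cycles, messages = 0, []
    for token in tokens:
        cycles += 1                      # every token advances target time
        if token is not NO_MESSAGE:
            messages.append(token)
    return cycles, messages
```

The receiver still gets exactly one token per target cycle, so its notion of time stays correct even in cycles where no communication occurs.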
![Page 17: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/17.jpg)
Interface Layers
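[Figure: interface layer diagram]
Layers: Logical Topology, Physical Network, Physical Link, Flow Control, Buffering, Timing
Logical topology patterns: Point-to-point, One-to-many, Many-to-one
Physical network types: Point-to-point, Ring, Tree, Bus
Physical link types: Intra-FPGA, Inter-FPGA, CPU-to-FPGA
Domains: Model domain, unModel domain, communication domain
Components: Units, Servers, Services
Other labels: Direct + Client/Server, One-way, Client/Server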
![Page 18: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/18.jpg)
Multi-layer implementations
Presentation
Logical Topology
Physical Network
Physical Link
Flow Control
Buffering
Timing
RDL channels
Units
FAST connectors
A-ports
“Soft connections”
![Page 19: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/19.jpg)
Logical Topology Semantics
• Represents host-level inter-module communication
• Supports both model and unmodel traffic
• Latency may be more than one host cycle
• Multiple patterns to be supported
o One-to-one
o One-to-many
o Many-to-one
• Must be expressible in multiple languages
o Bluespec, Verilog...
![Page 20: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/20.jpg)
Pattern Examples
• 1-to-1
– Timing channels
• 1-to-many
– “run” command broadcast from controller
• Many-to-one
– assertion violation reporting
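The one-to-many and many-to-one patterns can be sketched with plain queues. Flow control is ignored (sinks are unbounded), and the function names and the tagging of merged messages with their source index are assumptions of this example.

```python
from collections import deque


def broadcast(source, sinks):
    """One-to-many: copy the oldest message of `source` into every sink."""
    if not source:
        return False
    message = source.popleft()
    for sink in sinks:
        sink.append(message)
    return True


def merge(sources, sink):
    """Many-to-one: move one pending message from each source into `sink`,
    tagged with the index of the source it came from."""
    moved = 0
    for index, source in enumerate(sources):
        if source:
            sink.append((index, source.popleft()))
            moved += 1
    return moved
```

A controller could use `broadcast` for a “run” command, while units reporting assertion violations would feed a `merge` toward the controller.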
![Page 21: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/21.jpg)
Logical Topology Endpoint Interface
• Endpoints are simply FIFOs
o Send:
  out notFull;
  in [n:0] enq_data;
  in enq_en;
o Recv:
  out notEmpty;
  out [n:0] first;
  in deq;
• Clocking
o endpoint has same clock as module connected to it
o cross host clock domain communication must be supported
• Configuration Meta-information
o connection name
o connection direction
o connection pattern
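The meta-information listed above is enough for a tool to pair endpoints by name. The following sketch assumes a hypothetical `Endpoint` record and a `match_endpoints` helper; the string values used for direction and pattern are illustrative, not a defined format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    module: str      # module that owns the endpoint
    name: str        # connection name
    direction: str   # "send" or "recv"
    pattern: str     # "one-to-one", "one-to-many" or "many-to-one"


def match_endpoints(endpoints):
    """Group endpoints by connection name and check them against the pattern."""
    groups = defaultdict(lambda: {"send": [], "recv": [], "patterns": set()})
    for ep in endpoints:
        group = groups[ep.name]
        group[ep.direction].append(ep.module)
        group["patterns"].add(ep.pattern)
    links = {}
    for name, group in groups.items():
        if len(group["patterns"]) != 1:
            raise ValueError(f"{name}: endpoints disagree on pattern")
        pattern = next(iter(group["patterns"]))
        senders, receivers = len(group["send"]), len(group["recv"])
        fits = {
            "one-to-one": senders == 1 and receivers == 1,
            "one-to-many": senders == 1 and receivers >= 1,
            "many-to-one": senders >= 1 and receivers == 1,
        }.get(pattern, False)
        if not fits:
            raise ValueError(
                f"{name}: {senders} senders and {receivers} receivers "
                f"do not fit {pattern}")
        links[name] = (pattern, sorted(group["send"]), sorted(group["recv"]))
    return links
```

Matching by name means a parent module does not have to wire the connection explicitly, which is the idea behind name-based schemes such as “soft connections”.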
![Page 22: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/22.jpg)
Logical Topologies/Physical Interconnect
[Figure: ring of stations 1, 2, 3 and 4 joined by intra-FPGA links, with endpoints As, Ad, Bs and Bd attached]
Example: shared ring
As at station 1 communicates with Ad at station 2
Bs at station 2 communicates with Bd at station 4
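The shared-ring example can be sketched as a slotted, unidirectional ring in which both logical connections share the same physical links. The slot discipline (one slot per station, rotating one station per host cycle) and the function name are assumptions made for illustration; the slides do not define the ring protocol.

```python
from collections import deque


def simulate_ring(connections, messages, stations=4, max_cycles=100):
    """connections: name -> (source station, destination station).
    messages: list of (connection name, payload) in send order."""
    pending = {s: deque() for s in range(1, stations + 1)}
    for name, payload in messages:
        src, dst = connections[name]
        pending[src].append((name, dst, payload))
    slots = [None] * stations            # slots[i] is the slot at station i + 1
    delivered = {name: [] for name in connections}
    remaining = len(messages)
    for _ in range(max_cycles):
        if remaining == 0:
            break
        slots = [slots[-1]] + slots[:-1]     # every slot moves to the next station
        for i in range(stations):
            station = i + 1
            if slots[i] is not None and slots[i][1] == station:
                name, _, payload = slots[i]  # message has reached its destination
                delivered[name].append(payload)
                slots[i] = None
                remaining -= 1
            if slots[i] is None and pending[station]:
                slots[i] = pending[station].popleft()   # inject into a free slot
    return delivered
```

With the connections from the example, `{"A": (1, 2), "B": (2, 4)}`, messages on A and B travel over the same ring without either endpoint being aware of the other.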
![Page 23: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/23.jpg)
Interface Layers
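[Figure: interface layer diagram]
Layers: Logical Topology, Physical Network, Physical Link, Flow Control, Buffering, Timing
Logical topology patterns: Point-to-point, One-to-many, Many-to-one
Physical network types: Point-to-point, Ring, Tree, Bus
Physical link types: Intra-FPGA, Inter-FPGA, CPU-to-FPGA
Domains: Model domain, unModel domain, communication domain
Components: Units, Servers, Services
Other labels: Connections, One-way, Client/Server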
![Page 24: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/24.jpg)
Physical Network Characteristics
• Host-level communication fabric
• Reliable transmission
• Deadlock Free
• Includes buffering for meeting above requirements
• Additional buffering is provided at higher layers
![Page 25: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/25.jpg)
Physical Link Interface Semantics
• Host-level communication channel
• FIFO-style interface
• Decoupled input/output
• Error-free (reliable delivery)
• Uni-directional
• Point-to-point
• Packet description (TBD)
• Indeterminate (but finite) latency
![Page 26: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/26.jpg)
Interface Layers
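[Figure: interface layer diagram]
Layers: Logical Topology, Physical Network, Physical Link, Flow Control, Buffering, Timing
Logical topology patterns: Point-to-point, One-to-many, Many-to-one
Physical network types: Point-to-point, Ring, Tree, Bus
Physical link types: Intra-FPGA, Inter-FPGA, CPU-to-FPGA
Domains: Model domain, unModel domain, communication domain
Components: Units, Servers, Services
Other labels: Connections, One-way, Client/Server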
![Page 27: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/27.jpg)
UnModel Support Services
• Run control
o Units can be commanded to start, stop, etc.
• Dynamic parameters
o Units can be configured at runtime
• Statistics
o Unit can collect and report event counts
• Event logging
o Unit can log a series of events for each cycle
• Assertions
o Unit can do runtime checks of invariants and report violations
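From a unit's point of view, the five services could be exposed through a small helper such as the one sketched here. The class `UnitServices` and all of its method names are hypothetical; in a real system each call would turn into a message on an unModel channel instead of updating local Python state.

```python
class UnitServices:
    """Local, software-only stand-in for the unModel support services."""

    def __init__(self, name, params=None):
        self.name = name
        self.running = False
        self.params = dict(params or {})
        self.stats = {}
        self.events = []
        self.violations = []

    def command(self, cmd):                  # run control
        if cmd == "start":
            self.running = True
        elif cmd == "stop":
            self.running = False
        else:
            raise ValueError(f"unknown command: {cmd}")

    def set_param(self, key, value):         # dynamic parameters
        if key not in self.params:
            raise KeyError(key)
        self.params[key] = value

    def count(self, stat, amount=1):         # statistics
        self.stats[stat] = self.stats.get(stat, 0) + amount

    def log_event(self, cycle, text):        # event logging
        self.events.append((cycle, text))

    def check(self, cycle, condition, message):   # assertions
        if not condition:
            self.violations.append((self.name, cycle, message))
        return condition
```

Keeping these calls separate from the timing channels reflects the split between model traffic and unmodel traffic described earlier in the deck.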
![Page 28: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/28.jpg)
Service Organization
Stat
DynamicParam
LocalControl
Unit
Global Controller
Host CPU
GlobalControl
ParamController
StatController
![Page 29: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/29.jpg)
Servers and services interface
• Service interface is implemented via separate input and output channels that handle requests and responses
• Each input/output pair forms a service which implements multiple methods
• Request / response is in-order for a single service
• Synchronization between calls to different services must be provided by clients.
• We provide serializability of operations.
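A service built from one request channel and one response channel might look like this sketch. The `ParamService` class, its `get` and `set` methods and the tuple layout of requests and responses are assumptions made for illustration.

```python
from collections import deque


class ParamService:
    """One service: an input channel of requests, an output channel of
    responses, and several methods selected by the request."""

    def __init__(self):
        self.requests = deque()     # input channel: (method, key, value)
        self.responses = deque()    # output channel: (status, key, value)
        self.params = {}

    def step(self):
        """Serve at most one request per host cycle, oldest first."""
        if not self.requests:
            return
        method, key, value = self.requests.popleft()
        if method == "set":
            self.params[key] = value
            self.responses.append(("ok", key, value))
        elif method == "get":
            self.responses.append(("ok", key, self.params.get(key)))
        else:
            self.responses.append(("error", key, None))
```

Because requests are served strictly in arrival order, responses come back in order for this one service; ordering across two different services is not guaranteed and would be the client's job to enforce.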
![Page 30: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/30.jpg)
Build process
• Handling logical endpoint connections
o Would like to avoid requiring parents to specify connections
o Bluespec: use static elaboration, e.g., “soft connections”
o Verilog: use TBD preprocessor
• Who maps logical connections to physical networks?
o Locally
o Globally
• 'Static' build parameters
• 'Dynamic' run parameters
![Page 31: RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer](https://reader036.vdocuments.net/reader036/viewer/2022062320/56649d5c5503460f94a3b3d3/html5/thumbnails/31.jpg)
Backup