Data Synchronization Issues in GALS SoCs
Rostislav (Reuven) Dobkin and Ran Ginosar, Technion
Christos P. Sotiriou, FORTH
ICS-FORTH
Outline
• The Problem
  o Synchronization Failures in GALS SoCs
• Three solutions:
  o Timing verification
  o Synchronizers
  o Locally-delayed clocks
• Analysis
GALS with Stoppable Clocks
• A GALS module contains:
  o Synchronous island
  o Local clock generator
  o Self-timed wrapper (can stop the local clock)
  o Handshake for inter-modular communications
Moore et al., “Point to point GALS interconnect,” ASYNC 2002
Villiger et al., “Self-timed Ring for Globally-Asynchronous Locally-Synchronous Systems,” ASYNC 2003
[Figure: two synchronous islands (SYNC ISLAND), each with a LOCAL CLOCK GEN and a port controller (CTRL, PORT), connected by HANDSHAKE and DATA lines]
Data Synchronization
[Figure: input port connected to a locally synchronous island. The local clock generator contains an adjustable delay line (~T/2), a MUTEX, and a clock reset; the port exchanges REQ, ACK, DATA and XFER; further labels: REG, INPUT, R, A, B, X, Y, Z]
Moore et al., “Point to point GALS interconnect,” ASYNC 2002
Villiger et al., “Self-timed Ring for Globally-Asynchronous Locally-Synchronous Systems,” ASYNC 2003
Synchronization Failure
[Figure: input port and local clock generator circuit, as on the previous slide]
Due to clock tree delay, the previous clock rise may conflict with the handshake.
Synchronization Failure: RACE!
[Figure: arbitration circuit (signals R, A, B, X, Y, Z, INPUT, DATA) and a timing diagram of R, X, DATA (D0, D1), Y and CLK, with the CONFLICT region marked]
Conflict Condition:
CLK = +
Conflict / Safe Zones
Conflict Condition:
[Figure: CLK axis with ticks at 0, T/2, T and 3T/2, showing conflict zones (W) alternating with safe zones (S)]
Three Solutions
Solution 1: Timing Verification
• Extract delays
• Verify that the clock tree delay (CLK) falls inside the SAFE zones
[Figure: CLK axis with ticks at 0, T/2, T and 3T/2 and an offset labelled T/2 + DNOR, showing conflict zones (W) separated by SAFE zones]
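The verification step can be sketched as a check over extracted delays. This is only an illustration: the zone geometry (first_zone_start, zone_width, zone_period), the corner names, and the guard margin are hypothetical inputs, not values given on the slides, and would have to come from the actual conflict condition of the port under analysis.

```python
def in_conflict_zone(clk_delay, first_zone_start, zone_width, zone_period):
    """Return True if clk_delay lies inside a periodically repeating conflict zone.

    All zone parameters are hypothetical inputs supplied by the designer.
    """
    offset = (clk_delay - first_zone_start) % zone_period
    return offset < zone_width


def failing_corners(corner_delays, first_zone_start, zone_width, zone_period,
                    margin=0.0):
    """Return the corners whose clock tree delay is within `margin` of a conflict zone."""
    failing = []
    for name, delay in corner_delays.items():
        # Widen every zone by `margin` on both sides before testing.
        if in_conflict_zone(delay + margin, first_zone_start,
                            zone_width + 2 * margin, zone_period):
            failing.append(name)
    return failing


# Hypothetical numbers (ns): zones of width 1.0 repeating every 5.0, starting at 0.
corners = {"fast": 2.0, "typical": 2.5, "slow": 4.8}
print(failing_corners(corners, 0.0, 1.0, 5.0, margin=0.3))
```

Running the check over several process, voltage and temperature corners makes the sensitivity of this approach visible: a delay that is safe at one corner may drift into a conflict zone at another.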
Solution 1: Matched Delay Port Control
[Figure: the input port and local clock generator circuit (adjustable delay line ~T/2, MUTEX, clock reset, REQ/ACK/DATA/XFER) with an added delay matched to the clock tree delay, CLK]
Solution 1: Disadvantages
• Clock tree delays must be re-verified after each layout iteration
• The solution is sensitive to thermal and voltage variations
Solution 2: Two-Flop Synchronizer
• Low bandwidth
• Resolution time: one clock cycle
• Data cycle: at least 3 clock cycles
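[Figure: two-flop synchronizer on REQ in front of a locally synchronous island; the synchronized request drives ENABLE of the input register (REG) on DATA INPUT; further labels: CLK, ACK, RESET]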
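The bandwidth penalty follows directly from the data cycle of at least three clock cycles. The short sketch below turns that bound into an upper limit on transfer rate; the 300 MHz clock and the function name are illustrative assumptions, and the 16-bit width is borrowed from the bus width used in the simulations later in the talk.

```python
def max_transfer_rate(clock_hz, cycles_per_transfer=3):
    """Upper bound on transfers per second when each transfer takes at least
    `cycles_per_transfer` clock cycles (3 for the two-flop port on the slide)."""
    return clock_hz / cycles_per_transfer


def max_throughput_bits(clock_hz, bus_width_bits, cycles_per_transfer=3):
    """Upper bound on bits per second across the port."""
    return max_transfer_rate(clock_hz, cycles_per_transfer) * bus_width_bits


# Hypothetical 300 MHz island clock with a 16-bit bus.
print(max_transfer_rate(300e6))        # transfers per second
print(max_throughput_bits(300e6, 16))  # bits per second
```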
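Solution 3: Locally Delayed Latching
[Figure: locally delayed latching port: a LATCH ahead of REG1 and REG2 on the DATA path, a CONTROL block with REQ/ACK handshake, a delay line (DL), and signals L, Y and Y1, with Y1 driving the clock leaves]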
Solution 3: Time Budget
[Figure: time budget after a conflict on clock Y, in sequence: MUTEX metastability resolution (MS), asynchronous controller delay (DCTRL), and the high phase of clock Y1; shown for the "Port Wins" and "Clock Wins" cases]
How much resolution time?
• Less than 50 FO4 delays needed to resolve metastability
• ASIC / SoC clocks are slow: T > 100 FO4 delays
• Conclusions:
  o Fast clocks: half a cycle is budgeted for M/S resolution
  o Slower clocks (T > 200 FO4): quarter cycle
[Figure: resolution time in FO4 gate delays (axis from 36 to 48) versus feature size (350, 250, 180, 130, 90, 70 nm), with curves for required MTBF of 10,000 and 1,000,000 years]
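The order of magnitude can be reproduced with the commonly used synchronizer failure model, MTBF = exp(t / tau) / (T_W * f_C * f_D), solved for the resolution time t. The sketch below is not taken from the slides: the choices tau = 1 FO4 delay, T_W = 1 FO4 delay, a data rate equal to the clock rate, and an FO4 delay of about 130 ps (roughly what the 0.35 µm simulation figures imply) are all assumptions made for illustration.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def resolution_time_fo4(mtbf_years, fo4_seconds, clock_period_fo4,
                        tau_fo4=1.0, window_fo4=1.0, data_rate_ratio=1.0):
    """Resolution time (in FO4 delays) needed to reach the requested MTBF.

    Solves MTBF = exp(t / tau) / (T_W * f_C * f_D) for t.
    tau_fo4, window_fo4 and data_rate_ratio are assumed model parameters.
    """
    mtbf_s = mtbf_years * SECONDS_PER_YEAR
    f_clock = 1.0 / (clock_period_fo4 * fo4_seconds)  # clock frequency, Hz
    f_data = data_rate_ratio * f_clock                # data arrival rate, Hz
    t_window = window_fo4 * fo4_seconds               # metastability window, s
    return tau_fo4 * math.log(mtbf_s * t_window * f_clock * f_data)


# Assumed: FO4 = 130 ps, clock period T = 100 FO4 delays.
for years in (1e4, 1e6):
    print(years, round(resolution_time_fo4(years, 130e-12, 100), 1))
```

Under these assumptions the requirement stays below 50 FO4 delays for both MTBF targets, and raising the target by a factor of 100 costs only about 4.6 additional FO4 delays, because the dependence is logarithmic.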
Solution 3: Operating Modes
[Figure: timing diagram of Y, REQ, ACK, L, Y1 and DATA (Data0 to Data4) across four operating modes: No Conflict; Conflict (Port Wins); Conflict (Clock Wins); No Conflict (Reduced Cycle). Annotations: delayed clock, worst case = dCTRL; delayed request, worst case = T/2]
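Solution 3: A. Decoupled Input Port
[Figure: decoupled input port: LATCH on the DATA path ahead of REG1 and REG2, CTRL block with MUTEX and delay line (DL), REQ/ACK handshake, and signals R1, R2, R3, DO, DI, LA, L, Y, Y1]
DCTRL = D{R3+ DO+ DI+ L- R2-}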
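Solution 3: B. Decoupled Output Port
[Figure: decoupled output port between the SYNC ISLAND and the OUTPUT PORT: Req REG and DataOut REG, CTRL block with MUTEX and delay line (DL), Ack REG1 and Ack REG2, and signals R1, R2, ROUT, AIN, A1, A2, A3, A4]
DCTRL = D{A4+ A1+ A3-}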
Solution 3: C. A Simpler Input Port
[Figure: simple input port: LATCH on the DATA path ahead of REG1 and REG2, MUTEX, delay line (DL), REQ/ACK handshake, and signals L, Y, Y1]
DCTRL = DLATCH + DTX{ACK+ REQ-}
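Solution 3: Analysis
[Figure: time budget after a conflict on clock Y, in sequence: M/S resolution (MS), asynchronous controller delay (DCTRL), and the high phase of clock Y1]
• T/4 is budgeted for M/S resolution
• Minimal clock high phase: THP ~ 3 FO4 gate delays
• Example: T = 160 FO4 gate delays. Constraint:
  DCTRL < T/4 - THP = 40 - 3 = 37 FO4 gate delays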
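The constraint in the example is plain arithmetic and can be written as a small helper for trying other clock periods. The function name is hypothetical; the default of 3 FO4 delays for the minimal high phase is the figure quoted on the slide.

```python
def max_controller_delay_fo4(period_fo4, high_phase_fo4=3.0):
    """Upper bound on the asynchronous controller delay DCTRL, in FO4 delays,
    following the slide's constraint DCTRL < T/4 - THP."""
    return period_fo4 / 4.0 - high_phase_fo4


print(max_controller_delay_fo4(160))  # 37.0, the slide's example
print(max_controller_delay_fo4(200))  # a slower clock leaves more room
```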
Solution 3: Simulations
Circuit | Critical Path | Latency (0.35 µm) | Inverter FO4 delays
Decoupled Input Port | R3+ Do+ Di+ L- R2- | 3.13 ns | 24
Decoupled Output Port | A4+ A1+ A3- | 1.81 ns | 14
Simple Input Port with Decoupled Output Port | Latch Delay A2+ R2- | 2.13 ns | 16
* These results are based on a data bus width of 16 bits
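As a quick cross-check, the simulated controller delays can be compared against the 37 FO4 budget from the T = 160 FO4 analysis example. The sketch below uses only the numbers reported on the slides; the dictionary layout and variable names are illustrative.

```python
# (latency in ns, latency in inverter FO4 delays), as reported in the simulations.
PORTS = {
    "Decoupled Input Port": (3.13, 24),
    "Decoupled Output Port": (1.81, 14),
    "Simple Input Port with Decoupled Output Port": (2.13, 16),
}

BUDGET_FO4 = 37  # T/4 - THP for T = 160 FO4 delays, THP = 3 FO4 delays

for name, (latency_ns, fo4) in PORTS.items():
    slack = BUDGET_FO4 - fo4           # remaining margin in FO4 delays
    implied_fo4_ns = latency_ns / fo4  # FO4 delay implied by the two columns
    print(f"{name}: slack = {slack} FO4, implied FO4 delay = {implied_fo4_ns:.3f} ns")
```

All three ports fit within the budget, and the two latency columns are mutually consistent with an FO4 delay of roughly 0.13 ns.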
Summary
• Design of arbitrated clocks for GALS SoCs must consider clock tree delays to control the risk of synchronization failures
• Presented three solutions:
  o Extract the delays and verify timing
  o Employ 2-flop synchronizers or matched-delay async ports (low bandwidth)
  o Employ locally-delayed ports (high bandwidth)