High performance field programmable gate array for gigahertz applications
Jong-Ru Guo, C. You, M. Chu, K. Zhou, Jin-Woo Kim, B.S. Goda*, R.P. Kraft, J.F. McDonald
Rensselaer Polytechnic Institute, Troy, NY, 12180
* United States Military Academy, West Point, N.Y. 10096
2004 MAPLD: 140 (Jong-Ru Guo)


Gigahertz era

High speed reconfigurable systems are needed to handle the increasing amount of data. However, CMOS FPGAs operate at only hundreds of MHz, so a GHz reconfigurable system is needed.


Introduction: Field Programmable Gate Array (FPGA)

[Figure: FPGA structure with regions labeled A, B, and C; legend: Logic Cell, I/O Cell, Routing Cell]

FPGA: A reconfigurable chip that can be programmed for a specific function.

Status: There are no FPGAs that operate at GHz microprocessor clock rates, much less at K-band or X-band.

Goal: Change this situation for the better.

K-band: 10.9-36 GHz
X-band: 8-12 GHz


FPGA Applications

1. Prototyping
2. Digital Networks
   - Mobile Subscriber Equipment
   - High Speed Switching Nodes
3. Real Time Signal/Image Processing
   - Radar
   - Pattern Recognition
4. Digital System Processing
   - Filters
   - Fourier Transform
5. Satellite Systems
6. Wireless


High speed IBM SiGe HBT Process

Approximate cut-off frequency:
• IBM 0.5 & 0.25 um generations (5HP): ~50 GHz
• IBM 0.18 um generation (7HP): ~120 GHz
• IBM 0.13 um generation (8HP): ~180 GHz

[Figure: Cut-off frequency curves for the 5HP, 7HP, and 8HP processes. Observe the logarithmic Ic axis.]

Ref. "40-Gb/s Circuits Built From a 120-GHz fT SiGe Technology", IEEE Journal of Solid-State Circuits, vol. 37, no. 9, Sept. 2002.


SiGe Graded Base Bipolar Transistor

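[Figure: Si/SiGe band diagram. Regions along x: n+ Si emitter, p-SiGe base (compared with a p-Si base), n- Si collector. The Ge concentration is graded across the base, producing a drift field. Band edges EC and EV are shown, with electron (e-) and hole (h+) flow.]

Eg,Ge(grade) = Eg,Ge(x=Wb) - Eg,Ge(x=0)

Ref. Yuan Taur and Tak H. Ning, "Fundamentals of Modern VLSI Devices", Cambridge University Press, p. 364, 1998.

Ref. Flash Comm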
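To see why the Ge grading matters, the band-gap difference defined above can be turned into an approximate built-in field in the base. This is a first-order sketch that assumes the grading is linear across a base of width W_b; the numerical values (100 meV, 50 nm) are illustrative assumptions and do not come from the slides.

```latex
% First-order magnitude of the drift field for a linearly graded base
\[
  |\mathcal{E}| \approx \frac{1}{q}\,\frac{\left|E_{g,\mathrm{Ge}}(\mathrm{grade})\right|}{W_b}
\]
% Illustrative numbers only: 100 meV of grading across a 50 nm base
\[
  |\mathcal{E}| \approx \frac{0.1\ \mathrm{V}}{50 \times 10^{-7}\ \mathrm{cm}}
               = 2 \times 10^{4}\ \mathrm{V/cm}
\]
```

A field of this order accelerates electrons across the base, which is one reason graded-base SiGe HBTs reach higher cut-off frequencies than comparable Si devices.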


New Structure: Input and Output Block and Function Unit (FU)

[Figure: Schematic of the new function unit, based on the XC6200. Three input MUXes in the input routing block each select among the signals Ce, Qe, E, E4, Cw, Qw, W, W4, Cs, Qs, S, S4, Cn, Qn, N, N4. The Function Unit (FU) contains the new D-FF, clocked by CLK, with outputs C and Q. The output routing block drives the To East, To West, To South, and To North outputs, with direction inputs labeled E,S,N; W,S,N; E,W,S; and E,W,N.]

[Figure: Layout (170 um x 210 um) with labeled blocks: West Output MUX, South Output MUX, North Output MUX, East Output MUX, Input 16:1 MUX, two Input 17:1 MUXes, Master-Slave Latch, memory configuration structure, 2:1 MUX, and output drivers.]


Area improvement: Basic Cell (BC)

[Figure: Basic Cell layout comparison, 170 um x 210 um versus 130 um x 135 um]

• 49% layout area saved
• 7HP: 0.18 um process
• 8HP: 0.13 um process
• Smaller layout: better performance, more configurable cells.
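The two layout footprints quoted on this slide can be compared with a few lines of arithmetic. The sketch below assumes both layouts are plain rectangles with exactly the labeled dimensions; the function name is illustrative and not from the slides.

```python
def layout_area_um2(width_um, height_um):
    """Area of a rectangular layout in square micrometres."""
    return width_um * height_um


old_area = layout_area_um2(170, 210)  # larger layout quoted on the slide
new_area = layout_area_um2(130, 135)  # smaller layout quoted on the slide
ratio = new_area / old_area

print(f"old: {old_area} um^2, new: {new_area} um^2")
print(f"new/old: {ratio:.1%}, reduction: {1 - ratio:.1%}")
```

Under these assumptions the smaller layout occupies roughly half the area of the larger one, which is in line with the saving stated on the slide.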


Prediction of the performance improvement by the different generation processes

[Figure: Propagation delay and power consumption comparisons between different processes, plotted against HBT generation (5HP, 7HP, 8T, 9HP?). Delay labels: 130 ps, 100 ps, 42 ps, 30 ps???. Power labels: 71 mW, 52 mW, 13.8 mW, 4.2 mW.]


Information for the old, new, and future Basic Cells (BC)

Cell                                   Process  Vcc, Vee   Current trees  Iref    Power (PON)  Tp
BC-I                                   5HP      0, -3.4 V  30             0.7 mA  71.4 mW      239 ps
Basic Cell-II                          7HP      0, -2.8 V  21             0.8 mA  47.04 mW     100 ps
Basic Cell-II (high performance case)  8HP      0, -2.2 V  21             0.7 mA  32.34 mW     42 ps
Basic Cell-II (power saving case)      8HP      0, -2.2 V  21             0.3 mA  13.86 mW     75 ps

1. The 8HP cases (high performance case and power saving case) are based on simulations.

2. The difference between the 8HP cases is that the high performance case has its transistors set to the maximum cutoff frequency, while the transistors in the power saving case are set to match the maximum cutoff frequency of the 7HP process.
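The power column in the table above is consistent with a simple product of supply magnitude, number of current trees, and reference current. The sketch below checks that relation against the four rows; the formula is inferred from the tabulated numbers rather than stated on the slide, and the function name is illustrative.

```python
def basic_cell_power_mw(vee_volts, current_trees, iref_ma):
    """Fully-on power estimate: |Vee| * trees * Iref (V * mA gives mW)."""
    return abs(vee_volts) * current_trees * iref_ma


# (Vee in V, number of current trees, Iref in mA), copied from the table
cells = {
    "BC-I (5HP)": (-3.4, 30, 0.7),
    "Basic Cell-II (7HP)": (-2.8, 21, 0.8),
    "Basic Cell-II (8HP, high performance)": (-2.2, 21, 0.7),
    "Basic Cell-II (8HP, power saving)": (-2.2, 21, 0.3),
}

for name, (vee, trees, iref) in cells.items():
    print(f"{name}: {basic_cell_power_mw(vee, trees, iref):.2f} mW")
```

Each computed value matches the Power (PON) entry for its row, so the table appears to assume every current tree draws Iref from the Vee supply when the cell is fully turned on.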


Test circuit: Four-stage Basic Cell ring oscillator

[Figure: Measurement result of the 5HP Basic Cell]

[Figure: Measurement result of the 7HP ring oscillator]
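A ring oscillator is commonly used to extract a per-stage delay from a measured oscillation frequency. The sketch below uses the standard first-order relation f = 1 / (2 N tp); it assumes the loop has a net inversion and that every stage contributes the same delay. The measured frequencies are not in this transcript, so the example values are illustrative inputs only.

```python
def ring_oscillator_frequency_hz(stages, stage_delay_s):
    """First-order oscillation frequency of an N-stage ring: 1 / (2 * N * tp)."""
    return 1.0 / (2 * stages * stage_delay_s)


def stage_delay_from_frequency_s(stages, frequency_hz):
    """Inverse relation: per-stage delay implied by a measured frequency."""
    return 1.0 / (2 * stages * frequency_hz)


# Illustrative input: a four-stage ring with 100 ps per stage
f_hz = ring_oscillator_frequency_hz(4, 100e-12)
print(f"estimated frequency: {f_hz / 1e9:.2f} GHz")

# Round trip: recover the per-stage delay from that frequency
tp_s = stage_delay_from_frequency_s(4, f_hz)
print(f"recovered stage delay: {tp_s * 1e12:.0f} ps")
```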


Power-saving scheme: Basic Cell

Design                                    Tree #        Usage
BC Maximum Usage                          21            100%
Case I (Comb./Sequential Logic)           10/12         47.6%/57.1%
Case II: Sequential, One Redir.           15            71.4%
         Sequential, Two Redir.           18            85.7%
         Sequential, Three Redir.         21            100%
Case III: Redirect function only          3 trees/dir   14.2%/dir

Power-saving scheme usage [12]
Case I: Only combinational logic or sequential logic is used.
Case II: Sequential logic and redirection function are used.
Case III: Only redirection function is used.
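The usage percentages in the power-saving table follow from dividing the number of active current trees by the 21 trees of a fully used Basic Cell. A minimal sketch of that calculation, with illustrative names that are not from the slides:

```python
TOTAL_TREES = 21  # current trees in a fully used Basic Cell (from the table)


def tree_usage(active_trees, total_trees=TOTAL_TREES):
    """Fraction of current trees that stay powered for a given configuration."""
    return active_trees / total_trees


configurations = {
    "Case I, combinational logic": 10,
    "Case I, sequential logic": 12,
    "Case II, one redirection": 15,
    "Case II, two redirections": 18,
    "Case II, three redirections": 21,
}

for name, trees in configurations.items():
    print(f"{name}: {trees} trees, {tree_usage(trees):.1%} usage")
```

Trees that are not needed for a configuration can be switched off, so a lower usage fraction corresponds directly to the power saved in that case.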


Summary: Basic Cell

• Layout size has been reduced by 49%. With the latest Basic Cell, a 48x48 Basic Cell array fits in a 7mm x 7mm area.
• Propagation delay has been reduced by 82.5%.
• Power consumption has been reduced by 80.6% (5HP case versus 8HP power saving case) for the fully turned-on case.
• 94% of the power is saved when the power-saving scheme is enabled.
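The delay and power reductions in this summary can be cross-checked against the Basic Cell figures given earlier in the deck (239 ps and 71.4 mW for the 5HP cell; 42 ps for the 8HP high performance case; 13.86 mW for the 8HP power saving case). The sketch below also checks the array footprint, assuming the 130 um x 135 um layout is the latest Basic Cell and ignoring pads and global routing; both of those are assumptions.

```python
def percent_reduction(old, new):
    """Relative reduction from old to new, in percent."""
    return 100.0 * (old - new) / old


delay_reduction = percent_reduction(239.0, 42.0)   # ps, 5HP vs 8HP high performance
power_reduction = percent_reduction(71.4, 13.86)   # mW, 5HP vs 8HP power saving
print(f"delay reduction: {delay_reduction:.1f}%")
print(f"power reduction: {power_reduction:.1f}%")

# Footprint of a 48x48 array of 130 um x 135 um cells, in millimetres
cells_per_side = 48
array_width_mm = cells_per_side * 130 / 1000
array_height_mm = cells_per_side * 135 / 1000
print(f"array core: {array_width_mm:.2f} mm x {array_height_mm:.2f} mm")
print("fits in 7 mm x 7 mm:", array_width_mm <= 7 and array_height_mm <= 7)
```

The computed reductions land close to the percentages quoted in the summary, and the array core comes out below 7 mm on each side, leaving some margin for I/O.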


High speed reconfigurable system

[Figure: Block diagram of the high speed reconfigurable system. Blocks: high speed front end, high speed back end, SiGe FPGA, CMOS FPGA, interleaving block, de-interleaving block. Data paths: high speed inputs, interleaving data path, de-interleaving data path, high speed outputs, and a connection to processors or other circuits. Frequency ranges labeled on the diagram: 10GHz ~ 80GHz, 500MHz ~ 10GHz, and 100MHz ~ 700MHz.]

DSP and other applications, such as poly-phase filters, digital filters, etc.


Application: High speed data acquisition system (MUX-DEMUX)

• The SiGe FPGA can be configured for DSP and other applications.
• To compare the performance of the SiGe and CMOS FPGAs, the SiGe FPGA is configured as a 4:1 MUX and a 1:4 DEMUX.
• The results can be used to demonstrate its interleaving and de-interleaving functions.


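High speed data acquisition MUX-DEMUX

[Figure: The building blocks of the 1:4 DEMUX. Three 1:2 DEMUX blocks and two ½ DIV clock dividers; inputs Data and CLK; outputs D1/D1B, D2/D2B, D3/D3B, D4/D4B and ¼ CLK out.]

[Figure: The timing diagram of the 4:1 MUX (x represents 1, 2, 3 and 4). Data inputs CHx have bit period 4T; the output carries the sequence CH1 CH2 CH3 CH4 with bit period T.]

[Figure: The timing diagram of the 1:4 DEMUX. The 1:4 DEMUX input carries the sequence CH1 CH3 CH2 CH4 with bit period T; the outputs CHx have bit period 4T.]

[Figure: The block diagram of the 4:1 MUX. Three 2:1 MUX blocks and a 1/2 DIV clock divider; inputs CH1/CH1B, CH2/CH2B, CH3/CH3B, CH4/CH4B and CLK/CLKB; internal 1/2 CLK and 1/2 CLKB; outputs Output/OutputB.]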
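The block diagrams on this slide build the 4:1 MUX from three 2:1 MUXes and the 1:4 DEMUX from three 1:2 DEMUXes. The sketch below is a purely behavioral model of that tree structure on lists of samples; it ignores clocking, differential signaling, and latency, and the function names are illustrative. It also shows how a two-level tree naturally produces the channel order 1, 3, 2, 4 at its outputs, which may explain the CH1 CH3 CH2 CH4 ordering in the DEMUX timing diagram.

```python
def demux_1to2(stream):
    """Split a stream into its even-indexed and odd-indexed samples."""
    return stream[0::2], stream[1::2]


def demux_1to4(stream):
    """Two-level tree of 1:2 DEMUXes. Outputs appear in tree order:
    samples 0,4,8,...; 2,6,10,...; 1,5,9,...; 3,7,11,..."""
    even, odd = demux_1to2(stream)
    first, third = demux_1to2(even)
    second, fourth = demux_1to2(odd)
    return first, third, second, fourth


def mux_2to1(a, b):
    """Interleave two equal-rate streams sample by sample."""
    out = []
    for x, y in zip(a, b):
        out.extend([x, y])
    return out


def mux_4to1(ch1, ch2, ch3, ch4):
    """Two-level tree of 2:1 MUXes giving the order CH1, CH2, CH3, CH4."""
    return mux_2to1(mux_2to1(ch1, ch3), mux_2to1(ch2, ch4))


data = list(range(8))
first, third, second, fourth = demux_1to4(data)
print(first, third, second, fourth)
print(mux_4to1(first, second, third, fourth))
```

Feeding the four DEMUX outputs back into the MUX in channel order reproduces the original stream, which is the interleaving and de-interleaving round trip the application relies on.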


Layout of the 4:1 MUX and 1:4 DEMUX implemented with the SiGe Basic Cells

[Figure: Layout of the 4:1 MUX]

[Figure: Layout of the 1:4 DEMUX]

Simulation results show that both the 4:1 MUX and the 1:4 DEMUX can operate at up to 10GHz.

By comparison, the same circuits on a CMOS FPGA (Xilinx Virtex) run at up to 183MHz.


Simulation results (MUX)

[Figure: Simulated eye diagram of the 4:1 MUX programmed on the SiGe FPGA, running at 10Gbps]

[Figure: Simulation result of the 4:1 MUX]
Inputs: CH_A: 1010011, CH_B: 0010100, CH_C: 0101001 and CH_D: 0001010.
Output: 0010-1000-0011-1100-0001-0110.


SiGe and CMOS FPGA performance comparisons

Circuit              Tx rate                     Power (mW)  Used CLB
4:1 MUX (SiGe)       10Gbps                      258.3       7
4:1 MUX (Virtex)     170MHz                      61          7
1:4 DEMUX (SiGe)     2.5Gbps (Input: 10Gbps)     194.52      8
1:4 DEMUX (Virtex)   45.5MHz (Input: 182MHz)     91          8

Virtex results are based on the following environment:
Software: Foundation 2.1
Xilinx power consumption worksheet V1.5
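One way to read the comparison table is in terms of energy per bit, which puts the higher SiGe power in the context of its much higher throughput. The sketch below is a derived metric that does not appear on the slides. It assumes one bit per clock cycle for the Virtex MHz figures and uses the aggregate serial rate (MUX output, DEMUX input) for each circuit.

```python
def energy_per_bit_pj(power_mw, rate_mbps):
    """Energy per bit in picojoules: mW / Mbps gives nJ/bit, times 1000 for pJ."""
    return power_mw / rate_mbps * 1000.0


# (power in mW, aggregate serial rate in Mbps), taken from the table
circuits = {
    "4:1 MUX (SiGe)": (258.3, 10_000),
    "4:1 MUX (Virtex)": (61.0, 170),
    "1:4 DEMUX (SiGe)": (194.52, 10_000),
    "1:4 DEMUX (Virtex)": (91.0, 182),
}

for name, (power_mw, rate_mbps) in circuits.items():
    print(f"{name}: {energy_per_bit_pj(power_mw, rate_mbps):.2f} pJ/bit")

# Throughput ratio between the SiGe and Virtex MUX implementations
mux_speedup = 10_000 / 170
print(f"MUX throughput ratio: {mux_speedup:.1f}x")
```

Under these assumptions the SiGe circuits draw more total power but spend considerably less energy per transferred bit than the Virtex implementations.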


Larger scale SiGe FPGA

• A 20x20 Basic Cell array (400 Basic Cells), with dimensions of 7mm x 7mm, is fabricated by IBM (7HP).

• A 48x48 Basic Cell array (8HP) is developed with the high speed ADC integrated.

[Figure: Chip layout with dimensions labeled 5.8mm and 7mm]


Conclusion

• The performance of the SiGe FPGA can reach up to 20GHz (8HP generation).

• The layout has been reduced by 49% between the 8HP and 5HP generations.

• Applications have been proposed to run in the GHz range.

• A 4:1 MUX and a 1:4 DEMUX have been configured to compare the performance of the SiGe and CMOS FPGAs.


Future work

• Test the 20x20 Basic Cell array that has been fabricated by IBM (7HP).

• Develop a high speed data acquisition system.

• Implement DSP applications, such as software radar, poly-phase filtering, etc.

• 10GHz and 20GHz SiGe FPGA.

• Integrate with high speed front-end and back-end circuits.

[Figure: Primitive layout of the 48x48 SiGe FPGA]