l11/12: reconfigurable logic architecturesweb.mit.edu/6.111/www/s2004/lectures/l11-12.pdf ·...

35
L11/12: 6.111 Spring 2004 1 Introductory Digital Systems Laboratory L11/12: Reconfigurable Logic L11/12: Reconfigurable Logic Architectures Architectures Acknowledgements: R. Katz, “Contemporary Logic Design”, Addison Wesley Publishing Company, Reading, MA, 1993. Frank Honore

Upload: others

Post on 11-Jun-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 1Introductory Digital Systems Laboratory

L11/12: Reconfigurable Logic L11/12: Reconfigurable Logic ArchitecturesArchitectures

Acknowledgements:

R. Katz, “Contemporary Logic Design”, Addison Wesley Publishing Company, Reading, MA, 1993.

Frank Honore

Page 2: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 2Introductory Digital Systems Laboratory

History of Computational FabricsHistory of Computational Fabrics

Discrete devices: relays, transistors (1940s-50s)Discrete logic gates (1950s-60s)Integrated circuits (1960s-70s)

e.g. TTL packages: Data Book for 100’s of different parts

Gate Arrays (IBM 1970s)Transistors are pre-placed on the chip & Place and Route software puts the chip together automatically – only program the interconnect (mask programming)

Software Based Schemes (1970’s- present)Run instructions on a general purpose core

ASIC Design (1980’s to present)Turn Verilog directly into layout using a library of standard cells Effective for high-volume and efficient use of silicon area

Programmable Logic (1980’s to present)A chip that be reprogrammed after it has been fabricatedExamples: PALs, EPROM, EEPROM, PLDs, FPGAsExcellent support for mapping from Verilog

Page 3: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 3Introductory Digital Systems Laboratory

Reconfigurable LogicReconfigurable Logic

Logic blocksTo implement combinationaland sequential logic

InterconnectWires to connect inputs andoutputs to logic blocks

I/O blocksSpecial logic blocks at periphery of device forexternal connections

Key questions:How to make logic blocks programmable?(after chip has been fabbed!)What should the logic granularity be?How to make the wires programmable?(after chip has been fabbed!)Specialized wiring structures for localvs. long distance routes?How many wires per logic block?

LogicLogic

Configuration

Inputs Outputsn m

Q

QSET

CLR

D

Page 4: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 4Introductory Digital Systems Laboratory

Programmable Array Logic (PAL)Programmable Array Logic (PAL)

Based on the fact that any combinational logic can be realized as a sum-of-productsPALs feature an array of AND-OR gates with programmable interconnect

inputsignals

outputsignals

programming of product terms

programming of sum terms

ANDarray OR array

Page 5: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 5Introductory Digital Systems Laboratory

Inside the 22v10 PALInside the 22v10 PAL

Each input pin (and its complement) sent to the AND arrayOR gates for each output can take 8-16 product terms, depending on output pin“Macrocell” block provides additional output flexibility...

Page 6: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 6Introductory Digital Systems Laboratory

Inside the 22v10 “Inside the 22v10 “MacrocellMacrocell” Block” Block

Outputs may be registered or combinational, positive or invertedRegistered output may be fed back to AND array for FSMs, etc.

From Lattice Semiconductor

b. Combinational/active low

d. Combinational/active high

Combinational/active low

Combinational/active high

Page 7: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 7Introductory Digital Systems Laboratory

Cypress PAL CE22V10Cypress PAL CE22V10

Page 8: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 8Introductory Digital Systems Laboratory

AntiAnti--FuseFuse--Based Approach (Based Approach (ActelActel))

Rows of programmablelogic building blocks

+

rows of interconnect

Anti-fuse Technology:Program Once

8 input, single output combinational logic blocks

FFs constructed from discrete cross coupled gates

Use Anti-fuses to buildup long wiring runs from

short segmentsI/O Buffers, Programming and Test Logic

Logic Module Wiring Tracks

I/O Buffers, Programming and Test Logic

I/O B

uffe

rs, P

rogr

amm

ing

and

Test

Log

ic

I/O B

uffers, Programm

ing and Test Logic

Page 9: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 9Introductory Digital Systems Laboratory

ActelActel Logic ModuleLogic Module

Combinational block does not have the output FFExample Gate Mapping

00011011

GNDA

BC

DE

GND

GND

VDD

Y

R

S

VDDQ

S-R Flip-Flop

00011011

Page 10: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 10Introductory Digital Systems Laboratory

ActelActel Routing & ProgrammingRouting & Programming

Logic Module

Output SegmentsLong Vertical Tracks

Input Segments

Outputs

Inputs

HorizontalChannel

Vpp

Vpp/2

Vpp/2Gnd

Programming an Antifuse

Antifuseshorted

Vpp/2

Vpp/2Vpp/2

Vpp/2

PrechargePhase

Page 11: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 11Introductory Digital Systems Laboratory

RAM Based Field Programmable RAM Based Field Programmable Logic Logic -- XilinxXilinx

CLB

CLB

CLB

CLB

SwitchMatrix

ProgrammableInterconnect I/O Blocks (IOBs)

ConfigurableLogic Blocks (CLBs)

D Q

SlewRate

Control

PassivePull-Up,

Pull-Down

Delay

Vcc

OutputBuffer

InputBuffer

Q D

Pad

D QSD

RDEC

S/RControl

D QSD

RDEC

S/RControl

1

1

F'G'

H'

DIN

F'G'

H'

DIN

F'

G'H'

H'

HFunc.Gen.

GFunc.Gen.

FFunc.Gen.

G4G3G2G1

F4F3F2F1

C4C1 C2 C3

K

Y

X

H1 DIN S/R EC

Page 12: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 12Introductory Digital Systems Laboratory

The The XilinxXilinx 4000 CLB4000 CLB

Page 13: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 13Introductory Digital Systems Laboratory

Two 4Two 4--input Functions, Registered Outputinput Functions, Registered Outputand a Two Input Functionand a Two Input Function

Page 14: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 14Introductory Digital Systems Laboratory

55--input Function, Combinational Outputinput Function, Combinational Output

Page 15: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 15Introductory Digital Systems Laboratory

LUT MappingLUT Mapping

N-LUT direct implementation of a truth table: any function of n-inputs.N-LUT requires 2N storage elements (latches)N-inputs select one latch location (like a memory)

4LUT example

Latches set by configuration bitstream

Inputs

Output

Why Latches and Not Registers?

Page 16: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 16Introductory Digital Systems Laboratory

Configuring the CLB as a RAMConfiguring the CLB as a RAM

Memory is built using Latches not FFs

Read is same a LUT Function!

16x2

Page 17: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 17Introductory Digital Systems Laboratory

XilinxXilinx 4000 Interconnect4000 Interconnect

Page 18: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 18Introductory Digital Systems Laboratory

XilinxXilinx 4000 Interconnect Details4000 Interconnect Details

Wires are not ideal!

Page 19: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 19Introductory Digital Systems Laboratory

Add Bells & WhistlesAdd Bells & Whistles

HardProcessor

I/O

BRAM

Gigabit Serial

Multiplier

ProgrammableTermination

Z

VCCIO

Z

Z

ImpedanceControl Clock

Mgmt

18 Bit

18 Bit36 Bit

Courtesy of David B. Parlour, ISSCC 2004 Tutorial, “The Reality and Promise of Reconfigurable Computing in Digital Signal Processing”

Page 20: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 20Introductory Digital Systems Laboratory

XilinxXilinx 4000 Flexible IOB4000 Flexible IOB

Adjust Transition Time

Adjust the Sampling Edge

Outputs through FF or bypassed

Page 21: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 21Introductory Digital Systems Laboratory

The The VirtexVirtex II CLB (Half Slice Shown)II CLB (Half Slice Shown)

Page 22: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 22Introductory Digital Systems Laboratory

Adder ImplementationAdder Implementation

Y = A ⊕ B ⊕ CinAB

Cin

CoutLUT: A⊕B

1 half-Slice = 1-bit adder

Dedicated carry logic

Page 23: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 23Introductory Digital Systems Laboratory

Carry ChainCarry Chain

1 CLB = 4 Slices = 2, 4-bit adders

64-bit Adder: 16 CLBs

+

CLB15

CLB0A[3:0]B[3:0]

A[63:60]B[63:60]

A[63:0]

B[63:0]Y[63:0]

Y[3:0]

Y[63:60]

Y[64]

CLBs must be in same column

CLB1A[7:4]B[7:4] Y[7:4]

Page 24: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 24Introductory Digital Systems Laboratory

VirtexVirtex II FeaturesII Features

Double Data Rate registers Digital Clock Manager

Embedded MultiplierBlock SelectRAM

Page 25: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 25Introductory Digital Systems Laboratory

The Latest Generation: The Latest Generation: VirtexVirtex--II ProII Pro

Courtesy XilinxHigh-speed I/O

Embedded PowerPc

Embedded memories

Hardwired multipliers

FPGA Fabric

Page 26: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 26Introductory Digital Systems Laboratory

AlteraAltera FLEX 10K FamilyFLEX 10K Family

8 LE’s per LAB

S R A M - based programming

Page 27: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 27Introductory Digital Systems Laboratory

AlteraAltera Logic ElementLogic Element

Th

Page 28: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 28Introductory Digital Systems Laboratory

FLEX 10K Logic Array BlockFLEX 10K Logic Array Block

FLEX 10K70: 9 rows (312 chan/row),

52 columns (24 chan/col)

Page 29: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 29Introductory Digital Systems Laboratory

FLEX 10K Embedded Array BlockFLEX 10K Embedded Array Block

Page 30: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 30Introductory Digital Systems Laboratory

Altera’sAltera’s New New StratixStratix ArchitectureArchitecture

Embedded DSP feature: 9x9, 18x18, 36x36 with 52-bit accumulatorUp to 11,310 LE’s, 10Mbits RAM10 LE’s per LAB

Page 31: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 31Introductory Digital Systems Laboratory

Design Flow Design Flow -- MappingMapping

Technology Mapping: Schematic/HDL to Physical Logic unitsCompile functions into basic LUT-based groups (function of target architecture)

always @(posedge Clock or negedge Reset)beginif (! Reset)

q <= 0;else

q <= (a & b & c) | (b & d);end

Q

QSET

CLR

D

LUTQ

QSET

CLR

D

abc

db

Page 32: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 32Introductory Digital Systems Laboratory

Design Flow Design Flow –– Placement & RoutePlacement & Route

Placement – assign logic location on a particular device

LUT

LUT

LUT

Routing – iterative process to connect CLB inputs/outputs and IOBs. Optimizes critical path delay – can take hours or days for large, dense designs

Iterate placement if timing not met

Satisfy timing? Generate Bitstream to config device

Challenge! Cannot use full chip for reasonable speeds (wires are not ideal). Typically no more than 50% utilization.

Page 33: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 33Introductory Digital Systems Laboratory

Example: Example: VerilogVerilog to FPGAto FPGA

module adder64 (a, b, sum); input [63:0] a, b; output [63:0] sum;

assign sum = a + b;endmodule

Virtex II – XC2V2000

• Synthesis• Tech Map• Place&Route

64-bit Adder Example

Page 34: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 34Introductory Digital Systems Laboratory

How are How are FPGAsFPGAs Used?Used?

(courtesy of IKOS)FPGA-based Emulator

Logic EmulationPrototyping

Ensemble of gate arrays used to emulate a circuit to be manufacturedGet more/better/faster debugging done than with simulation

Reconfigurable hardwareOne hardware block used to implement more than one function

Special-purpose computation engines

Hardware dedicated to solving one problem (or class of problems)Accelerators attached to general-purpose computers (e.g., in a cell phone!)

Page 35: L11/12: Reconfigurable Logic Architecturesweb.mit.edu/6.111/www/s2004/LECTURES/l11-12.pdf · L11/12: 6.111 Spring 2004 Introductory Digital Systems Laboratory 2 History of Computational

L11/12: 6.111 Spring 2004 35Introductory Digital Systems Laboratory

SummarySummary

FPGA provide a flexible platform for implementing digital computingA rich set of macros and I/Os supported (multipliers, block RAMS, ROMS, high-speed I/O)A wide range of applications from prototyping (to validate a design before ASIC mapping) to high-performance spatial computingInterconnects are a major bottleneck (physical design and locality are important considerations)

“College students will study concurrent programming instead of “C” as their first

computing experience.”

-- David B. Parlour, ISSCC 2004 Tutorial