ee109 fpgas and memories


Post on 22-Dec-2021


Page 1: EE109 FPGAs and Memories

18.1

Unit 18

Field Programmable Gate Arrays (FPGAs)

Implementing Logic Functions with Memories

18.2

HARDWARE IMPLEMENTATION TARGETS

18.3

Processing Logic Approaches

• Recall HW/SW designs sit on a continuum

• Suppose I want to implement: F = (X+Y)*(A+B)

• Custom Hardware (Faster, Less Power)

– Logic that directly implements a specific task

– Example above may use separate adders and a multiplier unit

• General Purpose (GP) Processor/Microcontroller (Design Time, Cost)

– Logic designed to execute SW instructions

– Provides basic processing resources that are reused by each instruction

• What if I want to perform: (X*Y) + (A*B)

– What's easiest to redesign?
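To make the redesign question concrete, here is a minimal sketch of the GP processor approach, assuming a hypothetical three-operand mini-ISA modeled on the ADD T,X,Y / ADD S,A,B / MUL F,T,S sequence shown for the processor. The `run` function, the `OPS` table, and the sample data values are illustrative assumptions, not part of the course material.

```python
# Hypothetical mini-ISA: each instruction is (op, dest, src1, src2).
OPS = {
    "ADD": lambda a, b: a + b,
    "MUL": lambda a, b: a * b,
}

def run(program, memory):
    """Execute each instruction in order, reusing the same ADD/MUL units."""
    mem = dict(memory)  # copy so the caller's data is untouched
    for op, dest, src1, src2 in program:
        mem[dest] = OPS[op](mem[src1], mem[src2])
    return mem

data = {"X": 2, "Y": 3, "A": 4, "B": 5}  # sample values (assumed)

# F = (X+Y)*(A+B)
prog_sum_then_mul = [("ADD", "T", "X", "Y"), ("ADD", "S", "A", "B"), ("MUL", "F", "T", "S")]

# F = (X*Y)+(A*B): same "hardware", only the instructions change
prog_mul_then_sum = [("MUL", "T", "X", "Y"), ("MUL", "S", "A", "B"), ("ADD", "F", "T", "S")]

f1 = run(prog_sum_then_mul, data)["F"]  # (2+3)*(4+5) = 45
f2 = run(prog_mul_then_sum, data)["F"]  # (2*3)+(4*5) = 26
```

In this model, switching functions only means swapping the instruction list. The custom hardware version would need two multipliers and an adder wired differently.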

[Figure: Custom HW implementation of F = (X+Y)*(A+B): X and Y feed one adder, A and B feed a second adder, and both sums feed a multiplier that produces F.]

[Figure: GP processor implementation of (X+Y)*(A+B): a processor with CPU control and shared + and * units, data in memory, and an instruction store holding ADD T,X,Y; ADD S,A,B; MUL F,T,S.]

[Figure: Computing System Continuum, from application specific hardware (no software) to a processor executing software; axes labeled Flexibility, Design Time and Performance, Cost.]

18.5

ASICs

• An Application Specific Integrated Circuit (ASIC) is another name for a typical "chip"

• Computer engineers determine the gates and their interconnections that perform a specific task/application

– Start with high level "behavioral" description

– Use CAD software tools to refine that to logic gates

– Use CAD software tools to refine that to transistors and where each should be located on the surface of the chip and how they should be wired together

– From there the chip is fabricated and mass-produced

• Design process is expensive, and once fabricated the design cannot be changed (but it is fast and uses less power)

In an ASIC design, a unique chip will be manufactured that implements our design, at which point the HW design is fixed & cannot be changed (example: Pentium, etc.)

18.6

ASICs

18.7

Motivation for Reconfigurable Logic

• Could we get some of the benefits of both hardware (speed/power) AND software (flexible/reusable)?

• Yes…enter Field Programmable Gate Arrays (FPGAs)

– Has prebuilt, generic hardware constructs that can be configured and interconnected based on one design and then reconfigured and interconnected later for another design

• Let's learn more about the secret ingredient to FPGAs…memories!

[Figure: Computing System Continuum: application specific hardware (no software / custom chip), reconfigurable hardware (FPGAs), microcontroller/processor executing software.]

FPGAs have "logic resources" on them that we can configure to implement our specific design. We can then reconfigure them to implement another design.

18.8

Where are FPGAs Used

• Datacenters

– Bing search engine

– Real-time data analytics

– Compression and encryption

– High-frequency trading

• Robots and Rovers

– JPL and the Mars Rovers

• Telecom

• Aerospace

18.9

USING MEMORIES TO BUILD COMBINATIONAL CIRCUITS

18.10

MEMORY BASICS

Dimensions and Operations

18.11

Memories

• Memories store (write) and retrieve (read) data

– Read-Only Memories (ROM’s): Can only retrieve data (contents are initialized and then cannot be changed)

– Read-Write Memories (RWM’s): Can retrieve data and change the contents to store new data


18.12

ROM’s

• Memories are just tables of data with rows and columns

• When data is read, one entire row of data is read out

• The row to be read is selected by putting a binary number on the address inputs

ROM contents (address inputs A2 A1 A0 select the row; data outputs are D3 D2 D1 D0):

Row  D3 D2 D1 D0
0    0  0  1  1
1    1  0  1  0
2    0  1  0  0
3    0  1  1  1
4    1  1  0  1
5    1  0  0  0
6    0  1  1  0
7    1  0  1  1

18.13

ROM’s

• Example

– Address = 4₁₀ = 100₂ is provided as input

– ROM outputs data in that row (1101 bin.)
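The read in this example can be sketched in a few lines of Python. The list holds the eight rows of the example ROM; the name `rom_read` and the choice to pass the address as three separate bits are assumptions made for illustration.

```python
# Contents of the example 8x4 ROM, one 4-bit row (D3..D0) per address.
ROM = [0b0011, 0b1010, 0b0100, 0b0111, 0b1101, 0b1000, 0b0110, 0b1011]

def rom_read(a2, a1, a0):
    """Combine the three address bits into a row number and return that row."""
    address = (a2 << 2) | (a1 << 1) | a0
    return ROM[address]

row = rom_read(1, 0, 0)      # address 100 (binary) = 4
bits = format(row, "04b")    # "1101"
```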

[Figure: the same 8x4 ROM with address inputs A2 A1 A0 = 100₂ = 4₁₀; row 4 (1 1 0 1) is output on D3 D2 D1 D0.]

18.14

Memory Dimensions

• Memories are named by their dimensions:

– Rows x Columns

• n rows and m columns => n x m ROM

• n rows => log₂n address bits…or…2ᵏ rows => k address bits

• m cols => m data outputs
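As a quick check of the dimension rules, the sketch below computes the number of address bits for a given row count. The helper names `address_bits` and `describe` are hypothetical. For row counts that are not a power of two it rounds up, which is an assumption beyond what the slide states.

```python
def address_bits(rows):
    """Smallest k with 2**k >= rows (rows must be at least 1)."""
    return (rows - 1).bit_length()

def describe(rows, cols):
    """One-line summary of a rows x cols memory."""
    return f"{rows}x{cols} memory: {address_bits(rows)} address bits, {cols} data outputs"

summary = describe(8, 4)  # the 8x4 ROM used in the earlier example
```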

[Figure: generic ROM with address inputs A(n-1) … A0, rows numbered 0 to 2ⁿ-1, and data outputs D(m-1) … D0.]

18.15

RWM’s

• Writable memories provide a set of data inputs for write data (as opposed to the data outputs for read data)

• A control signal R/W (1=READ / 0 = WRITE) is provided to tell the memory what operation the user wants to perform

[Figure: 8x4 RWM with address inputs A2 A1 A0, data inputs DI3 DI2 DI1 DI0, data outputs DO3 DO2 DO1 DO0, and an R/W control input; it holds the same eight rows as the ROM example.]

18.16

RWM’s

• Write example

– Address = 3₁₀ = 011₂

– DI = 12₁₀ = 1100₂

– R/W = 0 => Write op.

• Data in row 3 is overwritten with the new value of 1100₂.
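A behavioral sketch of this write, assuming a hypothetical `RWM` class with a single `access` method. It models only the stored contents, not timing. Returning `None` during a write is a modeling assumption that stands in for data outputs that are not meaningful while writing.

```python
class RWM:
    """Behavioral model of a small read/write memory (no timing)."""

    def __init__(self, contents):
        self.rows = list(contents)

    def access(self, address, rw, data_in=0):
        """rw = 1 reads the addressed row; rw = 0 overwrites it with data_in."""
        if rw == 0:
            self.rows[address] = data_in
            return None  # assumption: outputs are treated as undefined during a write
        return self.rows[address]

mem = RWM([0b0011, 0b1010, 0b0100, 0b0111, 0b1101, 0b1000, 0b0110, 0b1011])
before = mem.access(0b011, rw=1)           # row 3 initially holds 0111
mem.access(0b011, rw=0, data_in=0b1100)    # write 12 into row 3
after = mem.access(0b011, rw=1)            # row 3 now holds 1100
```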

[Figure: the 8x4 RWM during the write: address inputs A2 A1 A0 = 011, data inputs DI3 DI2 DI1 DI0 = 1100, R/W = 0; row 3 (previously 0 1 1 1) becomes 1 1 0 0, and the data outputs are shown as "? ? ? ?".]

18.17

USING MEMORIES TO BUILD COMBINATIONAL FUNCTIONS

Look-up tables…


18.18

Memories as Look-Up Tables

• One major application of memories in digital design is to use them as LUT’s (Look-Up Tables) to implement logic functions

– This is the core technology used by FPGAs (Field-Programmable Gate Arrays)

• Idea: Use a memory to hold the truth table of a function and feed the inputs of the function to the address inputs to "look-up" the answer


18.19

Implementing Functions w/ Memories

[Figure: an 8x1 memory implementing an arbitrary logic function; X, Y, Z connect to address inputs A2, A1, A0 and F is read from data output D0.]

Arbitrary Logic Function

X Y Z F
0 0 0 1
0 0 1 0
0 1 0 1
0 1 1 1
1 0 0 0
1 0 1 0
1 1 0 0
1 1 1 1

The X,Y,Z inputs “look up” the correct answer: for example, X Y Z = 1 0 1 selects row 5, so F = 0.

Use a memory with the same dimensions as the 'output' side of the truth table.

It's almost TOO easy.
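The look-up idea for this function can be sketched as follows, with the F column of the truth table stored as an 8-entry list. The name `lookup_f` is an illustrative assumption. Connecting X, Y, Z to A2, A1, A0 follows the figure.

```python
# Output column F of the truth table, indexed by the row number XYZ.
F_LUT = [1, 0, 1, 1, 0, 0, 0, 1]

def lookup_f(x, y, z):
    """Treat X, Y, Z as address bits A2, A1, A0 and read the stored answer."""
    return F_LUT[(x << 2) | (y << 1) | z]

# Sweep all input combinations in truth-table order.
outputs = [lookup_f(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
```

No gates are derived here: changing the function only means changing the eight stored bits.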


18.20

Implementing Functions w/ Memories

[Figure: an 8x2 memory implementing a multi-bit function; X, Y, Z connect to address inputs A2, A1, A0; C is read from D1 and S from D0.]

Multi-bit function (One's count)

X Y Z C S
0 0 0 0 0
0 0 1 0 1
0 1 0 0 1
0 1 1 1 0
1 0 0 0 1
1 0 1 1 0
1 1 0 1 0
1 1 1 1 1

Example: X Y Z = 1 0 1 selects row 5, and 1+0+1 = 10₂, so C S = 1 0.

Use a memory with the same dimensions as the 'output' side of the truth table.

It's almost TOO easy.
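For a multi-bit function such as the one's count, the memory contents can also be generated from a description of the function rather than typed in by hand. This is a small sketch; the function name `ones_count_rom` is an assumption.

```python
def ones_count_rom():
    """Row i stores the number of 1s in the 3-bit address i, as the 2-bit value C S."""
    return [format(bin(i).count("1"), "02b") for i in range(8)]

ONES_ROM = ones_count_rom()
cs_for_101 = ONES_ROM[0b101]  # "10": 1 + 0 + 1 = 2
```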


18.21

3-bit Squaring Circuit

• Q: What size memory would you use to build our 3-bit squaring circuit?

• A: 8x6 memory

• Q: What would you connect to the address inputs of the memory?

• A: A[2:0]

• Q: What bits would you program into row 5 of the memory?

• A: 011001 (i.e. 25 = 5²)

Inputs Outputs

A A2 A1 A0 B5 B4 B3 B2 B1 B0 B=A²

0 0 0 0 0 0 0 0 0 0 0

1 0 0 1 0 0 0 0 0 1 1

2 0 1 0 0 0 0 1 0 0 4

3 0 1 1 0 0 1 0 0 1 9

4 1 0 0 0 1 0 0 0 0 16

5 1 0 1 0 1 1 0 0 1 25

6 1 1 0 1 0 0 1 0 0 36

7 1 1 1 1 1 0 0 0 1 49
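The memory contents in this table can be reproduced with a short sketch. The names `SQUARE_ROM` and `rom_rows` are illustrative assumptions.

```python
SQUARE_ROM = [a * a for a in range(8)]  # 8 rows; the largest value is 49

def rom_rows():
    """Return each row as a 6-bit string B5..B0."""
    return [format(value, "06b") for value in SQUARE_ROM]

rows = rom_rows()
row5 = rows[5]  # "011001", i.e. 25
```

Six output bits are enough because the largest entry, 49, is below 2**6 = 64.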

Memory Contents to build 3-bit Squaring Circuit

18.22

4x4 Multiplier Example

Determine the dimensions of the memory that would be necessary to implement a 4x4-bit unsigned multiplier with inputs X[3:0] and Y[3:0] and outputs P[??:0]

Question: How many bits are needed for P?

Question: What are the contents of the numbered rows?

Example:

X3X2X1X0 = 0010

Y3Y2Y1Y0 = 0001

P = X * Y = 2 * 1 = 2 = 00000010

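[Figure: ROM with X3 X2 X1 X0 connected to address inputs A7 A6 A5 A4, Y3 Y2 Y1 Y0 connected to A3 A2 A1 A0, rows numbered 0 to 255, and data outputs P7 … P0.]

Row  Address (X Y)  Contents (P7…P0)  Product
0    0000 0000      0 0 0 0 0 0 0 0   0*0 = 0
2    0000 0010      0 0 0 0 0 0 0 0   0*2 = 0
20   0001 0100      0 0 0 0 0 1 0 0   1*4 = 4
39   0010 0111      0 0 0 0 1 1 1 0   2*7 = 14
255  1111 1111      1 1 1 0 0 0 0 1   15*15 = 225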
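The multiplier ROM can be generated and checked with a short sketch, assuming X occupies address bits A7..A4 and Y occupies A3..A0 as in the figure. The function name `build_multiplier_rom` is hypothetical.

```python
def build_multiplier_rom():
    """256 rows; address bits A7..A4 hold X and A3..A0 hold Y."""
    rom = [0] * 256
    for x in range(16):
        for y in range(16):
            rom[(x << 4) | y] = x * y
    return rom

MULT_ROM = build_multiplier_rom()
product_bits = max(MULT_ROM).bit_length()  # bits needed for 15 * 15 = 225
row20 = format(MULT_ROM[20], "08b")        # X = 0001, Y = 0100
row39 = format(MULT_ROM[39], "08b")        # X = 0010, Y = 0111
row255 = format(MULT_ROM[255], "08b")      # X = 1111, Y = 1111
```

With 8 address bits and 8 product bits, this sketch corresponds to a 256x8 memory.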


18.23

Implementing Functions w/ Memories

• To implement a function w/ n-variables and m outputs

• Just place the output truth table values in the memory

• Memory will have dimensions: 2ⁿ rows and m columns

– Still does not scale terribly well (i.e. n-inputs requires memory w/ 2ⁿ rows)

– But it is easy and since we can change the contents of memories it allows us to create "reconfigurable" logic

– This idea is at the heart of FPGAs
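The scaling concern can be made concrete with a small calculation of total storage for a single-output function. The helper name `memory_bits` and the sample input counts are assumptions chosen for illustration.

```python
def memory_bits(n_inputs, m_outputs):
    """Total storage, in bits, for a look-up table with 2**n rows and m columns."""
    return (2 ** n_inputs) * m_outputs

# One output bit, growing numbers of inputs.
sizes = {n: memory_bits(n, 1) for n in (3, 8, 16, 32)}
```

Each added input doubles the number of rows. That exponential growth suits small look-up tables better than one large memory.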

18.24

FPGAS

18.25

Basis of FPGA’s

• Memories provide a universal way to implement any combinational logic function

– 2ⁿ x m memory can implement a function of n-variables and m outputs

• If we use RWM (read/write memory) rather than ROM’s we can change what function the memory implements

• Memories are referred to as Look-up Tables (LUT’s)

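[Figure: Full Adder Implementation in an 8x2 memory; X, Y, Cin connect to address inputs A2, A1, A0; Cout is read from D1 and S from D0.]

Row  X Y Cin  Cout S
0    0 0 0    0    0
1    0 0 1    0    1
2    0 1 0    0    1
3    0 1 1    1    0
4    1 0 0    0    1
5    1 0 1    1    0
6    1 1 0    1    0
7    1 1 1    1    1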

18.26

Configurable Logic Blocks (CLB’s)

• The memory allows for any combinational function

• Provided D-FF’s allow designs with sequential logic

– “Bypass” mux selects the pure combinational output of the LUT or the sequential/registered/D-FF output

• Blue boxes indicate configurable bits that control the operation and function of the logic
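A behavioral sketch of such a block, assuming a hypothetical `CLB` class in which the configuration bits are the eight LUT rows plus one bypass setting per output. The boolean encoding of the bypass mux (True = registered output) is an assumption, not the encoding of any real device.

```python
class CLB:
    """Sketch of a CLB: an 8x2 LUT, two D flip-flops, and two bypass muxes."""

    def __init__(self, lut_rows, use_ff):
        self.lut = list(lut_rows)    # 8 rows of (d1, d0): configuration bits
        self.use_ff = tuple(use_ff)  # per output: True = registered, False = combinational
        self.q = [0, 0]              # flip-flop state for D1 and D0

    def outputs(self, a2, a1, a0):
        """Values seen at the CLB outputs for the current inputs."""
        d = self.lut[(a2 << 2) | (a1 << 1) | a0]
        return tuple(self.q[i] if self.use_ff[i] else d[i] for i in range(2))

    def clock(self, a2, a1, a0):
        """Rising clock edge: each flip-flop captures its LUT output."""
        self.q = list(self.lut[(a2 << 2) | (a1 << 1) | a0])

# Full adder contents: each row is (Cout, S).
FULL_ADDER = [(0, 0), (0, 1), (0, 1), (1, 0), (0, 1), (1, 0), (1, 0), (1, 1)]

comb = CLB(FULL_ADDER, use_ff=(False, False))
comb_out = comb.outputs(1, 1, 0)    # combinational: Cout = 1, S = 0

reg = CLB(FULL_ADDER, use_ff=(True, True))
before_edge = reg.outputs(1, 1, 0)  # flip-flops still hold their initial 0s
reg.clock(1, 1, 0)
after_edge = reg.outputs(1, 1, 0)   # registered result appears after the clock edge
```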

[Figure: CLB containing an 8x2 memory (any 3-input / 2-output combinational function), two D flip-flops clocked by CLK (FF's if sequential logic needed), and a bypass mux on each output; the memory is shown holding the rows 0 0, 0 1, 0 1, 1 0, 0 1, 1 0, 1 0, 1 1.]

18.27

Routing & Switch Matrices

• Inputs and outputs of neighboring CLB’s connect to a “switch matrix” (SM)

• A switch matrix is simply composed of muxes that allow us to “route” inputs and outputs to another CLB or further away

[Figure: array of switch matrices (SM), each surrounded by four CLBs; each CLB connects to its switch matrix with 3 input wires and 2 output wires.]

18.28

Routing & Switch Matrices

• Suppose we want the connection shown in green and purple, what select values would be used?

[Figure: switch matrix (SM) connected to four CLBs and to the N, E, S, and W switch matrices; the mux inputs are labeled A through L, and each routing mux has its own select bits.]

18.29

Place and Route

• ASIC: Find where each gate should be placed on the chip and how to route the wires that connect to it

– Direct connections can be faster

• FPGA: Determine which LUT’s should be used and how to route through switch matrices

– Added delay to go through the routing muxes

[Figure: ASIC layout next to an FPGA array of CLBs and switch matrices (SM), each CLB with 3 input wires and 2 output wires.]

18.30

Exercise

• Find the configuration bits to build a 3-bit free-running (always enabled) counter

[Figure: the counter drawn as a 3-bit register (Q2 Q1 Q0) with a chain of half adders (HA) adding 1 to the count, and its mapping onto two CLBs (each with an 8x2 memory, two D-FFs, and bypass muxes) connected through a switch matrix.]

CLB 1 (address inputs A0 = Q0, A1 = 0, A2 = 0; D1 = Co, D0 = Q0*):

Q0  Co  Q0* (= Q0+1)
0   0   1
1   1   0

Memory rows 0 to 1: 0 1, 1 0. Rows 2 to 7: d d (don't care).

CLB 2 (address inputs A2 = Q2, A1 = Q1, A0 = Co; D1 = Q2*, D0 = Q1*):

Q2 Q1 Co  Q2* Q1*
0  0  0   0   0
0  0  1   0   1
0  1  0   0   1
0  1  1   1   0
1  0  0   1   0
1  0  1   1   1
1  1  0   1   1
1  1  1   0   0

Memory rows 0 to 7: 0 0, 0 1, 0 1, 1 0, 1 0, 1 1, 1 1, 0 0.

Select to choose Q0 (B input label) = 0001

Selects to choose A = 0000, D = 0011, E = 0100
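One way to check the exercise's LUT contents is to simulate the two CLBs clock by clock. This sketch assumes the wiring given in the exercise: CLB 1 is addressed by Q0 and produces (Co, Q0*), and CLB 2 is addressed by Q2 Q1 Co and produces (Q2*, Q1*). Don't-care rows are stored as (0, 0), and the function names are hypothetical.

```python
# Each row is (D1, D0). Don't-care rows of CLB 1 are filled with (0, 0).
CLB1_LUT = [(0, 1), (1, 0)] + [(0, 0)] * 6          # address = Q0 -> (Co, Q0*)
CLB2_LUT = [(0, 0), (0, 1), (0, 1), (1, 0),
            (1, 0), (1, 1), (1, 1), (0, 0)]         # address = Q2 Q1 Co -> (Q2*, Q1*)

def step(q2, q1, q0):
    """Compute the next register state for one clock edge."""
    co, next_q0 = CLB1_LUT[q0]                      # Co is used combinationally
    next_q2, next_q1 = CLB2_LUT[(q2 << 2) | (q1 << 1) | co]
    return next_q2, next_q1, next_q0

def run(cycles):
    """Return the count value seen at the start of each cycle, starting from 0."""
    state = (0, 0, 0)
    counts = []
    for _ in range(cycles):
        counts.append((state[0] << 2) | (state[1] << 1) | state[2])
        state = step(*state)
    return counts

sequence = run(9)  # [0, 1, 2, 3, 4, 5, 6, 7, 0]
```

The carry Co must bypass its flip-flop so CLB 2 sees it in the same cycle, while Q2, Q1, and Q0 are taken from the registered outputs.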


18.31

ASIC’s vs. FPGA’s

• ASIC’s

– Faster

– Handles Larger Designs

– More Expensive

– Less Flexible (Cannot be reconfigured to perform a new hardware function)

• FPGA’s

– Slower (extra logic to make it reconfigurable)

– Smaller Designs

– Less Expensive

– Extremely Flexible


18.32

Modern FPGA's

• SoC design (Xilinx Kintex [KU115])

– Quad-Core ARM cores

– DDR3 SDRAM Memory Interface

– ~800 I/O Pins

– ~15M gate equivalent FPGA fabric

• ~1M D-FFs + 552K LUTs

• 1968 dedicated DSP "slices" 18x18 multiply + adder