05/11/22 http://csg.csail.mit.edu/6.375 L11-01 FPGAs K. Elliott Fleming Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology


Page 1: 2/19/2016  FPGAs K. Elliott Fleming Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology


Page 2: FPGA: A Sea of Resources

[Figure: FPGA floorplan. Labels: Processor, I/O Pads, Logic Blocks, SRAM, Multiplier, Clock Buffers, PLL]

Page 3: What can we build?

Resource         DE2-70     DE3        802.11ag   SMIPS V2
Logic Elements   68416      135200     85924      6501
Registers        70234      270400     42107      2841
SRAM             250 (4K)   1040 (9K)  265 (9K)   226 (4K)
Multipliers      300        576        321        0
Clock Buffers    16         32         7          5
PLL              4          8          1          0
Lines of Code    -          -          8762       1603

- Very complex systems
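As a rough illustration of how the table above can be read, the sketch below checks whether each design fits on each board. The numbers are copied from the table; the dictionary layout and the helper names `fits` and `utilization` are hypothetical. SRAM is left out because the boards use different block sizes (4K vs 9K), so block counts are not directly comparable.

```python
# Resource counts copied from the slide's table (SRAM omitted, see note above).
RESOURCES = {
    "DE2-70":   {"logic_elements": 68416,  "registers": 70234,
                 "multipliers": 300, "clock_buffers": 16, "pll": 4},
    "DE3":      {"logic_elements": 135200, "registers": 270400,
                 "multipliers": 576, "clock_buffers": 32, "pll": 8},
    "802.11ag": {"logic_elements": 85924,  "registers": 42107,
                 "multipliers": 321, "clock_buffers": 7,  "pll": 1},
    "SMIPS V2": {"logic_elements": 6501,   "registers": 2841,
                 "multipliers": 0,   "clock_buffers": 5,  "pll": 0},
}


def fits(design, board):
    """True if the design needs no more of any resource than the board has."""
    need, have = RESOURCES[design], RESOURCES[board]
    return all(need[kind] <= have[kind] for kind in have)


def utilization(design, board):
    """Percentage of each board resource that the design would occupy."""
    need, have = RESOURCES[design], RESOURCES[board]
    return {kind: 100.0 * need[kind] / have[kind] for kind in have}


if __name__ == "__main__":
    for design in ("802.11ag", "SMIPS V2"):
        for board in ("DE2-70", "DE3"):
            print(design, "on", board, "fits:", fits(design, board))
```

With these numbers, SMIPS V2 fits on either board, while 802.11ag needs more logic elements than the DE2-70 provides and only fits on the DE3.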

Page 4: Logic Block: Building functionality

[Figure: logic block. Labels: Combinational Input, Look-up Table (two), adders (+), Muxing Logic, Combinational Output, Carry In, Carry Out]

Page 5: Slice: Look-up Table

[Figure: look-up table internals. Labels: Combinational Input, Enable Demux, Muxing Logic, Combinational Output]

- Arbitrary Logic
  - Program flip-flops
  - Use inputs to select
- Can we make a ROM?
- Can we make a RAM?
  - Just add enable logic
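The look-up table idea above can be sketched in a few lines of software. This is a behavioral model only, not how any particular FPGA implements its slices: the class name `Lut`, its methods, and the bit ordering of `init` are all assumptions made for illustration. The stored bits play the role of the programmed flip-flops, `read` is the mux that uses the inputs to select one of them, and `write` adds the enable logic that turns the ROM into a small RAM.

```python
class Lut:
    """Behavioral model of a k-input look-up table (hypothetical helper)."""

    def __init__(self, num_inputs, init=0):
        self.num_inputs = num_inputs
        # One stored bit per input combination; bit i of `init` is entry i.
        self.bits = [(init >> i) & 1 for i in range(1 << num_inputs)]

    def _address(self, inputs):
        if len(inputs) != self.num_inputs:
            raise ValueError("expected %d inputs" % self.num_inputs)
        # Input 0 is the least significant address bit (an arbitrary choice).
        return sum((bit & 1) << i for i, bit in enumerate(inputs))

    def read(self, *inputs):
        """ROM behavior: the inputs select one programmed bit."""
        return self.bits[self._address(inputs)]

    def write(self, enable, value, *inputs):
        """RAM behavior: with enable set, overwrite the selected bit."""
        if enable:
            self.bits[self._address(inputs)] = value & 1


if __name__ == "__main__":
    # Program a 2-input LUT as XOR: entries 0..3 hold 0, 1, 1, 0.
    xor_lut = Lut(2, init=0b0110)
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, xor_lut.read(a, b))
```

Programming a different `init` value gives a different function of the same inputs, which is why a table of this kind can stand in for arbitrary combinational logic.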

Page 6: Reconfigurable Wiring

[Figure: routing fabric. Labels: Logic Block, Switch (four)]

- 2D Mesh Grid
- Local connections made by driving powerful transistors
- Switches route across dimensions
- Heterogeneous wire length
  - Many wires to nearby cells
  - Few long-length wires

Page 7: SMIPS System

Page 8: SMIPS Infrastructure

Page 9: SMIPS Infrastructure

- Bus Interface Logic
  - Avalon Master/Slave
- Cbus Devices
  - mkCBusWideRegRW(addr,reg);
  - Many interfaces (Get, RegFile, etc.)
  - Mechanism for building memory map automatically
- Some C drivers included

Page 10: Demonstration

- Synplify Pro
- Quartus II
- Nios-II IDE

Page 11: Cryptosort: Think Different

- Large (0.5 GB) encrypted database
  - Decrypt Database
  - Sort Database on key
  - Encrypt Database
- Do it fast, on an FPGA
  - Design principles differ from ASIC
  - Must be aware of FPGA hardware

Joint with Myron King, Man Cheuk Ng

Page 12: Cryptosorter

From Problem:

- Encrypted Records in External Memory (DRAM)
- Decrypt Database with AES
- Sort Records in Ascending Order
- Encrypt Sorted Records with AES

[Figure: example records at each stage]

Encrypted records:
0x084b6743c653 0x3f9856235c58 0x223ad8965497 0x328d5487ca84 0x3982675a9185 0x928ab986ce46 0x92861184ff964
0x038d5487ca84 0x4892675a9185 0x147ab986ce46 0x92861184ff964 0x084b6743c653 0xcc1856235c58 0x982ad8965497

Decrypted records:
0x002000000000 0x000400000000 0x100000000000 0x000004000000 0x000000000120 0x020000000000 0x000041000000
0x000000000030 0x011000000000 0x000000030000 0x000000100000 0x000042000000 0x000000034000 0x030000000000

Sorted records:
0x000000000030 0x000000000120 0x000000030000 0x000000034000 0x000000100000 0x000004000000 0x000041000000
0x000042000000 0x000400000000 0x002000000000 0x011000000000 0x020000000000 0x030000000000 0x100000000000

Encrypted sorted records:
0x084b6743c653 0x3f9856235c58 0x223ad8965497 0x328d5487ca84 0x3982675a9185 0x928ab986ce46 0x92861184ff964
0x038d5487ca84 0x4892675a9185 0x147ab986ce46 0x92861184ff964 0x084b6743c653 0xcc1856235c58 0x982ad8965497
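The decrypt, sort, encrypt flow above can be sketched in software as follows. This is a minimal sketch under loud assumptions: `toy_cipher` is a plain XOR with a made-up key and is NOT AES; it only stands in for the AES cores so that the example stays self-contained. The plaintext values are the decrypted records shown in the slide's figure; `KEY`, `toy_cipher`, and `cryptosort` are hypothetical names.

```python
# Hypothetical 48-bit key; the real design uses AES, which is not modeled here.
KEY = 0x5A5A5A5A5A5A

# Decrypted example records as they appear in the slide's figure.
PLAIN_RECORDS = [
    0x002000000000, 0x000400000000, 0x100000000000, 0x000004000000,
    0x000000000120, 0x020000000000, 0x000041000000, 0x000000000030,
    0x011000000000, 0x000000030000, 0x000000100000, 0x000042000000,
    0x000000034000, 0x030000000000,
]


def toy_cipher(record, key=KEY):
    """Stand-in cipher (XOR with a fixed key); encrypt and decrypt are the same."""
    return record ^ key


def cryptosort(encrypted_records):
    """Decrypt every record, sort ascending, then re-encrypt in sorted order."""
    plain = [toy_cipher(r) for r in encrypted_records]
    plain.sort()
    return [toy_cipher(r) for r in plain]


if __name__ == "__main__":
    database = [toy_cipher(r) for r in PLAIN_RECORDS]   # "encrypted" input
    result = cryptosort(database)
    for record in result:
        print(hex(toy_cipher(record)))                  # decrypt to inspect
```

Decrypting the result gives the records in ascending order, matching the sorted block in the figure. Sorting has to happen on the plaintext because ciphertext order says nothing about key order.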

Page 13: Cryptosort Architecture

[Figure: block diagram. Labels: PPC, PLB Master, PLB, DRAM, Feeder, Function Unit: Sort Tree, Read Memory Logic, Memory Write Logic, AES Cores (2), xor (two), Record Input, Record Output, Sort Tree with Level 1 Sorter through Level 6 Sorter]

- Use Merge Sort O(n log(n))

Page 14: Engineering the Merge Tree

[Figure: binary tree of "<" comparators: four at the first level, two at the second, one at the root]

- Each level merges 2n streams into n streams
- Easy to parameterize and build tree
- Probably optimal for ASIC
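A software analogue of the merge tree can make the "2n streams into n streams" structure concrete. The sketch below is an illustration only: it uses Python generators where the hardware would use comparators and FIFOs, and the function names `merge2`, `merge_level`, and `merge_tree` are hypothetical. It assumes the number of input streams is a power of two and that each stream is already sorted.

```python
_DONE = object()  # sentinel marking an exhausted stream


def merge2(left, right):
    """2-to-1 merge of two sorted streams, one comparison per output item."""
    left, right = iter(left), iter(right)
    a, b = next(left, _DONE), next(right, _DONE)
    while a is not _DONE and b is not _DONE:
        if a <= b:
            yield a
            a = next(left, _DONE)
        else:
            yield b
            b = next(right, _DONE)
    # One side is exhausted: pass the rest of the other side through.
    while a is not _DONE:
        yield a
        a = next(left, _DONE)
    while b is not _DONE:
        yield b
        b = next(right, _DONE)


def merge_level(streams):
    """One tree level: merges 2n sorted streams into n sorted streams."""
    return [merge2(streams[i], streams[i + 1])
            for i in range(0, len(streams), 2)]


def merge_tree(streams):
    """Stack levels until a single sorted stream remains."""
    count = len(streams)
    if count == 0 or count & (count - 1):
        raise ValueError("number of streams must be a power of two")
    while len(streams) > 1:
        streams = merge_level(streams)
    return list(streams[0])


if __name__ == "__main__":
    runs = [[1, 9], [4, 6], [2, 3], [5, 8], [0, 7], [10, 11], [12, 15], [13, 14]]
    print(merge_tree(runs))
```

Eight input runs need three levels (4, 2, and 1 mergers), which mirrors the comparator tree in the figure.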

Page 15: Refining the Module

- Naïve implementation: exponential resource usage
  - Each comparator takes 3% of slices
  - At most, fit 3 levels
- Key observation: Throughput is rate-limited by final 2-to-1 merge step
  - This means each level only needs to perform one comparison per cycle
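The area argument can be checked with a little arithmetic. The sketch below counts comparators only, assuming one comparator per 2-to-1 merge node in the naïve tree and one comparator per level in the refined design; the 3% figure comes from the slide, and the function names are hypothetical. It says nothing about the FIFOs, AES cores, or other logic that also need slices.

```python
def naive_comparators(levels):
    """Full binary merge tree: 1 + 2 + 4 + ... + 2**(levels-1) comparators."""
    return 2 ** levels - 1


def shared_comparators(levels):
    """Refined design: a single comparator serves each level."""
    return levels


def slice_percent(comparators, percent_each=3):
    """Slice usage of the comparators alone, at 3% of slices apiece."""
    return comparators * percent_each


if __name__ == "__main__":
    for levels in (3, 6):
        naive = naive_comparators(levels)
        shared = shared_comparators(levels)
        print(levels, "levels:",
              naive, "comparators,", slice_percent(naive), "% naive;",
              shared, "comparators,", slice_percent(shared), "% shared")
```

For six levels this gives 63 comparators (189% of the slices) in the naïve tree against 6 comparators (18%) when each level shares one, so the naïve six-level tree cannot fit on the chip at all.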

Page 16: Sharing the Comparator: Idea

[Figure: a single "<" comparator]

Loop:

- Choose non-empty input pair corresponding to output fifo with room (scheduling)
- Compare the fifo heads
- Dequeue the smaller one and put it on output fifo

We save area by having one comparator per level.
But we introduce a comparator scheduling problem.
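The loop above can be simulated for one level of the tree. This is a sketch under several assumptions that the slide does not state: scheduling is round-robin, all input data is loaded before the level starts (so an empty input means "exhausted" rather than "not here yet"), and the next level is modeled as a consumer that drains one item per cycle. The names `schedule`, `step`, and `run_level` are hypothetical.

```python
from collections import deque


def schedule(pairs, outputs, capacity, start):
    """Pick the next pair (round-robin) with data and room in its output FIFO."""
    count = len(pairs)
    for offset in range(count):
        i = (start + offset) % count
        left, right = pairs[i]
        if (left or right) and len(outputs[i]) < capacity:
            return i
    return None


def step(pairs, outputs, capacity, start):
    """One cycle: the level's single comparator serves at most one pair."""
    i = schedule(pairs, outputs, capacity, start)
    if i is None:
        return None
    left, right = pairs[i]
    # Assumption: an exhausted input loses every comparison, so the other drains.
    if not right or (left and left[0] <= right[0]):
        outputs[i].append(left.popleft())
    else:
        outputs[i].append(right.popleft())
    return i


def run_level(streams, capacity=2):
    """Merge 2n sorted streams into n using one comparison per cycle."""
    pairs = [(deque(streams[i]), deque(streams[i + 1]))
             for i in range(0, len(streams), 2)]
    outputs = [deque() for _ in pairs]
    results = [[] for _ in pairs]
    start, cycles = 0, 0
    while any(l or r for l, r in pairs) or any(outputs):
        chosen = step(pairs, outputs, capacity, start)
        if chosen is not None:
            start = (chosen + 1) % len(pairs)
        # Stand-in for the next level: drain one item from one output FIFO.
        for j, fifo in enumerate(outputs):
            if fifo:
                results[j].append(fifo.popleft())
                break
        cycles += 1
    return results, cycles


if __name__ == "__main__":
    merged, cycles = run_level([[1, 4], [2, 3], [5, 9], [6, 7]])
    print(merged, "in", cycles, "cycles")
```

The point of the sketch is that the whole level performs at most one comparison per cycle, yet every output stream still comes out sorted; the price is the scheduling decision made at the top of each cycle.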

Page 17: Sharing the Comparator: Physical Implementation Issues

- Not enough regs
  - Each BRAM contains multiple FIFOs
- Aggressive clock
  - Single cycle scheduling is impossible
  - Enq happens several cycles after scheduling
  - Credit based flow control
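Credit-based flow control addresses the gap between the scheduling decision and the delayed enqueue. The sketch below is a generic model of the technique, not the design from the lecture: the scheduler holds one credit per free FIFO slot, spends a credit when it schedules a write, and gets the credit back when the consumer dequeues. The class `CreditedFifo`, the function `simulate`, and the latency and drain-rate parameters are all hypothetical.

```python
from collections import deque


class CreditedFifo:
    """FIFO whose free slots are tracked as credits on the scheduler side."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()
        self.credits = capacity  # slots not yet promised to any write

    def reserve(self):
        """At scheduling time: claim a slot before the data has arrived."""
        if self.credits == 0:
            return False
        self.credits -= 1
        return True

    def enq(self, item):
        """Several cycles later: the delayed write lands in its reserved slot."""
        assert len(self.items) < self.capacity, "overflow: credits miscounted"
        self.items.append(item)

    def deq(self):
        """Consumer removes an item, which returns one credit."""
        item = self.items.popleft()
        self.credits += 1
        return item


def simulate(num_items, capacity=2, latency=3, drain_every=2):
    """Push num_items through the FIFO with a fixed scheduling-to-enq latency."""
    fifo = CreditedFifo(capacity)
    in_flight = deque()  # (arrival_cycle, item) for writes already scheduled
    received, next_item, cycle, max_occupancy = [], 0, 0, 0
    while len(received) < num_items:
        while in_flight and in_flight[0][0] <= cycle:
            fifo.enq(in_flight.popleft()[1])
        max_occupancy = max(max_occupancy, len(fifo.items))
        if cycle % drain_every == 0 and fifo.items:
            received.append(fifo.deq())
        if next_item < num_items and fifo.reserve():
            in_flight.append((cycle + latency, next_item))
            next_item += 1
        cycle += 1
    return received, max_occupancy


if __name__ == "__main__":
    items, peak = simulate(10)
    print(items, "peak occupancy:", peak)
```

Because a slot is reserved before the write is issued, the number of stored items plus in-flight writes never exceeds the capacity, so the FIFO cannot overflow regardless of how many cycles the enqueue takes.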

Page 18: Layout

[Figure: FPGA floorplan. Regions: DRAM/PLB/OPB, PPC, Level 1 through Level 6, AES Core 0, AES Core 1, Sort Tree Read Memory Logic, Sort Tree Write Memory Logic. Shown beside the architecture diagram: Level 1 Sorter through Level 6 Sorter, AES Cores (2), xor (two), Memory Write Logic, Read Memory Logic, Record Input, Record Output, Sort Tree]