eec 116 lecture #11: implementation strategiesramirtha/eec116/f11/lecture11.pdf• power supply:...

50
EEC 116 Lecture #11: Implementation Strategies Rajeevan Amirtharajah Bevan Baas Zhiyi Yu University of California, Davis

Upload: others

Post on 06-Aug-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

EEC 116 Lecture #11:Implementation Strategies

Rajeevan Amirtharajah Bevan BaasZhiyi Yu

University of California, Davis

Page 2: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 2

Announcements

• Lab 5 due Wednesday, Nov. 23

• Homework 5 due Monday, Nov. 28

• Lab 6 due Friday, Dec. 2

Page 3: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 3

Outline

• Review and Finish: Memories

• Implementation Strategies: Rabaey Ch. 1, 8 (Kang & Leblebici, Ch. 1)

Page 4: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 4

Abstraction of Design Complexity

• Design complexity– Typically tens of transistors in analog circuits

• Each is normally hand crafted along with placement and wiring

– Hundreds of transistors• Each can be hand crafted

– Thousands to 100s of thousands of transistors• Must find regularity in structure and exploit it (re-use cells)• Ex: memory

– Millions to billions of transistors• Must find high-level regularity in structure and exploit it (re-

use modules and subsystems)• Ex: System on Chip (SOC)

Page 5: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 5

Design Abstraction Levels

n+n+S

GD

+

DEVICE

CIRCUIT

GATE

MODULE

SYSTEM

FUNCTIONAL UNIT, or

Page 6: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 6

Abstraction of Design Complexity

• Levels

– Device

– Circuit

– Gate

– Module or functional unit (e.g., adder, memory, etc.)

– Sub-system (e.g., processor, display driver, network interface, etc.)

• Methods to abstract complexity

– Sophisticated Computer-Aided-Design (CAD) tools

– Standard cell libraries

Page 7: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 7

Hierarchical Abstraction

• Example: While designing at the gate level, we do not consider what is inside each gate

AND

AND

AND

OR

OR Carry-Out

AC

AB

BC

Page 8: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 8

Why Learn About Circuits and Layout Then?

• Best designers can:

– Build model abstractions

– Understand limitations of models• Wire or interconnect performance• Changes with technology scaling

• Abstractions limit maximum attainable performance and energy-efficiency

– Multi-disciplinary view needed

• Troubleshooting and debugging

Page 9: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 9

Design Aspects that “Defy Hierarchy”

• Clock distribution

– Timing skew

• Power distribution

– Sufficient current handling

– Adequate noise suppression

Page 10: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 10

Full Custom

• All transistors and interconnect drawn by hand

• Full control over sizing and layout

• Highest area density and higher performance

• Longest time to design “maturity”

[figure from S. Hauck]

Page 11: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 11

Full Custom Design Example

• Multiplier Chip

Page 12: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 12

Standard Cell

• Constant-height cells

• Regular “pin”locations

• Cells represent gates, latches, flip-flops

• Placed and routed by software

[figure from S. Hauck]

Page 13: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 13

Standard Cell

• Channels for routing only in older technologies (not necessary with modern processes with many levels of interconnect)

[figure from S. Hauck]

Page 14: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 14

Standard Cell — Example

[Brodersen92]

• Application-Specific Integrated Circuit (ASIC)

• Hardwired combination of standard cells implements a fixed logic function or FSM (e.g., video codec)

Page 15: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 15

Combination Standard Cell and Full Custom

[figure from S. Hauck]

Page 16: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 16

“Soft” MacroModules

Synopsys DesignCompilerSource: Digital Integrated Circuits, 2nd ©

Page 17: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 17

Gate Array

• Polysilicon and diffusion are the same for all designs

• Metal layers customized for particular chips

n‐type diffusion

polysilicon

p‐type diffusion

PMOS transistor

NMOS transistor

Page 18: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 18

Unprogrammed Gate Array

Isolation Provided

by Spacing

Page 19: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 19

Programmed Gate Array

• Metal connections made to create particular function

• What logic gate is this?

Page 20: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 20

Gate Array — Sea-of-Gates

rows of

cells

routing channel

uncommitted

VD D

GND

polysilicon

metal

possiblecontact

In1 In2 In3 In4

Out

Uncommited Cell

Committed Cell (4-input NOR)

Source: Digital Integrated Circuits, 2nd ©

Page 21: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 21

Unprogrammed Sea-of-Gates Array

Page 22: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 22

Programmed Sea-of-Gates Array

Isolation Provided by Cutoff

Bias

Page 23: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 23

Gate Array

• Polysilicon and diffusion the same for all designs

• 0.125 um example

[figure from LETI]

Page 24: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 24

Field Programmable Gate Array (FPGA)

• Metal layers now programmable with SRAM instead of hardwired during manufacture as with a gate array

• Cells contain general programmable logic and registers

[figure from S. Hauck]

Page 25: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 25

Field Programmable Gate Array (FPGA)

• Chips can now be “designed” with software

• User pays for up-front chip design costs

– All costs: full-custom, standard cell

– Half: gate array

– Shared: FPGA

• User writes code (e.g., Verilog), compiles it, and downloads into the chip

– Can be used to prototype standard cell (ASIC) design

Page 26: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 26

Heterogeneous Programmable Platforms

Xilinx Vertex-II Pro

Courtesy Xilinx

High-speed I/O

Embedded PowerPC Embedded memories

Hardwired multipliers

FPGA Fabric

Page 27: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 27

Design at a Crossroad: System-on-a-Chip

RAM

500 k Gates FPGA+ 1 Gbit DRAMPreprocessing

Multi-SpectralImager

μCsystem+2 GbitDRAMRecog-nition

Ana

log

64 SIMD ProcessorArray + SRAM

Image Conditioning100 GOPS

• Often used in embedded applications where cost,performance, and energy are big issues!

• DSP and control

• Mixed-mode

• Combines programmable and application-specific modules

• Software plays crucial role

Page 28: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 28

A SoC Example: High Definition TV Chip

Courtesy: Philips

Page 29: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 29

Standard Cell Based IC vs. Custom Design IC

• Standard cell based IC: – Design using standard cells

– Standard cells come from library provider

– Many EDA tools to automate this flow

– Shorter design time

• Custom designed IC: – Designed by individual engineers using manual

process

– Higher performance

Page 30: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 30

• Front end– System specification and architecture

– HDL coding & behavioral simulation

– Synthesis & gate level simulation

• Back end– Placement and routing

– DRC, LVS

– Dynamic simulation and static analysis

Standard Cell Based VLSI Design Flow

Page 31: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 31

Asynchronous Array of Simple Processors

• AsAP Project by Prof. Baas’ VLSI Computation Lab

• A processing chip containing multiple uniform simple processor elements

• Each processor has its local clock generator

• Each processor can communicate with its neighbor processors using dual-clock first-in first-out buffers (FIFOs)

Page 32: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 32

Diagram of a 3x3 AsAP

More information: http://www.ece.ucdavis.edu/vcl/asap/

Page 33: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 33

Simple Diagram of Front-End Design Flow

System Specification

RTL Coding Synthesis

Gate Level (Structural)

Code

INV (.in (a), .out (a_inv));AND (.in1 (a_inv), .in2 (b),. out (c));Ex: c = !a & b

Page 34: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 34

Simple Diagram of Back-End Design Flow

Gate level Verilog from synthesis

Place &

Route

Final layout

(sent for fabrication)

DRC

Gate level Verilog LVS

Timing information

Gate level dynamic and/or static analysis

Design rulecheck

Layout vs.Schematic check

Page 35: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 35

Back-end Design of AsAP

• Technology: CMOS 0.18 um

• Standard cell library: Artisan

• Tools

– Placement & Route: Cadence Encounter

– Final layout edit: icfb

– DRC & LVS: Calibre

– Static timing analysis: Primetime

Page 36: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 36

Flow of Placement and Routing

• Import needed files

• Floorplan

• Placement & in-place optimization

• Clock tree generation

• Routing

Page 37: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 37

Import Needed Files

• Gate level verilog (.v)

• Geometry information (.lef)

• Timing information (.lib)

INV (.in (a), .out (a_inv));AND (.in1 (a_inv), .in2 (b), .out (c));

INV: 1um width AND: 2 um width

INV: 1ns delay; AND: 2 ns delay

INV ANDa

b

C

Delay (a->c): 1ns + 2ns = 3ns

Page 38: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 38

Floorplan

• Size of chip

• Location of pins

• Location of main blocks

• Power supply: deliver enough power for each gate

Power Supply (1.8V)Current

Gate 1 Gate 2 Gate 3 Gate 4

1.75V 1.7V

1.65V (need another power line)

Voltage drop equation: V2 = V1 – I * RVSS

VDD (Metal)

Page 39: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 39

Single Processor Floorplan

Page 40: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 40

Placement and In-Placement Optimization

• Placement: place the gates on floorplan

• In-placement optimization

– Why: timing information difference between synthesis and layout (wire delay)

– How: change gate size, insert buffers

– Should not change the circuit function!!

Page 41: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 41

Single Processor: Gate Placement

Page 42: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 42

Clock Tree

• Main parameters: skew, delay, transition time

Page 43: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 43

Single Processor Clock Tree

Page 44: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 44

Routing

• Connect the gates using wires

• Two steps

– Connect the global signals (power)

– Connect other signals

Page 45: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 45

Single Processor Layout

Area: 0.8mm x 0.8mm

Estimated clock frequency:450 MHz

Page 46: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 46

First Generation 6x6 AsAP Layout

Single Processor

Area: 30mm2

36 processors114 I/O pads

Page 47: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 47

Verification After Layout

• DRC (Design Rule Check)• LVS (Layout vs. Schematic)

– GDS vs. (Verilog + Spectre/Spice module)

• Gate level Verilog dynamic simulation– Mainly check the function

– Different with synthesis result: clock, OPT

• Gate level static analysis– Check all the paths

Page 48: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 48

Useful Design and Simulation Tools• Dynamic HDL Simulation

– Modelsim (Mentor), NC-verilog (Cadence), Active-HDL

• Dynamic Analog Simulation– Spectre (Cadence), Hspice (Synopsys)

• Synthesis– RTL Compiler (Cadence)– Design-Compiler, Design-Analyzer (Synopsys)

• Placement & Routing– Encounter & Virtuoso (Cadence)– Astro (Synopsys)

Page 49: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 49

Useful Verification Tools

• DRC & LVS– Calibre (Mentor)– Diva (Cadence – used in EEC 116/119AB)– Dracula (Cadence)

• Static Analysis– Primetime (Synsopsys)

Page 50: EEC 116 Lecture #11: Implementation Strategiesramirtha/EEC116/F11/lecture11.pdf• Power supply: deliver enough power for each gate Power Supply (1.8V) Current Gate 1 Gate 2 Gate 3

Amirtharajah, EEC 116 Fall 2011 50

Next Topics: Low Power and DFM

• Low power design principles and circuit techniques

– Voltage scaling, activity factor reduction, clock gating, leakage reduction

• Design for Manufacturability

– Parameter variations in CMOS digital circuits

– Yield maximization and worst-case design