memory design - duke electrical and computer...
TRANSCRIPT
1
ECE 261 Krish Chakrabarty 1
Memory Design
• Memory Types• Memory Organization• ROM design• RAM design• PLA design
Adapted from J. M. Rabaey, A. Chandrakasan and B. Nikolic, Digital Integrated Circuits, 2nd ed. Copyright 2003 Prentice Hall/Pearson.
ECE 261 Krish Chakrabarty 2
Semiconductor Memory Classification
Read-Write MemoryNon-VolatileRead-Write
MemoryRead-Only Memory
EPROM
E2PROM
FLASH
RandomAccess
Non-RandomAccess
SRAM
DRAM
Mask-Programmed
Programmable (PROM)
FIFO
Shift Register
CAM
LIFO
2
ECE 261 Krish Chakrabarty 3
Memory Timing: Definitions
Write cycleRead access Read access
Read cycle
Write access
Data written
Data valid
DATA
WRITE
READ
ECE 261 Krish Chakrabarty 4
Memory Architecture: Decoders
Word 0
Word 1
Word 2
Word N2 2
Word N2 1
Storagecell
M bits M bits
N
w o r d s
S0
S1
S2
SN2 2
A 0
A 1
AK2 1
K 5 log2N
SN2 1
Word 0
Word 1
Word 2
Word N2 2
Word N2 1
Storagecell
S0
Input-Output(M bits)
Intuitive architecture for N x M memoryToo many select signals:
N words == N select signalsK = log2N
Decoder reduces the number of select signals
Input-Output(M bits)
D e c o d e r
3
ECE 261 Krish Chakrabarty 5
Row
Dec
oder
Bit line2L 2 K
Word line
AKAK1 1
AL 2 1
A0
M.2K
AK2 1
Sense amplifiers / Drivers
Column decoder
Input-Output(M bits)
Storage cell
Array-Structured Memory ArchitectureProblem: ASPECT RATIO or HEIGHT >> WIDTH
Amplify swing torail-to-rail amplitude
Selects appropriateword
ECE 261 Krish Chakrabarty 6
Hierarchical Memory Architecture
Advantages:Advantages:1. Shorter wires within blocks1. Shorter wires within blocks2. Block address activates only 1 block => power savings2. Block address activates only 1 block => power savings
Globalamplifier/driver
Controlcircuitry
Global data busBlock selector
Block 0
Rowaddress
Columnaddress
Blockaddress
Block i Block P 2 1
I/O
4
ECE 261 Krish Chakrabarty 7
Read-Only Memory CellsWL
BL
WL
BL
1WL
BL
WL
BL
WL
BL
0
VDD
WL
BL
GND
Diode ROM MOS ROM 1 MOS ROM 2
ECE 261 Krish Chakrabarty 8
MOS OR ROM
WL[0]
VDD
BL[0]
WL[1]
WL[2]
WL[3]
Vbias
BL[1]
Pull-down loads
BL[2] BL[3]
VDD
5
ECE 261 Krish Chakrabarty 9
ROM Example• 4-word x 6-bit ROM
– Represented with dot diagram– Dots indicate 1’s in ROM
Word 0: 010101Word 1: 011001Word 2: 100101Word 3: 101010
ROM Array
2:4DEC
A0A1
Y0Y1Y2Y3Y4Y5
weakpseudo-nMOS
pullups
Looks like 6 4-input pseudo-nMOS NORs
ECE 261 Krish Chakrabarty 10
MOS NOR ROM
WL[0]
GND
BL [0]
WL [1]
WL [2]
WL [3]
VDD
BL [1]
Pull-up devices
BL [2] BL [3]
GND
6
ECE 261 Krish Chakrabarty 11
MOS NOR ROM Layout
Programmming using theActive Layer Only
Polysilicon
Metal1
Diffusion
Metal1 on Diffusion
Cell (9.5λ x 7λ)
ECE 261 Krish Chakrabarty 12
MOS NOR ROM Layout
Polysilicon
Metal1
Diffusion
Metal1 on Diffusion
Cell (11λ x 7λ)
Programmming usingthe Contact Layer Only
7
ECE 261 Krish Chakrabarty 13
MOS NAND ROM
All word lines high by default with exception of selected row
WL [0]
WL [1]
WL [2]
WL [3]
VDD
Pull-up devices
BL [3]BL [2]BL [1]BL [0]
ECE 261 Krish Chakrabarty 14
MOS NAND ROM Layout
No contact to VDD or GND necessary;
Loss in performance compared to NOR ROMdrastically reduced cell size
Polysilicon
Diffusion
Metal1 on Diffusion
Cell (8λ x 7λ)Programmming usingthe Metal-1 Layer Only
8
ECE 261 Krish Chakrabarty 15
NAND ROM LayoutCell (5λ x 6λ)
Polysilicon
Threshold-alteringimplant
Metal1 on Diffusion
Programmming usingImplants Only
ECE 261 Krish Chakrabarty 16
Decreasing Word Line Delay
Metal bypass
Polysilicon word lineK cells
Polysilicon word lineWLDriver
(b) Using a metal bypass
(a) Driving the word line from both sides
Metal word line
WL
9
ECE 261 Krish Chakrabarty 17
Precharged MOS NOR ROM
PMOS precharge device can be made as large as necessary,but clock driver becomes harder to design.
WL [0]
GND
BL [0]
WL [1]
WL [2]
WL [3]
VDD
BL [1]
Precharge devices
BL [2] BL [3]
GND
pref
ECE 261 Krish Chakrabarty 18
Read-Write Memories (RAM)STATIC (SRAM)
DYNAMIC (DRAM)
Data stored as long as supply is appliedLarge (6 transistors/cell)FastDifferential
Periodic refresh requiredSmall (1-3 transistors/cell)SlowerSingle Ended
10
ECE 261 Krish Chakrabarty 19
6-transistor CMOS SRAM Cell WL
BL
VDD
M5M6
M4
M1
M2
M3
BL
ECE 261 Krish Chakrabarty 20
6T-SRAM — Layout
VDD
GND
WL
BLBL
M1 M3
M4M2
M5 M6
11
ECE 261 Krish Chakrabarty 21
3-Transistor DRAM Cell
No constraints on device ratiosReads are non-destructiveValue stored at node X when writing a “1” = VWWL-VTn
WWL
BL1
M1 X
M3
M2
CS
BL2
RWL
VDD
VDD 2 VT
DVVDD 2 VTBL 2
BL 1
X
RWL
WWL
ECE 261 Krish Chakrabarty 22
3T-DRAM — Layout
BL2 BL1 GND
RWL
WWL
M3
M2
M1
12
ECE 261 Krish Chakrabarty 23
1-Transistor DRAM Cell
Write: CS is charged or discharged by asserting WL and BL.Read: Charge redistribution takes places between bit line and storage capacitance
Voltage swing is small; typically around 250 mV.
M1
CS
WL
BL
CBL
VDD 2 VT
WL
X
sensing
BL
GND
Write 1 Read 1
VDD
VDD /2 VDD /2
∆V BL VPRE– VBIT VPRE–CS
CS CBL+------------= =V
ECE 261 Krish Chakrabarty 24
DRAM Cell Observations1T DRAM requires a sense amplifier for each bit line, due to
charge redistribution read-out.DRAM memory cells are single ended in contrast to SRAM
cells.The read-out of the 1T DRAM cell is destructive; read and
refresh operations are necessary for correct operation.Unlike 3T cell, 1T cell requires presence of an extra
capacitance that must be explicitly included in the design.When writing a “1” into a DRAM cell, a threshold voltage is
lost. This charge loss can be circumvented by bootstrapping the word lines to a higher value than VDD
13
ECE 261 Krish Chakrabarty 25
1-T DRAM Cell
Uses Polysilicon-Diffusion CapacitanceExpensive in Area
M1 wordline
Diffusedbit line
Polysilicongate
Polysiliconplate
Capacitor
Cross-section Layout
Metal word line
Poly
SiO2
Field Oxiden+ n+
Inversion layerinduced byplate bias
Poly
ECE 261 Krish Chakrabarty 26
Periphery
DecodersSense AmplifiersInput/Output BuffersControl / Timing Circuitry
14
ECE 261 Krish Chakrabarty 27
Row DecodersCollection of 2M complex logic gatesOrganized in regular and dense fashion
(N)AND Decoder
NOR Decoder
ECE 261 Krish Chakrabarty 28
Hierarchical Decoders
• • •
• • •
A2A2
A2A3
WL 0
A2A3A2A3A2A3
A3 A3A 0A0
A0A1A0A1A0A1A0A1
A1 A1
WL 1
Multi-stage implementation improves performance
NAND decoder usingNAND decoder using22--input preinput pre--decodersdecoders
15
ECE 261 Krish Chakrabarty 29
Dynamic DecodersPrecharge devices
VDD φ
GND
WL3
WL2
WL1
WL0
A0A0
GND
A1A1φ
WL3
A0A0 A1A1
WL 2
WL 1
WL 0
VDD
VDD
VDD
VDD
2-input NOR decoder 2-input NAND decoder
ECE 261 Krish Chakrabarty 30
4-to-1 tree based column decoder
Number of devices drastically reducedDelay increases quadratically with # of sections; prohibitive for large decoders
buffersprogressive sizingcombination of tree and pass transistor approaches
Solutions:
BL 0 BL 1 BL 2 BL 3
D
A 0
A 0
A1
A 1
16
ECE 261 Krish Chakrabarty 31
PLA versus ROMProgrammable Logic Arraystructured approach to random logic“two level logic implementation”
NOR-NOR (product of sums)NAND-NAND (sum of products)
IDENTICAL TO ROM!
Main differenceROM: fully populatedPLA: one element per minterm
Note: Importance of PLA’s has drastically reduced1. slow2. better software techniques (mutli-level logic
synthesis)But …
ECE 261 Krish Chakrabarty 32
Programmable Logic Array
GND GND GND GND
GND
GND
GND
VDD
VDD
X 0X 0 X 1 f0 f1X 1 X 2X 2
AND-plane OR-plane
Pseudo-NMOS PLA
17
ECE 261 Krish Chakrabarty 33
Dynamic PLA
GND
GNDVDD
VDD
X 0X 0 X 1 f0 f1X 1 X 2X 2
ANDf
ANDf
ORf
ORf
AND-plane OR-plane
ECE 261 Krish Chakrabarty 34
PLA LayoutVDD GNDφ
And-Plane Or-Plane
f0 f1x0 x0 x1 x1 x2 x2Pull-up devices Pull-up devices
18
ECE 261 Krish Chakrabarty 35
CAMs• Extension of ordinary memory (e.g. SRAM)
– Read and write memory as usual– Also match to see which words contain a key
CAM
adr data/key
matchread
write
ECE 261 Krish Chakrabarty 36
10T CAM Cell
• Add four match transistors to 6T SRAM– 56 x 43 λ unit cell
bit bit_b
word
match
cell
cell_b
19
ECE 261 Krish Chakrabarty 37
CAM Cell Operation
• Read and write like ordinary SRAM• For matching:
– Leave wordline low– Precharge matchlines– Place key on bitlines– Matchlines evaluate
• Miss line– Pseudo-nMOS NOR of match lines– Goes high if no words match
row decoder
weak
missmatch0
match1
match2
match3
clk
column circuitry
CAM cell
address
data
read/write