EN160: VLSI Project
Spring 2008
Cache Memory Simulation
By: Holiano, Chaka, and Rotor
Index Title
0.0 Table of Contents
1.0 System Overview
1.1 - System Diagram
1.2 - Specifications
1.3 - I/O List
1.4 - Direct-mapped Cache Algorithm
1.5 - Main Components Description
2.0 Components Descriptions
2.1 - Memory Cell
2.2 - Memory Cell Tests and Timing
2.3 - Demux 1-to-4
2.4 - Mux 4-to-1
2.5 - Demux 1-to-4 Tests
2.6 - Mux 4-to-1 Tests
2.7 - 4-bit Tag Comparator
2.8 - 4-bit Tag Comparator Tests
3.0 Final System
3.0 - Core Layout
3.1 - Core + Pads + Test Signal Layout
3.2 - Core Placement and Layout
3.3 - SPR Setup
3.4 - PADFrame Placement and Layout
3.5 - Placement and Routing Summary
3.6 - DRC Error Check
3.7 - DRC Geometry Error Details (disabled check)
4.0 Systems Testing
4.1 - Read/Write Test
4.2 - Hit/Miss Test
4.3 - Hit/Miss Timing Analysis
5.0 Conclusion
6.0 Pin Layout
System Diagram:
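[Figure: system block diagram. Labeled blocks: two Demuxes, a Mux, a Comparator, and four memory lines, each with a Tag field and a Data field. Labeled signals: Data_In (8 bits), Line_In (2 bits), Tag_In (4 bits), Re, We, Data_Out, Status, F1, F2. Labeled bus widths: 4 bits, 2 bits, 8 bits, 32 bits.]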
Specification:
Data width: 8-bit
Tag: 4-bit
Address: 6-bit (4-bit tag + 2-bit index)
Index: 2-bit
Replacement Policy: Direct Mapped Cache Fill
The cache performs the following functions:
Operation    Read_en  Write_en  Status  Data Out
Read-Hit     1        X         1       Mem[index]
Read-Miss    1        X         0       Previous Data
Write-Hit    0        1         X       X
Write-Miss   0        1         X       X
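As a rough illustration of the table above, the following Python sketch models the four operations at a behavioral level. The class name, the method name, and the use of None to mark a line that has never been written are assumptions made for this example; the chip has no such marker. The sketch also assumes that a write stores both the tag and the data in the indexed line, and that Status and Data Out simply hold their last values during a write, since the table lists them as don't-cares.

```python
class DirectMappedCache:
    """Behavioral sketch of a 4-line cache with 4-bit tags and 8-bit data."""

    LINES = 4  # selected by the 2-bit index

    def __init__(self):
        # Assumption: None marks a line that has never been written.
        self.tags = [None] * self.LINES
        self.data = [0] * self.LINES
        self.data_out = 0  # last value driven on Data Out
        self.status = 0    # 1 = the last read was a hit

    def cycle(self, read_en, write_en, tag, index, data_in=0):
        """Apply one operation and return (status, data_out)."""
        if read_en:
            # Read rows of the table: Write_en is a don't-care.
            if self.tags[index] == tag:
                self.status = 1
                self.data_out = self.data[index]
            else:
                self.status = 0  # miss: Data Out keeps the previous data
        elif write_en:
            # Assumption: a write stores both tag and data in the line.
            self.tags[index] = tag & 0xF
            self.data[index] = data_in & 0xFF
        return self.status, self.data_out


cache = DirectMappedCache()
cache.cycle(0, 1, tag=0b1010, index=2, data_in=0x5A)  # write
print(cache.cycle(1, 0, tag=0b1010, index=2))          # read hit  -> (1, 90)
print(cache.cycle(1, 0, tag=0b0001, index=2))          # read miss -> (0, 90)
```

On the read miss, Status drops to 0 while Data Out still shows the value from the earlier hit, which matches the "Previous Data" entry in the table.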
Inputs:
From CPU:
New Data: 8-bit
Address: 6-bit
Address<5:2> Tag: 4-bit
Address<1:0> Index: 2-bit
Read enable: 1-bit
Write enable: 1-bit
Outputs:
To CPU:
Dataout: 8-bit
Status: 1-bit [Signifies when data is ready]
Total pins required: 25 pins + 1 Vdd + 1 gnd.
Extra outputs:
Ring Oscillator Test Signal: 1-bit
Ring Oscillator Test Signal w/En: 2-bits
Inverter: 2-bits
Replacement Algorithm: Direct Mapped Cache Fill
Direct mapping is the fastest scheme for cache placement because each address maps to exactly one line: the cache takes the 2 least significant bits of the address as the index. In effect, the index is the main memory address modulo the number of cache lines (four).
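The address split can be sketched in a few lines of Python. The function name split_address is an assumption made for this example; the bit ranges follow the I/O list (Address<5:2> is the tag, Address<1:0> is the index).

```python
def split_address(address):
    """Split a 6-bit address into (tag, index)."""
    if not 0 <= address < 64:
        raise ValueError("address must fit in 6 bits")
    index = address % 4  # Address<1:0>, same as address & 0b11
    tag = address >> 2   # Address<5:2>
    return tag, index


print(split_address(0b101110))  # -> (11, 2), i.e. tag 0b1011, index 0b10
```

Addresses 2, 6, and 10 all give index 2, so they compete for the same cache line and are told apart only by their tags.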
Main components:
Muxes/Demuxes – The memory design simulates a cache memory, similar to a register memory. The muxes select which memory cell's data is driven to the output. The demuxes ensure that the signals between the components arrive at the correct memory cell for proper operation.
Memory – Stores all the cache data and supports only read and write operations. Read takes priority in the design: you cannot write while read is enabled, but you can read while write is enabled. The memory cells are built from flip-flops, modified to have separate read-enable and write-enable signals. Each memory line stores 8 bits of actual data and a 4-bit tag for comparison.
Comparator – Compares the tag stored in the selected memory line with the tag of the requested address.
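The behavior of the three components can be sketched in Python as follows. All function names are assumptions made for this example. Two further assumptions are labeled in the code: unselected demux outputs are taken to be 0, and the comparator is modeled as a per-bit XNOR followed by an AND, which is one common way to build an equality check and may not match the actual gate-level design.

```python
def demux_1to4(value, select):
    """Route `value` to the output chosen by the 2-bit `select`.

    Assumption: the three unselected outputs are driven to 0.
    """
    outputs = [0, 0, 0, 0]
    outputs[select & 0b11] = value
    return outputs


def mux_4to1(inputs, select):
    """Return the one of four inputs chosen by the 2-bit `select`."""
    return inputs[select & 0b11]


def decode_enables(read_en, write_en):
    """Apply read priority: a write only happens while read is off."""
    do_read = bool(read_en)
    do_write = bool(write_en) and not do_read
    return do_read, do_write


def tags_match(stored_tag, requested_tag, width=4):
    """Return 1 when the two tags are equal bit for bit, else 0.

    Assumption: modeled as a per-bit XNOR followed by an AND of all bits.
    """
    mask = (1 << width) - 1
    xnor_bits = ~(stored_tag ^ requested_tag) & mask
    return int(xnor_bits == mask)


print(demux_1to4(1, 2))            # -> [0, 0, 1, 0]
print(mux_4to1([10, 20, 30, 40], 3))  # -> 40
print(decode_enables(1, 1))        # -> (True, False)
print(tags_match(0b1010, 0b1011))  # -> 0
```

With both enables asserted, decode_enables performs only the read, which reflects the read priority described for the memory.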
Memory Cell:
1-bit Cell:
This is a single-bit memory cell based on a flip-flop design with independent read and write enable signals. Q is the output of the memory cell, and Q_b is the inverted output.
12-bit Memory Cell:
Single-bit cells are cascaded to form one 12-bit line. The tag and the data have separate read and write signals.
Read/Write Test:
1-bit Demux 1-to-4
1-bit Mux 4-to-1
Demux 1-to-4 Test:
Mux 4-to-1 Test:
4-bit Tag Comparator:
Comparator Test:
Core Layout:
Core + Pads + Test Signals Layout:
Core Placement and Layout:
Core = 2448λ x 1232.5λ
SPR Setup:
3-metal Layers:
H2: Metal3
V-H2: Via2
V: Metal2
H1-V: Via1
H1: Metal1
PadFrame Placement and Layout:
Placement and Routing Summary:
SPR SUMMARY 'mAMIs050DL_AND_PADS.tdb'
Date and time : 05/22/2008-21:12
1 Lambda = 1.000 Lambda = 3.333 Micron(s)
Design file : E:\reda en160 proj BU\mAMIs050DL_AND_PADS.tdb
Netlist file : Project\cache_pads.tpr
Library file : mAMIs050DL_AND_PADS.tdb
Placement optimization factor : 1.00
Routing optimization (3 layer) : Netlength and via reduction
Standard Cell Place and Route done :
- Core cell "Core" generated.
- Padframe cell "Min_Frame" generated.
- Chip cell "Library_Test_s" generated.
-------------------------------------------------------------
Number of standard cells : 184
Number of signals in netlist : 336
Core size in Lambda : 2438.5 x 1128.5
Core area (Lambda^2) : 2751847.25
Frame size in Lambda : 5000.00 x 5000.00
Frame area (Lambda^2) : 25000000.00
Length of nets in core : 161951.00 Lambda
Generated vias in core : 647
SPR elapsed time : 0:00:04
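The area figures in the summary can be cross-checked with a little arithmetic. The Python sketch below recomputes the core and frame areas from the reported dimensions and derives two numbers that the report does not state: the fraction of the frame occupied by the core and the average net length per signal. The variable names are assumptions made for this example, and all values are in lambda units as reported.

```python
# Dimensions and counts copied from the SPR summary (lambda units).
core_w, core_h = 2438.5, 1128.5
frame_w, frame_h = 5000.0, 5000.0
net_length = 161951.0
num_signals = 336

core_area = core_w * core_h      # reported as 2751847.25
frame_area = frame_w * frame_h   # reported as 25000000.00

# Derived figures (not part of the SPR output).
core_fraction = core_area / frame_area
avg_net_length = net_length / num_signals

print(f"core area      = {core_area:.2f} lambda^2")
print(f"frame area     = {frame_area:.2f} lambda^2")
print(f"core / frame   = {core_fraction:.1%}")
print(f"avg net length = {avg_net_length:.1f} lambda")
```

Both recomputed areas agree with the summary, and the core takes up roughly 11% of the 5000 x 5000 lambda frame.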
DRC Error Check:
L-Edit DRC SUMMARY REPORT
EXECUTION SUMMARY
Execution Start Time    May 22 2008 21:20:11
L-Edit Version          L-Edit Win32 12.10.20060718.19:30:32
Rule Set Name           MOSIS AMI 0.50UM - SUBMICRON RULES_
Last Updated            10/08/2001
File Name               E:\reda en160 proj BU\mAMIs050DL_AND_PADS.tdb
Cell Name               Channel_4 (May 22 21:20:08 2008)
User Name               Rotor
Computer Name           SREDA-XP1
Memory used at start    46.5M
DRC JOB RESULTS SUMMARY
Total DRC Errors Generated    0
CPU Time                      00:00:05
Real Time                     00:00:05
Rules Executed                93
DRC Errors Generated by Rule Set
DRC Standard Rule Set    0
RUN-TIME DRC ERRORS AND WARNINGS
GEOMETRY FLAG SUMMARY
ACUTE ANGLES                       Disabled
ALL ANGLE EDGES                    0
OFFGRID                            Disabled
ZERO-WIDTH WIRES                   0
POLYGONS WITH OVER 199 VERTICES    0
WIRES WITH OVER 200 VERTICES       0
SELF INTERSECTIONS                 0
WIRE JOIN/END STYLES               0
CELLS WITH ERRORS FOUND
RESULTS SUMMARY
DRC Errors Generated       0
CPU Time                   00:00:05
REAL Time                  00:00:05
Input Objects              404 (404)
Rules Executed             93
Geometry Flags Executed    6
Disabled Rules             18
DRC Geometry Error Details (Acute Angles):
Error #1
Error #6
These error checks were disabled.
Systems Analysis:
Single-bit Read/Write Systems Test:
Status Hit/Miss Systems Test:
Timing Analysis:
Read time:
tdf = 17ns
tdr = 9ns
Conclusion:
We successfully implemented a cache memory simulation device, and all verification data appear to meet the design criteria. We ran into unexpected design errors along the way, but none prevented the cache memory from functioning normally. The DRC turned up geometry errors on the padless frame generated by SPR, as well as some metal-to-metal spacing errors in the core after SPR. There were also disconnected metal layers on the padless frame that had to be connected manually.
We have not yet expanded the design to include control logic for fetching from a main memory system; this functionality can be added in the future. We also have not expanded the cache size to determine the largest cache that is possible with the type of memory cell we used. Another improvement would be to use a 6T SRAM cell for each memory bit instead of a flip-flop, which requires more area because it uses more transistors per cell.
Pin Layout:
[Figure: pad frame pin layout. Labeled pins: Data<7:0>, Data_out<7:0>, Data_out_sl<7:0>, Tag<3:0>, Index<1:0>, Re, We, Status, Vdd, gnd, Test1_in, Test1_out, Test2_in, Test2_out, Test3_out.]
Test Signals:
Test 1 (Inverter): Test1_in, Test1_out
Test 2 (Ring Oscillator w/En): Test2_in, Test2_out
Test 3 (Ring Oscillator): Test3_out