Energy Reduction for STT-RAM Using Early Write Termination
Ping Zhou, Bo Zhao, Jun Yang, *Youtao Zhang. Electrical and Computer Engineering Department, *Department of Computer Science, University of Pittsburgh. ICCAD 2009.
![Page 1: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/1.jpg)
Energy Reduction for STT-RAM Using Early Write Termination
Ping Zhou, Bo Zhao, Jun Yang, *Youtao Zhang
Electrical and Computer Engineering Department
*Department of Computer Science
University of Pittsburgh
ICCAD 2009
![Page 2: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/2.jpg)
Introduction
• Traditional SRAM cache
  – Limited by density, leakage, and scalability
• STT-RAM cache?
  – High density (~4x that of SRAM)
  – High speed (same read speed as SRAM)
  – Non-volatile
  – No write endurance problem
![Page 3: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/3.jpg)
STT-RAM: Cell
• Magnetic Tunnel Junction (MTJ)
• Relative magnetization direction
  – Different resistances → logic 0 or 1
• Write: spin-polarized current
  – Much less write current than conventional MRAM
[Figure: two MTJ stacks, each with a free layer and a reference layer separated by an MgO barrier; one is labeled high resistance (logic 1), the other low resistance (logic 0).]
![Page 4: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/4.jpg)
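STT-RAM: Cell Array
• Array structure similar to SRAM
• Bidirectional write current
[Figure: cell array schematic; each MTJ cell sits between a bit line (BL) and a source line (SL) and is selected by a word line (WL); write 0 and write 1 drive current through the MTJ in opposite directions.]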
![Page 5: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/5.jpg)
STT-RAM Cache: Challenge
• High dynamic energy
  – 6~14x more energy per write access [Dong et al. DAC 2008, Sun et al. HPCA 2009]
  – Writes contribute >74% of total dynamic energy
[Chart: breakdown of dynamic energy; writes are labeled 74.2%.]
Need to reduce write energy in STT-RAM cache!
![Page 6: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/6.jpg)
Opportunity
• Many bits are unchanged in a write access
  – Redundant bit-writes [Zhou et al. ISCA 2009]
• Redundant bit-writes in a 16MB STT-RAM cache
[Chart: fraction of redundant bit-writes; labeled value 88%.]
How to exploit this opportunity?
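A minimal sketch of how redundant bit-writes could be counted for one write access, assuming the old and new contents of a cache line are available as equal-length byte strings. The function name `redundant_bit_writes` and the example data are illustrative and do not come from the slides.

```python
def redundant_bit_writes(old: bytes, new: bytes) -> tuple[int, int]:
    """Return (unchanged_bits, total_bits) for one write access.

    A bit-write is redundant when the bit being written already holds
    the new value, i.e. the corresponding XOR bit is 0.
    """
    if len(old) != len(new):
        raise ValueError("old and new data must have the same length")
    total_bits = 8 * len(old)
    changed_bits = sum(bin(o ^ n).count("1") for o, n in zip(old, new))
    return total_bits - changed_bits, total_bits


# Hypothetical 8-byte write in which only the last byte differs.
old_line = bytes([0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xDE, 0xF0])
new_line = bytes([0x12, 0x34, 0x56, 0x78, 0x9A, 0xBC, 0xDE, 0xFF])
unchanged, total = redundant_bit_writes(old_line, new_line)
print(f"{unchanged}/{total} bit-writes are redundant")
```

Summing these counts over all write accesses in a trace would give a redundancy ratio comparable in spirit to the one charted on this slide.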
![Page 7: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/7.jpg)
Exploiting Redundant Bit-Writes
• Need to know the old value…
• Read & compare before write [Zhou et al. ISCA 2009]
• Can we do better?
![Page 8: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/8.jpg)
Observation
• MTJ resistance changes abruptly near the end of the write cycle
  – Cell still holds the old value at the early stage of the write cycle
• Read is much faster than write
[Figure source: Y. Chen et al. ISQED 2008]
Possible to sense the old value at the early stage of the write cycle
![Page 9: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/9.jpg)
Early Write Termination: Idea
• On a write access…
  – Start the write cycle as normal
  – Sense the old value at an early stage
  – Terminate the write cycle if the old value is the same as the new value
• Does not require a preceding read & compare!
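The flow above can be sketched as a behavioural model of a single bit-write. This is an illustration only: the timing constants `T_WRITE_NS` and `T_SENSE_NS` are hypothetical values, and the names `BitWriteResult`, `ewt_write_bit`, and `ewt_write_word` are not from the slides.

```python
from dataclasses import dataclass

# Hypothetical timing parameters (not taken from the slides).
T_WRITE_NS = 10.0   # full write pulse
T_SENSE_NS = 0.5    # point in the write cycle where the old value is sensed


@dataclass
class BitWriteResult:
    stored: int        # value held by the cell afterwards
    terminated: bool   # True if the write was cut short
    pulse_ns: float    # how long the write current was applied


def ewt_write_bit(old: int, new: int) -> BitWriteResult:
    """Model one bit-write with early write termination."""
    # The write starts like a normal write. Early in the cycle the cell
    # still holds its old value, so it can be sensed without a prior read.
    sensed = old
    if sensed == new:
        # Redundant bit-write: stop driving current right after sensing.
        return BitWriteResult(stored=old, terminated=True, pulse_ns=T_SENSE_NS)
    # Values differ: let the write cycle run to completion.
    return BitWriteResult(stored=new, terminated=False, pulse_ns=T_WRITE_NS)


def ewt_write_word(old_bits: list[int], new_bits: list[int]) -> list[BitWriteResult]:
    """Apply the per-bit decision independently to every bit of a word."""
    return [ewt_write_bit(o, n) for o, n in zip(old_bits, new_bits)]
```

For example, `ewt_write_word([0, 1, 1, 0], [0, 0, 1, 1])` terminates the first and third bit-writes and completes the other two, and the word ends up holding the new value either way.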
![Page 10: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/10.jpg)
EWT Circuit
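[Figure: EWT circuit. Labeled elements: MTJ cell with SL, BL, and WL; pass gates; wire resistances (Rwire); two conversion circuits with inputs Vin1/Vin0 and outputs Vsense1/Vsense0, one for write 1 and one for write 0; references Vref1/Vref0; Sense-Amp; New value; Terminate?]
• Conversion circuit
  – Basic differential amplifier
  – Input lower → output higher
  – Input higher → output lower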
![Page 11: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/11.jpg)
How EWT Works?
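[Figure: the EWT circuit during a write 0, with the same labeled elements as the previous slide (MTJ, pass gates, Rwire, conversion circuits, Vin1/Vin0, Vsense1/Vsense0, SL, BL, WL), lines marked low and high, and a timing label of 0.536ns.]

| Old Value | New Value | Vin0 | Vsense0 | SA output | Action |
|---|---|---|---|---|---|
| 0 | 0 | lower | higher | 1 | Terminate |
| 1 | 0 | higher | lower | 0 | Continue |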
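The write-0 cases on this slide can be mimicked with a small behavioural model. All voltage levels below are hypothetical normalized values chosen only to reproduce the qualitative "lower/higher" relations; the real circuit is analog, and the function names are illustrative.

```python
# Hypothetical normalized levels; only their ordering matters.
VIN0_OLD_0 = 0.3   # Vin0 is "lower" when the cell holds 0
VIN0_OLD_1 = 0.7   # Vin0 is "higher" when the cell holds 1
VREF0 = 0.5        # reference input of the sense amplifier


def conversion(vin: float) -> float:
    """Inverting stage: a lower input gives a higher output and vice versa."""
    return 1.0 - vin


def sense_amp(vsense: float, vref: float) -> int:
    """Output 1 when the sensed voltage is above the reference, else 0."""
    return 1 if vsense > vref else 0


def write0_action(old: int) -> str:
    """Decide whether a write of 0 is terminated, following the slide's cases."""
    vin0 = VIN0_OLD_0 if old == 0 else VIN0_OLD_1
    vsense0 = conversion(vin0)
    return "Terminate" if sense_amp(vsense0, VREF0) == 1 else "Continue"
```

The slide only covers the write-0 path, so the sketch does the same; a write-1 path would presumably mirror it using Vin1, Vsense1, and Vref1, but that is not modelled here.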
![Page 12: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/12.jpg)
Advantages of EWT
• No performance penalty!
  – Carried out within a write cycle
  – No need to read & compare before a write
  – Write access may finish early → slight speedup
• Low energy overhead (3.23%)
• Low complexity
• Easy to integrate with existing designs
![Page 13: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/13.jpg)
MODELING STT-RAM AND EWT
![Page 14: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/14.jpg)
Latency Modeling
• Cell
  – Derived from recent work [Dong et al. DAC 2008]
• Peripheral
  – Derived from CACTI [Thoziyoor et al. ISCA 2008, Dong et al. DAC 2008]
![Page 15: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/15.jpg)
Dynamic Energy Modeling
• Baseline: derived from recent work [Dong et al. DAC 2008]
• EWT
  – Read energy: same as baseline
  – Write energy: variable

$$E_{write}^{EWT} = E_{peripheral} + E_{overhead} + E_{cells}$$

• $E_{peripheral}$: peripheral energy (derived from CACTI)
• $E_{overhead}$: extra energy introduced by the EWT circuits (HSPICE)
• $E_{cells} = N_{changed} \times E_{changed} + N_{unchanged} \times E_{unchanged}$
  – $E_{changed}$: cell change
  – $E_{unchanged}$: terminated cell change
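A minimal sketch of the write-energy model on this slide. Only the structure of the formula comes from the slide; the numbers in the example are placeholders in arbitrary energy units, and the 512-bit line size is an assumption.

```python
def ewt_write_energy(e_peripheral: float, e_overhead: float,
                     n_changed: int, e_changed: float,
                     n_unchanged: int, e_unchanged: float) -> float:
    """E_write(EWT) = E_peripheral + E_overhead + E_cells, where
    E_cells = N_changed * E_changed + N_unchanged * E_unchanged."""
    e_cells = n_changed * e_changed + n_unchanged * e_unchanged
    return e_peripheral + e_overhead + e_cells


# Placeholder numbers in arbitrary energy units: a 512-bit line write
# in which 64 bits change and 448 bit-writes are terminated early.
energy = ewt_write_energy(e_peripheral=100.0, e_overhead=5.0,
                          n_changed=64, e_changed=1.0,
                          n_unchanged=448, e_unchanged=0.125)
print(energy)
```

Because the per-access energy depends on the changed and unchanged bit counts, it varies from write to write, which is why the slide lists EWT write energy as variable.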
![Page 16: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/16.jpg)
Leakage Energy Modeling
• STT-RAM is non-volatile
  – Power-gate the idle banks
  – Assume 1ns delay to “wake up”
  – Used in both baseline and EWT
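A small sketch of how the wake-up assumption could enter a latency model. At the 1GHz clock given in the experimental setup, a 1ns wake-up corresponds to one cycle. The `Bank` class, its `gated` flag, and the decision of when to gate a bank are hypothetical and are not described on the slides.

```python
import math

CLOCK_GHZ = 1.0   # clock frequency from the experimental setup slide
WAKEUP_NS = 1.0   # wake-up delay assumed on this slide


class Bank:
    """Hypothetical cache bank that can be power-gated while idle."""

    def __init__(self) -> None:
        self.gated = True  # non-volatile cells keep their data while gated

    def access_cycles(self, base_cycles: int) -> int:
        """Latency of one access, adding the wake-up delay if gated."""
        extra = math.ceil(WAKEUP_NS * CLOCK_GHZ) if self.gated else 0
        self.gated = False
        return base_cycles + extra

    def go_idle(self) -> None:
        """Gate the bank again; when to do so is a policy choice."""
        self.gated = True
```

With a hypothetical base latency of 10 cycles, the first access to a gated bank takes 11 cycles and later accesses take 10 until the bank is gated again.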
![Page 17: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/17.jpg)
Experimental Setup
• Simics-based simulator
  – 4-core CMP, 1GHz
  – 32KB private L1 cache
  – 16MB shared L2 cache using STT-RAM, 16 banks
  – 4GB main memory
  – Enhanced cache model: STT-RAM & EWT
![Page 18: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/18.jpg)
Results: Performance
• Normalized cycles per instruction (CPI)
[Chart: normalized CPI; annotated 1% speedup.]
Slight performance improvement
![Page 19: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/19.jpg)
Results: Write Energy
• Normalized write energy
[Chart: normalized write energy; annotated 70% saving.]
Up to 80% write energy reduction
![Page 20: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/20.jpg)
Results: Dynamic Energy
• Normalized dynamic energy
[Chart: normalized dynamic energy, Base vs. EWT; annotated 52% reduction.]
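As a rough consistency check, the dynamic-energy figure can be related to two statements from earlier slides: writes make up 74.2% of baseline dynamic energy, and EWT leaves read energy unchanged. The sketch below assumes the 70% write-energy saving applies to that whole write share, which is an assumption about how the reported numbers combine rather than something the slides state.

```python
def dynamic_energy_reduction(write_share: float, write_saving: float) -> float:
    """Fraction of dynamic energy saved when only write energy shrinks.

    Baseline dynamic energy is normalized to 1, split into a write part
    (write_share) and a read part (1 - write_share) that EWT leaves as is.
    """
    new_total = (1.0 - write_share) + write_share * (1.0 - write_saving)
    return 1.0 - new_total


reduction = dynamic_energy_reduction(write_share=0.742, write_saving=0.70)
print(f"{reduction:.1%}")
```

Under these assumptions the estimate is 0.742 × 0.70 ≈ 0.519, close to the 52% reduction annotated on the chart.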
![Page 21: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/21.jpg)
Results: Total Energy
• Normalized total energy
[Chart: normalized total energy; annotated 33% reduction.]
![Page 22: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/22.jpg)
Results: Energy-Delay Product
• Normalized ED²
[Chart: normalized ED²; annotated 34% reduction.]
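A worked check of the reported figure, under three assumptions that the slides do not state: that the metric is energy × delay², that the 33% total-energy reduction and the 1% speedup can be combined directly, and that a 1% speedup corresponds to a normalized delay of 0.99.

```python
def normalized_ed2(energy_ratio: float, delay_ratio: float) -> float:
    """Energy-delay-squared product relative to a baseline of 1."""
    return energy_ratio * delay_ratio ** 2


ed2 = normalized_ed2(energy_ratio=1.0 - 0.33, delay_ratio=0.99)
print(f"ED2 reduction: {1.0 - ed2:.1%}")
```

With these inputs the normalized product is about 0.657, i.e. a reduction of roughly 34%, in line with the annotated result. Most of the gain comes from the energy term, since the delay changes by only 1%.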
![Page 23: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/23.jpg)
Conclusion
• Address a key challenge to STT-RAM cache: dynamic energy
• EWT: exploit redundant bit-writes without performance penalty
  – Low overhead and complexity
• Modeling and evaluation
  – Up to 80% write energy reduction
  – 34% ED² reduction
![Page 24: Energy Reduction for STT-RAM Using Early Write Termination](https://reader033.vdocuments.net/reader033/viewer/2022051402/56815884550346895dc5e67d/html5/thumbnails/24.jpg)
THANK YOU!