references - springer978-1-4757-5808-5/1.pdf · dependent circuits based on symbolic computation of...

References

[1] A. Agarwal, S. D. Pudar, "Column-Associative Caches: A Technique for Reducing the Miss Rate of Direct-Mapped Caches," ISCA-93: ACM/IEEE International Symposium on Computer Architecture, pp. 179-180, San Diego, CA, May 1993.

[2] D. H. Albonesi, "Selective Cache Ways: On-Demand Cache Resource Allocation," IEEE International Symposium on Microarchitecture, pp. 248-259, Haifa, Israel, November 1999.

[3] M. Alidina, J. Monteiro, S. Devadas, A. Ghosh, M. Papaefthymiou, "Precomputation-Based Sequential Logic Optimization for Low Power," IEEE Transactions on VLSI Systems, Vol. 2, No. 4, pp. 426-436, December 1994.

[4] ARM Corporation, ARM Software Development Toolkit, Version 2.50, Reference Guide, ARM DUI 0041C, Chapter 12, November 1998.

[5] Artisan Components, Process-Perfect SRAM Generator Datasheet, http://www.artisan.com, 1999.

[6] R. I. Bahar, E. T. Lampe, E. Macii, "Power Optimization of Technology-Dependent Circuits Based on Symbolic Computation of Logic Implications," ACM Transactions on Design Automation of Electronic Systems, Vol. 5, No. 3, pp. 267-293, July 2000.

[7] R. I. Bahar, H. Cho, G. D. Hachtel, E. Macii, F. Somenzi, "Symbolic Timing Analysis and Re-Synthesis for Low Power of Combinational Circuits Containing False Paths," IEEE Transactions on CAD/ICAS, Vol. 16, No. 10, pp. 1101-1115, October 1997.

[8] R. I. Bahar, G. Albera, S. Manne, "Power and Performance Tradeoffs Using Various Caching Strategies," ISLPED-98: ACM/IEEE International Symposium on Low Power Electronics and Design, pp. 64-69, Monterey, CA, August 1998.

[9] R. S. Bajwa, M. Hiraki, H. Kojima, D. J. Gorny, K. Nitta, A. Shridhar, K. Seki, K. Sasaki, "Instruction Buffering to Reduce Power in Processors for Signal Processing," IEEE Transactions on VLSI Systems, Vol. 5, No. 4, pp. 417-424, December 1998.

[10] N. Bellas, I. Hajj, C. Polychronopoulos, G. Stamoulis, "Architectural and Compiler Support for Energy Reduction in the Memory Hierarchy of High Performance Microprocessors," ISLPED-98: ACM/IEEE International Symposium on Low Power Electronics and Design, pp. 64-69, Monterey, CA, August 1998.

[11] L. Benini, P. Siegel, G. De Micheli, "Automatic Synthesis of Gated Clocks for Power Reduction in Sequential Circuits," IEEE Design and Test of Computers, Vol. 11, No. 4, pp. 32-40, December 1994.

[12] L. Benini, G. De Micheli, "State Assignment for Low Power Dissipation," IEEE Journal of Solid State Circuits, Vol. 30, No. 3, pp. 258-268, March 1995.

[13] L. Benini, G. De Micheli, "Transformation and Synthesis of FSMs for Low Power Gated Clock Implementation," IEEE Transactions on CAD/ICAS, Vol. 15, No. 6, pp. 630-643, June 1996.

[14] L. Benini, G. De Micheli, E. Macii, D. Sciuto, C. Silvano, "Asymptotic Zero-Transition Activity Encoding for Address Busses in Low-Power Microprocessor-Based Systems," GLS-VLSI-97: IEEE/ACM 7th Great Lakes Symposium on VLSI, pp. 77-82, Urbana-Champaign, IL, March 1997.

[15] L. Benini, G. De Micheli, E. Macii, D. Sciuto, C. Silvano, "Address Bus Encoding Techniques for System-Level Power Optimization," DATE-98: IEEE Design Automation and Test in Europe, pp. 861-866, Paris, France, February 1998.

[16] L. Benini, G. De Micheli, E. Macii, M. Poncino, S. Quer, "Power Optimization of Core-Based Systems By Address Bus Encoding," IEEE Transactions on VLSI Systems, Vol. 6, No. 4, pp. 554-562, December 1998.

[17] L. Benini, G. De Micheli, Dynamic Power Management of Electronic Systems, Kluwer Academic Publishers, 1998.

[18] L. Benini, F. Vermeulen, G. De Micheli, "Finite State Machine Partitioning for Low Power Consumption," ISCAS-98: IEEE International Symposium on Circuits and Systems, Vol. 2, pp. 5-8, Monterey, CA, May 1998.

[19] L. Benini, G. De Micheli, A. Lioy, E. Macii, G. Odasso, M. Poncino, "Synthesis of Power-Managed Sequential Components Based on Computational Kernel Extraction," IEEE Transactions on CAD/ICAS, Vol. 20, No. 9, pp. 1118-1131, September 2001.

[20] L. Benini, G. De Micheli, E. Macii, M. Poncino, R. Scarsi, "Symbolic Synthesis of Clock-Gating Logic for Power Optimization of Synchronous Controllers," ACM Transactions on Design Automation of Electronic Systems, Vol. 4, No. 4, pp. 351-375, October 1999.

[21] L. Benini, G. Paleologo, A. Bogliolo, G. De Micheli, "Policy Optimization for Dynamic Power Management," IEEE Transactions on CAD/ICAS, Vol. 18, No. 6, pp. 813-833, June 1999.

[22] L. Benini, A. Macii, E. Macii, M. Poncino, R. Scarsi, "Architectures and Synthesis Algorithms for Power-Efficient Bus Interfaces," IEEE Transactions on CAD/ICAS, Vol. 19, No. 9, pp. 969-980, September 2000.

[23] L. Benini, G. De Micheli, A. Macii, E. Macii, M. Poncino, R. Scarsi, "Glitch Power Minimization by Selective Gate Freezing," IEEE Transactions on VLSI Systems, Vol. 8, No. 3, pp. 287-299, June 2000.

[24] L. Benini, G. Castelli, A. Macii, E. Macii, R. Scarsi, "Battery-Driven Dynamic Power Management of Portable Systems," ISSS-00: IEEE International Symposium on System Synthesis, pp. 25-30, Madrid, Spain, September 2000.

[25] L. Benini, A. Bogliolo, G. De Micheli, "A Survey of Design Techniques for System-Level Dynamic Power Management," IEEE Transactions on VLSI Systems, Vol. 8, No. 3, pp. 299-316, June 2000.

[26] L. Benini, G. De Micheli, "System-Level Power Optimization: Techniques and Tools," ACM Transactions on Design Automation of Electronic Systems, Vol. 5, No. 2, pp. 115-192, April 2000.

[27] L. Benini, A. Macii, E. Macii, M. Poncino, "Increasing Energy Efficiency of Embedded Systems by Application-Specific Memory Hierarchy Generation," IEEE Design and Test of Computers, Vol. 17, No. 2, pp. 74-85, April 2000.

[28] L. Benini, A. Macii, E. Macii, M. Poncino, "Selective Instruction Compression for Memory Energy Reduction in Embedded Systems," ISLPED-99: ACM/IEEE International Symposium on Low Power Electronics and Design, pp. 206-211, San Diego, CA, August 1999.

[29] L. Benini, A. Macii, M. Poncino, "A Recursive Algorithm for Low-Power Memory Partitioning," ISLPED-00: ACM/IEEE International Symposium on Low Power Electronics and Design, pp. 78-83, Rapallo, Italy, July 2000.

[30] L. Benini, A. Macii, A. Nannarelli, "Cached-Code Compression for Energy Minimization in Embedded Processors," ISLPED-01: ACM/IEEE International Symposium on Low Power Electronics and Design, pp. 322-327, Huntington Beach, CA, August 2001.

[31] L. Benini, L. Macchiarulo, A. Macii, E. Macii, M. Poncino, "From Architecture to Layout: Partitioned Memory Synthesis for Embedded Systems-on-Chip," DAC-38: ACM/IEEE Design Automation Conference, pp. 784-789, Las Vegas, NV, June 2001.

[32] M. Borgatti, et al., "A 64-Min Single-Chip Voice Recorder/Player Using Embedded 4-b/cell FLASH Memory," IEEE Journal of Solid-State Circuits, Vol. 36, No. 3, pp. 516-521, March 2001.

[33] D. C. Burger, J. R. Goodman, A. Kagle, "Limited Bandwidth to Affect Processor Design," IEEE Micro, Vol. 17, No. 6, pp. 55-62, November-December 1997.

[34] D. C. Burger, Hardware Techniques to Improve the Performance of the Processor/Memory Interface, Ph.D. Dissertation, University of Wisconsin-Madison, 1998.

[35] P. Cappelletti, C. Golla, P. Olivo, E. Zanoni, Flash Memories, Kluwer Academic Publishers, 1999.

[36] F. Catthoor, S. Wuytack, E. De Greef, F. Balasa, L. Nachtergaele, A. Vandecappelle, Custom Memory Management Methodology: Exploration of Memory Organization for Embedded Multimedia System Design, Kluwer Academic Publishers, 1998.

[37] A. Chandrakasan, S. Sheng, R. W. Brodersen, "Low-Power CMOS Digital Design," IEEE Journal of Solid-State Circuits, Vol. 27, No. 4, pp. 473-484, April 1992.

[38] A. Chandrakasan, W. Bowhill, F. Fox, Design of High-Performance Microprocessor Circuits, IEEE Press, 2001.

[39] A. P. Chandrakasan, M. Potkonjak, R. Mehra, J. Rabaey, R. W. Brodersen, "Optimizing Power Using Transformations," IEEE Transactions on CAD/ICAS, Vol. 14, No. 1, pp. 12-31, January 1995.

[40] J. M. Chang, M. Pedram, "Low Power Register Allocation and Binding," DAC-32: ACM/IEEE Design Automation Conference, pp. 29-35, San Francisco, CA, June 1995.

[41] J. M. Chang, M. Pedram, "Module Assignment for Low Power," EuroDAC-96: IEEE European Design Automation Conference, pp. 376-381, Geneva, Switzerland, September 1996.

[42] H. Chang, et al., Surviving the SoC Revolution: A Guide to Platform-Based Design, Kluwer Academic Publishers, 1999.

[43] S. Y. Chiang, "Foundries and the Dawn of an Open IP Era," IEEE Computer, Vol. 34, No. 4, pp. 43-46, April 2001.

[44] S. H. Chow, Y. C. Ho, T. Hwang, C. L. Liu, "Lower Power Realization of Finite State Machines - A Decomposition Approach," ACM Transactions on Design Automation of Electronic Systems, Vol. 1, No. 3, pp. 315-340, July 1996.

[45] S. L. Coumeri, D. E. Thomas, "Memory Modeling for System Synthesis," ISLPED-98: ACM/IEEE International Symposium on Low Power Electronics and Design, pp. 179-184, Monterey, CA, August 1998.

[46] S. L. Coumeri, Modeling Memory Organizations for the Synthesis of Low Power Systems, Ph.D. Dissertation, EE and CS Dept., Carnegie Mellon University, May 1999.

[47] J. Davis II, et al., Overview of the Ptolemy Project, ERL Technical Report UCB/ERL No. M99/37, UC Berkeley, 1999.

[48] Dolphin Integration, Ragtime Embedded Memory Generators, 2001.

[49] A. Farrahi, G. Tellez, M. Sarrafzadeh, "Memory Segmentation to Exploit Sleep Mode Operation," DAC-32: ACM/IEEE Design Automation Conference, pp. 36-41, San Francisco, CA, June 1995.

[50] A. Farrahi, M. Sarrafzadeh, "System Partitioning to Maximize Sleep Time," ICCAD-95: IEEE/ACM International Conference on Computer-Aided Design, pp. 452-455, San Jose, CA, November 1995.

[51] B. R. Fisk, R. I. Bahar, "The Non-Critical Buffer: Using Load Latency Tolerance to Improve Data Cache Efficiency," ICCD-99: IEEE International Conference on Computer Design, pp. 538-545, Austin, TX, October 1999.

[52] D. Flynn, "AMBA: Enabling Reusable On-Chip Designs," IEEE Micro, Vol. 17, No. 4, pp. 20-27, July-August 1997.

[53] D. Frank, R. Dennard, E. Novak, P. Solomon, Y. Taur, H. S. Wong, "Device Scaling Limits of Si MOSFETs and Their Application Dependencies," Proceedings of the IEEE, Vol. 89, No. 3, pp. 259-288, March 2001.

[54] D. D. Gajski, N. D. Dutt, A. C. H. Wu, S. Y.-L. Lin, High-Level Synthesis - Introduction to Chip and System Design, Kluwer Academic Publishers, 1992.

[55] Gartner, Inc., Final 2000 Worldwide Semiconductor Market Share, 2000.

[56] C. Gebotys, "Low Energy Memory and Register Allocation Using Network Flow," DAC-34: ACM/IEEE Design Automation Conference, pp. 435-440, Anaheim, CA, June 1997.

[57] J. D. Gee, M. D. Hill, D. N. Pnevmatikatos, A. J. Smith, "Cache Performance of the SPEC Benchmark Suite," IEEE Micro, Vol. 13, No. 4, pp. 17-27, August 1993.

[58] Goldman-Sachs Technical Report, Wireless Wave II - The Data Wave Unplugged, 1999.

[59] A. Gonzalez, C. Aliagas, M. Valero, "A Data-Cache with Multiple Caching Strategies Tuned to Different Types of Locality," ICS-95: ACM International Conference on Supercomputing, pp. 338-347, Barcelona, Spain, July 1995.

[60] P. Grun, N. Dutt, A. Nicolau, "Access Pattern Based Local Memory Customization for Low-Power Embedded Systems," DATE-01: IEEE Design Automation and Test in Europe, pp. 778-784, Munich, Germany, March 2001.

[61] M. Gumm, VHDL-Modeling and Synthesis of the DLX RISC Processor, University of Stuttgart, Department of Integrated Systems Engineering, Stuttgart, Germany, 1995.

[62] G. D. Hachtel, M. Hermida, A. Pardo, M. Poncino, F. Somenzi, "Re-Encoding Sequential Circuits to Reduce Power Dissipation," ICCAD-94: IEEE/ACM International Conference on Computer-Aided Design, pp. 70-73, San Jose, CA, November 1994.

[63] A. Hasegawa, et al., "SH3: High Code Density, Low Power," IEEE Micro, Vol. 15, No. 6, pp. 11-19, December 1995.

[64] J. L. Hennessy, D. A. Patterson, Computer Architecture - A Quantitative Approach, II Edition, Morgan Kaufmann Publishers, 1996.

[65] C. H. Hwang, A. C. H. Wu, "A Predictive System Shutdown Method for Energy Saving of Event-Driven Computation," ICCAD-97: IEEE/ACM International Conference on Computer-Aided Design, pp. 28-32, San Jose, CA, November 1997.

[66] IBM Blue Logic Technology, http://www.chips.ibm.com/bluelogic

[67] S. Iman, M. Pedram, "Multi-Level Network Optimization for Low Power," ICCAD-94: IEEE/ACM International Conference on Computer-Aided Design, pp. 372-377, San Jose, CA, November 1994.

[68] S. Iman, M. Pedram, "POSE: Power Optimization and Synthesis Environment," DAC-33: ACM/IEEE Design Automation Conference, pp. 21-26, Las Vegas, NV, June 1996.

[69] K. Inoue, T. Ishihara, K. Murakami, "Way-Predicting Set-Associative Cache for High-Performance and Low-Energy Consumption," ISLPED-99: ACM/IEEE International Symposium on Low Power Electronics and Design, pp. 273-275, San Diego, CA, August 1999.

[70] L. K. John, A. Subramanian, "Design and Performance Evaluation of a Cache Assist to Implement Selective Caching," ICCD-97: IEEE International Conference on Computer Design, pp. 510-518, Austin, TX, October 1997.

[71] N. Jouppi, "Improving Direct-Mapped Cache Performance by the Addition of a Small Fully-Associative Cache and Pre-Fetch Buffer," ISCA-90: ACM/IEEE International Symposium on Computer Architecture, pp. 364-373, Seattle, WA, May 1990.

[72] M. B. Kamble, K. Ghose, "Analytical Energy Dissipation Models for Low-Power Caches," ISLPED-97: ACM/IEEE International Symposium on Low Power Electronics and Design, pp. 143-148, Monterey, CA, August 1997.

[73] G. Kane, J. Heinrich, MIPS RISC Architecture, Prentice Hall, 1994.

[74] D. Keitel-Schulz, N. Wehn, "Embedded DRAM Development: Technology, Physical Design and Application Issues," IEEE Design and Test of Computers, Vol. 18, No. 3, pp. 7-15, May-June 2001.

[75] K. Keutzer, A. Newton, J. Rabaey, A. Sangiovanni-Vincentelli, "System-Level Design: Orthogonalization of Concerns and Platform-Based Design," IEEE Transactions on CAD/ICAS, Vol. 19, No. 12, pp. 1523-1543, December 2000.

[76] D. Kim, K. Choi, "Power-Conscious High-Level Synthesis Using Loop Folding," DAC-34: ACM/IEEE Design Automation Conference, pp. 441-445, Anaheim, CA, June 1997.

[77] J. Kin, M. Gupta, W. Mangione-Smith, "The Filter Cache: An Energy Efficient Memory Structure," MICRO-30: Annual IEEE/ACM International Symposium on Microarchitecture, pp. 184-193, Research Triangle Park, NC, December 1997.

[78] D. Kirovski, C. Lee, M. Potkonjak, W. Mangione-Smith, "Synthesis of Power Efficient Systems-on-Silicon," ASP-DAC-98: IEEE Asian and South Pacific Design Automation Conference, pp. 557-562, Yokohama, Japan, February 1998.

[79] U. Ko, P. T. Balsara, A. K. Nanda, "Energy Optimization of Multilevel Cache Architectures for RISC and CISC Processors," IEEE Transactions on VLSI Systems, Vol. 6, No. 2, pp. 299-308, June 1998.

[80] S. Komatsu, M. Ikeda, K. Asada, "Low Power Chip Interface Based on Bus Data Encoding with Adaptive Code-Book Method," GLS-VLSI-99: ACM/IEEE Great Lakes Symposium on VLSI, pp. 368-371, Ypsilanti, MI, March 1999.

[81] B. Kumthekar, I. H. Moon, F. Somenzi, "A Symbolic Algorithm for Low-Power Sequential Synthesis," ISLPED-97: ACM/IEEE International Symposium on Low Power Electronics and Design, pp. 56-61, Monterey, CA, August 1997.

[82] A. Kunimatsu, et al., "Vector Unit Architecture for Emotion Synthesis," IEEE Micro, Vol. 20, No. 2, pp. 40-47, March-April 2000.

[83] Intel, "Intel XScale™ Microarchitecture Technical Summary," http://www.intel.com/design/intelxscale.

[84] K. Itoh, K. Sasaki, Y. Nakagome, "Trends in Low-Power RAM Circuit Technologies," Proceedings of the IEEE, Vol. 83, No. 4, pp. 524-543, April 1995.

[85] G. Jackson, et al., "An Analog Record, Playback and Processing System on a Chip for Mobile Communications Devices," IEEE Custom Integrated Circuits Conference, pp. 99-102, San Diego, CA, May 1999.

[86] T. Juan, T. Lang, J. J. Navarro, "Reducing TLB Power Requirements," ISLPED-97: ACM/IEEE International Symposium on Low Power Electronics and Design, pp. 196-201, Monterey, CA, August 1997.

[87] P. Laramie, Instruction-Level Power Analysis and Low Power Design Methodology of a Core Processor, Master Thesis, UC Berkeley, 1998.

[88] H. S. Lee, G. S. Tyson, "Region-Based Caching: An Energy-Delay Efficient Memory Architecture for Embedded Processors," IEEE International Conference on Compilers, Architecture and Synthesis for Embedded Systems, pp. 120-127, November 2000.

[89] T. C. Lee, S. Malik, V. Tiwari, M. Fujita, "Power Analysis and Minimization Techniques for Embedded DSP Software," IEEE Transactions on VLSI Systems, Vol. 5, No. 1, pp. 123-135, March 1997.

[90] H. Lekatsas, W. Wolf, "Code Compression for Low Power Embedded Systems," DAC-97: ACM/IEEE Design Automation Conference, pp. 294-299, Los Angeles, CA, June 2000.

[91] Y. Li, J. Henkel, "A Framework for Estimating and Minimizing Energy Dissipation of Embedded HW/SW Systems," DAC-95: ACM/IEEE Design Automation Conference, pp. 188-193, San Francisco, CA, June 1998.

[92] S. Y. Liao, S. Devadas, K. Keutzer, "Code Density Optimization for Embedded DSP Processors Using Data Compression Techniques," IEEE Transactions on CAD/ICAS, Vol. 17, No. 7, pp. 601-608, July 1998.

[93] D. Lidsky, J. Rabaey, "Low-Power Design of Memory Intensive Functions," IEEE Symposium on Low Power Electronics, pp. 16-17, San Diego, CA, September 1994.

[94] E. Macii, M. Pedram, F. Somenzi, "High-Level Power Modeling, Estimation, and Optimization," IEEE Transactions on CAD/ICAS, Vol. 17, No. 11, pp. 1061-1079, November 1998.

[95] K. Mai, et al., "Smart Memories: A Modular Reconfigurable Architecture," ISCA-00: ACM/IEEE International Symposium on Computer Architecture, pp. 161-171, Vancouver, BC, 2000.

[96] H. Mehta, R. M. Owens, M. J. Irwin, "Some Issues in Gray Code Addressing," GLS-VLSI-96: ACM/IEEE Great Lakes Symposium on VLSI, pp. 178-180, Ames, IA, March 1996.

[97] J. Meindl, "Low Power Microelectronics: Retrospect and Prospect," Proceedings of the IEEE, Vol. 83, No. 4, pp. 619-635, April 1995.

[98] V. Milutinovic, B. Markovic, M. Tomasevic, M. Tremblay, "The Split Temporal/Spatial Cache: A Complexity Analysis," SCIzzL-6 Workshop, pp. 89-96, Santa Clara, CA, September 1996.

[99] J. Monteiro, S. Devadas, A. Ghosh, "Retiming Sequential Circuits for Low Power," ICCAD-93: IEEE/ACM International Conference on Computer-Aided Design, pp. 398-402, Santa Clara, CA, November 1993.

[100] J. Monteiro, S. Devadas, P. Ashar, A. Mauskar, "Scheduling Techniques to Enable Power Management," DAC-33: ACM/IEEE Design Automation Conference, pp. 349-352, Las Vegas, NV, June 1996.

[101] J. Monteiro, S. Devadas, A. Ghosh, "Sequential Logic Optimization for Low Power Using Input-Disabling Precomputation Architectures," IEEE Transactions on CAD/ICAS, Vol. 17, No. 3, pp. 279-284, March 1998.

[102] J. Monteiro, A. Oliveira, "Finite State Machine Decomposition for Low Power," DAC-35: ACM/IEEE Design Automation Conference, pp. 763-768, San Francisco, CA, June 1998.

[103] S. Muchnick, Advanced Compiler Design & Implementation, Morgan Kaufmann, 1997.

[104] M. Munch, B. Wurth, R. Mehra, J. Sproch, N. Wehn, "Automatic RT-Level Operand Isolation to Minimize Power Consumption in Datapaths," DATE-00: IEEE Design Automation and Test in Europe, pp. 624-631, Paris, France, March 2000.

[105] E. Musoll, J. Cortadella, "Scheduling and Resource Binding for Low Power," ISSS-95: IEEE International Symposium on System Synthesis, pp. 104-109, Cannes, France, April 1995.

[106] E. Musoll, T. Lang, J. Cortadella, "Working-Zone Encoding for Reducing the Energy in Microprocessor Address Buses," IEEE Transactions on VLSI Systems, Vol. 6, No. 4, pp. 568-572, December 1998.

[107] L. Nachtergaele, F. Catthoor, C. Kulkarni, "Random-Access Data Storage Components in Customized Architectures," IEEE Design and Test of Computers, Vol. 18, No. 3, pp. 40-54, May-June 2001.

[108] D. Pan, "A Tutorial on MPEG/Audio Compression," IEEE Multimedia, Vol. 2, No. 2, pp. 60-74, Summer 1995.

[109] C. Panasik, "Overcoming Obstacles to 3G Wireless Technology," Communication System Design, Vol. 7, No. 1, January 2001.

[110] P. R. Panda, N. Dutt, A. Nicolau, "Efficient Utilization of Scratch-Pad Memories in Embedded Processors," EDTC-97: IEEE European Design and Test Conference, pp. 7-11, Paris, France, March 1997.

[111] P. Panda, N. Dutt, Memory Issues in Embedded Systems-on-Chip Optimization and Exploration, Kluwer Academic Publishers, 1999.

[112] P. Panda, N. Dutt, A. Nicolau, "On-Chip vs. Off-Chip Memory: The Data Partitioning Problem in Embedded Processor-Based Systems," ACM Transactions on Design Automation of Electronic Systems, Vol. 5, No. 3, pp. 682-704, July 2001.

[113] P. R. Panda, F. Catthoor, N. D. Dutt, K. Danckaert, E. Brockmeyer, C. Kulkarni, A. Vandecappelle, P. G. Kjeldsberg, "Data and Memory Optimization Techniques for Embedded Systems," ACM Transactions on Design Automation of Electronic Systems, Vol. 6, No. 2, pp. 149-206, April 2001.

[114] R. Panwar, D. Rennels, "Reducing the Frequency of Tag Compares for Low Power I-Cache Design," ISLPD-95: ACM/IEEE International Symposium on Low Power Design, pp. 57-62, Dana Point, CA, April 1995.

[115] C. Passerone, L. Lavagno, C. Sansoe, M. Chiodo, A. Sangiovanni, "Trade-Off Evaluation in Embedded System Design via Co-simulation," ASP-DAC-97: IEEE Asia South Pacific Design Automation Conference, pp. 291-297, Chiba, Japan, January 1997.

[116] D. Patterson, et al., "The Case for Intelligent RAM," IEEE Micro, Vol. 17, No. 2, pp. 34-44, March-April 1997.

[117] M. Powell, S. H. Yang, B. Falsafi, K. Roy, N. Vijaykumar, "Reducing Leakage in a High-Performance Deep-Submicron Instruction Cache," IEEE Transactions on VLSI Systems, Vol. 9, No. 1, pp. 77-89, February 2001.

[118] B. Prince, Semiconductor Memories, 2nd Ed., John Wiley & Sons, 1997.

[119] A. Raghunathan, S. Dey, N. Jha, "Glitch Analysis and Reduction in Register Transfer Level Power Optimization," DAC-33: ACM/IEEE Design Automation Conference, pp. 331-336, Las Vegas, NV, June 1996.

[120] R. Rajsuman, "Design and Test of Large Embedded Memories: An Overview," IEEE Design and Test of Computers, Vol. 18, No. 3, pp. 16-27, May-June 2001.

[121] S. Ramprasad, N. Shanbhag, I. Hajj, "Signal Coding for Low Power: Fundamental Limits and Practical Realizations," ISCAS-98: IEEE International Symposium on Circuits and Systems, pp. 1-4, Monterey, CA, May 1998.

[122] K. Roy, S. C. Prasad, "Circuit Activity Based Synthesis for Low Power Reliable Operations," IEEE Transactions on VLSI Systems, Vol. 1, No. 4, pp. 503-513, December 1993.

[123] M. Schlett, "Trends in Embedded Microprocessor Design," IEEE Computer, Vol. 31, No. 8, pp. 44-49, August 1998.

[124] S. Segars, K. Clarke, L. Goudge, "Embedded Control Problems, Thumb and the ARM7TDMI," IEEE Micro, Vol. 15, No. 5, pp. 22-30, October 1995.

[125] S. Segars, "The ARM9 Family - High Performance Microprocessors for Embedded Applications," ICCD-98: IEEE International Conference on Computer Design, pp. 230-235, Austin, TX, October 1998.

[126] Semiconductor Industry Association, 1999 International Technology Roadmap for Semiconductors, http://public.itrs.net.

[127] A. Seznec, "A Case for Two-Way Skewed-Associative Caches," ISCA-93: ACM/IEEE International Symposium on Computer Architecture, pp. 169-178, San Diego, CA, May 1993.

[128] Y. Shin, S.-K. Chae, K. Choi, "Partial Bus-Invert Coding for Power Optimization of System-Level Buses," ISLPED-98: ACM/IEEE International Symposium on Low Power Electronics and Design, pp. 127-129, Monterey, CA, August 1997.

[129] W. Shiue, C. Chakrabarti, "Memory Exploration for Low Power, Embedded Systems," DAC-36: ACM/IEEE Design Automation Conference, pp. 140-145, New Orleans, LA, June 1999.

[130] R. Siegmund, C. Kretzschmar, D. Müller, "Adaptive Partial Bus Invert for Power Efficient Data Transfer over Wide System Buses," SBCCI-00: Symposium on Integrated Circuit and System Design, pp. 371-376, Manaus, Brazil, August 2000.

[131] M. Srivastava, A. P. Chandrakasan, R. W. Brodersen, "Predictive System Shutdown and Other Architectural Techniques for Energy Efficient Programmable Computation," IEEE Transactions on VLSI Systems, Vol. 4, No. 1, pp. 42-55, March 1996.

[132] M. Stan, W. Burleson, "Bus-Invert Coding for Low-Power I/O," IEEE Transactions on VLSI Systems, Vol. 3, No. 1, pp. 49-58, January 1995.

[133] M. Stan, W. Burleson, "Low-Power Encodings for Global Communication in CMOS VLSI," IEEE Transactions on VLSI Systems, Vol. 5, No. 4, pp. 444-455, December 1997.

[134] C. L. Su, C. Y. Tsui, A. M. Despain, "Saving Power in the Control Path of Embedded Processors," IEEE Design and Test of Computers, Vol. 11, No. 4, pp. 24-30, Winter 1994.

[135] C. L. Su, A. M. Despain, "Cache Design Trade-Offs for Power and Performance Optimization: A Case Study," ISLPD-95: ACM/IEEE International Symposium on Low Power Design, pp. 63-68, Dana Point, CA, April 1995.

[136] M. Suzuoki, et al., "A Microprocessor with a 128-bit CPU, Ten Floating-Point MACs, Four Floating-Point Dividers and an MPEG-2 Decoder," IEEE Journal of Solid-State Circuits, Vol. 34, No. 11, pp. 1608-1618, November 1999.

[137] M. Takahashi, et al., "A 60-MHz 240-mW MPEG-4 Videophone LSI with 16-Mb embedded DRAM," IEEE Journal of Solid-State Circuits, Vol. 35, No. 11, pp. 1713-1721, November 2000.

[138] V. Tiwari, S. Malik, A. Wolfe, "Power Analysis of Embedded Software: A First Step Towards Software Power Minimization," IEEE Transactions on VLSI Systems, Vol. 2, No. 4, pp. 437-445, December 1994.

[139] V. Tiwari, S. Malik, P. Ashar, "Guarded Evaluation: Pushing Power Management to Logic Synthesis/Design," IEEE Transactions on CAD/ICAS, Vol. 17, No. 10, pp. 1051-1060, November 1998.

[140] H. V. Tran, et al., "A 2.5-V, 256-Level Nonvolatile Analog Storage Device using EEPROM Technology," IEEE International Solid-State Circuits Conference, pp. 270-271, San Francisco, CA, February 1996.

[141] C. Y. Tsui, M. Pedram, A. M. Despain, "Technology Decomposition and Mapping Targeting Low Power Dissipation," DAC-30: ACM/IEEE Design Automation Conference, pp. 68-73, Dallas, TX, June 1993.

[142] C. Y. Tsui, M. Pedram, A. M. Despain, "Low Power State Assignment Targeting Two- and Multi-Level Logic Implementations," ICCAD-94: IEEE/ACM International Conference on Computer-Aided Design, pp. 82-87, San Jose, CA, November 1994.

[143] UMC, Embedded 6T Static RAM Macros Datasheet, http://www.umc.com, 1999.

[144] Virage Logic, Custom-Touch Memory Compiler Datasheet, http://www.viragelogic.com, 1999.

[145] E. Vittoz, "Low Power Microelectronics: Ways to Approach the Limits," International Conference on Solid-State Circuits, pp. 14-18, San Francisco, CA, January 1994.

[146] S. J. Walsh, J. A. Board, "Pollution Control Caching," ICCD-95: IEEE International Conference on Computer Design, pp. 300-306, Austin, TX, October 1995.

[147] S. J. E. Wilton, N. P. Jouppi, "CACTI: An Enhanced Cache Access and Cycle Time Model," IEEE Journal of Solid-State Circuits, Vol. 31, No. 5, pp. 677-687, May 1996.

[148] Y. Yoshida, B. Song, H. Okuhata, T. Onoye, I. Shirakawa, "An Object Code Compression Approach to Embedded Processors," ISLPED-97: ACM/IEEE International Symposium on Low Power Electronics and Design, pp. 265-268, Monterey, CA, August 1997.

[149] K. Yoshikawa, "Embedded Flash Memories - Technology Assessment and Future," IEEE International Symposium on VLSI Technology, Systems and Applications, pp. 183-186, Taipei, Taiwan, June 1999.

[150] V. Zyuban, P. Kogge, "The Energy Complexity of Register Files," ISLPED-98: ACM/IEEE International Symposium on Low Power Electronics and Design, pp. 305-310, Monterey, CA, August 1998.

Index

AAC LC, 31 ADPCM, 33 AGP, 17 ALU, 13 AMBA, 18 AMR, 31, 33 ARM, 17, 48, 83, 88

ARM7TDMI, 104, 111 Thumb, 104

ARMulator, 89, 96 ASIC, 13

design, 14, 16 market, 14

ASM, 69 Access profile, 74 Adaptive encoding, 49 Address decoder, 72 Annex cache, 45 Application-specific memory, 69 Average power, 4 Back-end flow, 73 Bandwidth, 46

optimization, 46 Basic block, 46, 49, 119 Battery

life-time, 4 Bin-packing, 85 Bipartitioning, 75 Block halos, 85 Blue Logic, 15 Branch instruction, 107 Branch target, 108 Buffer

compressed instruction, 111 non-critical, 45 pre-decoded, 46 scratch-pad, 46 speculative, 45

Bus invert (BI), 49 clustered, 49

partitioned, 49 Bus, 7

address, 49, 87 data, 87 encoding, 7, 48 energy, 99

CAS, 39 CIB, 111 CMOS, 4-5

variable threshold, 32 CPU, 88

core, 89 Cache

annex, 45 associativity, 42 column associative, 50 hit rate, 39 hit ratio, 45 line size, 42 loop, 46 miss rate, 50 replacement policy, 42 skewed-associative, 50 spatial, 44 sub-banking, 43 traffic-efficient, 47 victim, 45 way-predicting, 45

Clock, 4 frequency, 4

Clock-gating, 9 Code

compression, 47 density, 47 reordering, 46

Codebook-based encoding, 50 Column-associative cache, 50 Compression, 47

ratio, 105, 108 schemes, 105

Computer-aided design (CAD), 8 Conflict misses, 50 Coprocessor, 27 Core processor, 10, 40, 73 Correlation, 48

spatial, 44 spatio-temporal, 49 temporal, 44

Critical path, 88 DCT, 32 DLX, 104 DMA, 29, 32, 35 DRAM, 13, 15, 39

Rambus, 29 embedded, 20, 23, 26, 31

DSPs, 7 Data encoding, 7 Data

compression, 37 encoding, 48 transfer optimization, 48

Decompression unit, 109 Deep-submicron (DSM), 73 Design closure, 73 Design

cycles, 14 high-level, 5 system-level, 6

Dynamic access profile, 71, 73 EEPROM, 23, 25 Electronic Design Automation (EDA), 17, 71

tools, 17

Embedded DRAM, 23 Embedded SRAM, 42 Embedded system, 6-7, 9 Embedded

SRAM, 70 application, 10, 69, 72-73, 91 memory, 19 processors, 2 software, 8

Embedded-system real-time, 69

Emotion engine, 27 Encoding, 48

adaptive, 49 bus-invert, 49 codebook-based, 50

Energy management, 7, 9 Energy,4

bus energy, 104 fetch, 114 instruction decompression energy, 104 model, 74 optimization, 102

Energy-aware scheduling, 9

Entropy, 49 FGMOS, 25, 33 FIFO, 31 FLASH, 23, 25, 33, 104, 117 FMAC, 29 FSM decomposition, 9 Finite state machine (FSM), 9 Flat memory, 37 Floating-gate transistor, 25 Floating-point, 27 Floorplan, 23, 73 Floorplanning, 82, 84 Foundry verified qualification, 17 GFLOPS, 30 Gate freezing, 9 Gate resizing, 9 Glitch, 9

filtering, 9 Gray code, 49 Guarded evaluation, 9 HW/SW partitioning, 18 Hamming distance, 49 Hard macros, 17, 71 Hardware prefetching, 47 Hardware synthesis, 8 High-level design, 5 Hit rate, 39 IBM, 15 IDT, 101, 111 ILP, 46 IP, 11, 16

qualification, 17 vendors, 17

ITU H.223, 31

ITU-T G.726, 35

Insertion sort, 96 Instruction compression, 11 Instruction, 99

compression, 104 compression, 99 decompression table, 101 decompression, 104 fetching/decompression logic, 101

Instruction-level parallelism, 46 Instruction-level simulator, 71, 74 Intel, 13 Intellectual property (IP), 11, 16 Kernel extraction, 9 LSI, 13 Layout, 70, 82

valid, 89 Load capacitance, 4 Locality

temporal, 37 Logical partitioning, 43

Loop cache, 46 MIPS, 28, 100, 104

DLX, 111 R4000, 100

MPEG, 70 MPEG2, 30 MPEG4, 30 MUX

output, 84 Mark, 104, 108, 119 Memory generator, 22 Memory

access trace, 89 application-specific, 69 architecture, 10 bandwidth, 19, 46 cut, 74, 84, 89, 92 dedicated-process, 20 embedded, 19 energy, 9, 74, 99 fetch energy, 104 flat, 37 generator, 22, 74, 83-84 hierarchy design, 40 hierarchy, 38 interface optimization, 48 interface, 48 latency, 46-47 market, 14 non volatile, 20 non-volatile, 25 partitioning, 10, 43 process-compatible, 20 processor interface, 99 read energy, 104 select signal, 83 traffic, 11, 47, 105, 114 usage, 105, 116 volatile, 20

Microcontrollers, 7, 13 Moore's law, 2, 13 Multi-chip modules (MCM), 25 NMOS, 5, 21 Non-critical buffer, 45 OS, 126 Operand isolation, 9 Operating system, 126 Over-the-cell routing, 22 PCB, 19, 25 PCI, 17 PCM, 35 PMI, 46 PMOS, 21 Package, 3 Parasitics, 89 Partitioning

logical, 43

physical, 43 Peak power, 4 Physical design, 70 Physical partitioning, 43 Place and Route (P&R), 73, 82, 84-85 Placement and routing (P&R), 84 Placement, 73, 85

automatic, 85 floorplan-directed, 85 legal, 85

Platform-based design, 17 Playstation 2, 27 Power

average, 4 distribution, 84, 87 management, 7, 9 metrics, 3 peak, 4 short-circuit, 5 switching, 4

Power-delay product, 4 PowerPC, 17 Pre-decoded instruction buffers, 46 Pre-silicon qualification, 17 Precomputation, 9 Prefetching, 47 Processor

core, 10, 40, 69, 71, 73-74 market, 14

Production rating, 17 Profiling, 100 Ptolemy, 92 QCIF, 33 RAM macro compilers, 71 RAM, 104 RAS, 39 RISC, 27-28, 31, 48 ROM, 15, 23, 104, 117 Rambus, 29 Random White Noise (RWN), 48 Real-time embedded systems, 69 Region-based caching, 44 Register files, 42 Resource allocation, 9 Retiming, 9 Routing, 73, 86

block, 84 cell, 84

SIMD, 28 SRAM, 13, 15, 70-71, 104

design view, 22 embedded, 42, 70 frame view, 22 generator, 83 low-leakage, 20 on-chip, 73-74 power view, 22

segmented, 44 ST, 83, 89 STG restructuring, 9 Scheduling

energy-aware, 9 Scratch-pad buffer, 46 Scratch-pad memory, 46 Segmented SRAM, 44 Sense amplifiers, 22 Shift registers, 13 Short-circuit power, 5 Signal integrity, 4 Silicon fabs, 17 Skewed-associative cache, 50 Sleep mode, 43 Soft macros, 71 Sony, 27 Spatial cache, 44 Spatio-temporal correlation, 49 Speculative buffer, 45 State re-encoding, 9 State transition graph (STG), 9 Super-block, 120 Supply voltage, 4 Switching activity, 4, 85, 87, 92 Switching power, 4 Synthesis, 8

logic, 89

physical, 89 System

architecture, 7 design, 8

System-level design, 6 System-level exploration, 91 System-on-Chip (SoC), 5, 10-11, 15, 69-70 TLB, 42 Temporal cache, 44 Temporal locality, 37 Temporal

cache, 44 Thumb instruction set, 48 Toggle count, 92 Toshiba, 27 Transformations

computation, 40 Translation look-aside buffer, 42 Twin-VQ, 31 VLIW processor, 47 Variable supply voltage, 8 Vector processing unit (VPU), 27 Verilog, 83-84, 89 Victim cache, 45 Way-predicting cache, 45 Wire length, 85-86 Tapeout, 14 X86, 13