References
1. Accessed 2007/4/7: http://www.xilinx.com/products.
2. Alan Marshall, Tony Stansfield, Igor Kostarnov, Jean Vuillemin and Brad
Hutchings, “The high level synthesis of digital systems,” Proceedings of IEEE,
vol. 78, no. 2, pp. 301–318, 1990.
3. Annapolis Micro Systems Inc., The FPGA Performance Company:
www.annapmicro.com/
4. Rajopadhye, S.V., S. Purushothaman and R.M. Fujimoto, “Synthesizing Systolic
Arrays from Recurrences Equations with Linear Dependencies,” Proceedings of
Sixth Conference on Foundations of Software Technology and Theoretical Computer
Science, LNCS, Springer Verlag, vol. 241, pp. 488-503, 1986.
5. Baganne, A., I. Bennour, M. Elmarzougui, R. Gaiech, and E. Martin, “A
Multi-Level Design Flow for Incorporating IP cores: Case study of 1-D Wavelet IP
Integration,” Proceedings of Design, Automation and Test Conference and
Exhibition, pp. 250-255, 2003.
6. Banerjee, P., M. Haldar, A. Nayak, V. Kim, V. Saxena and V. S. Parkes,
“Overview of a Compiler for Synthesizing Matlab Programs onto FPGAs,” IEEE
Transactions on Very Large Scale Integration (VLSI) Systems, vol. 12, no. 3, pp.
312-324, 2004.
7. Banerjee, P., N. Shenoy, A. Choudhary, S. Hauck, C. Bachmann, M. Haldar,
P. Joisha, A. Jones, A. Kanhare, A. Nayak, S. Periyacheri, M. Walkden and
D. Zaretsky, “A Matlab Compiler for Distributed, Heterogeneous, Reconfigurable
Computing Systems,” Proceedings of IEEE Symposium on Field-Programmable
Custom Computing Machines, pp. 39-48, 2000.
8. Bednara, M. and J. Teich, “Interface synthesis for FPGA based VLSI processor
arrays,” Proceedings of the International Conference on Engineering of
Reconfigurable Systems and Algorithms, ERSA02, pp. 24-27, 2002.
9. Bednara, M. and J. Teich, “Interface synthesis for FPGA based VLSI processor
arrays,” Proceedings of the International Conference on Engineering of
Reconfigurable Systems and Algorithms, ERSA02, pp. 24-27, 2002.
10. Beletska, A., W. Bielecki, A. Cohen, M. Palkowski and K. Siedlecki, “Coarse
Grained Loop Parallelization: Iteration Space Slicing vs Affine Transformations,”
Journal of Parallel Computing, vol. 37, no. 8, pp. 479-497, 2011.
11. Benaini, A. and Y. Robert, “Space-Minimal Systolic Arrays for Gaussian
Elimination and the Algebraic Path Problem,” Journal of Parallel Computers, vol.
15, pp. 211-225, 1990.
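12. Benedetti, A. and P. Perona, “Real-Time 2-D Feature Detection on a
Reconfigurable Computer,” IEEE Proceedings of Conference Computer
Vision and Pattern Recognition, pp. 586-593, 1998.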
13. Benitez, D. and J. Cabrera, “Reactive Computer Vision System with Reconfigurable
Architecture,” Lecture Notes in Computer Science, Springer-Verlag, vol. 1542, pp.
348-360, Proceedings of International Conference on Vision Systems (ICVS'99),
1999.
14. Bermond, J.C., C. Peyrat, I. Sakho and M. Tchuenté, “Parallelization of the Gauss
elimination on systolic arrays,” Internal Report, LRI, 1988.
15. Bermond, J.C., C. Peyrat, I. Sakho and M. Tchuenté, “Parallelization of the
Gaussian elimination on systolic arrays,” Journal of Parallel and Distributed
Computers, vol. 33, pp. 69-75, 1996.
16. Bondalapati, K. and V. K. Prasanna, “Reconfigurable Computing Systems,”
Proceedings of IEEE, pp. 1201-1217, 2002.
17. Bondhugula, U., J. Ramanujam and P. Sadayappan, “Automatic Mapping of
Nested Loops to FPGAs,” PPoPP ’07, ACM, pp. 101-111, 2007.
18. Brunelli, C., F. Garzia, D. Rossi and J. Nurmi, “A Coarse-Grain Reconfigurable
Architecture for Multimedia Applications supporting Sub-word and Floating-point
Calculations,” Journal of Systems Architecture, vol. 56, no. 1, pp. 38-47, 2010.
19. Bu, J. and E.F. Deprettere, “Processor Clustering for the Design of Optimal
Fixed-Size Systolic Arrays,” Proceedings of the Sixth International Parallel
Processing Symposium, pp. 275-282, 1992.
20. Bu, J. and E.F. Deprettere, “Processor Clustering for the Design of Optimal
Fixed-Size Systolic Arrays,” Proceedings of the Sixth International Parallel
Processing Symposium, pp. 275-282, 1992.
21. Djamegni, C.T., “Matrix product on modular linear systolic arrays for algorithms
with affine schedule,” Journal of Parallel and Distributed Computing, vol. 66, no.
3, pp. 323-333, 2006.
22. Cameron project, Colorado State University: www.cs.colostate.edu/cameron.
23. Cappello, P., “A Processor-Time Minimal Systolic Array for Cubical Mesh
Algorithms,” IEEE Transactions on Parallel and Distributed Systems, vol. 3, no.
1, pp. 4-13, 1992.
24. Cappello, P., “Application-Specific Processor Architecture: Then and Now,”
Journal of Signal Processing Systems, Springer Science, vol. 53, pp. 197–215,
2008.
25. Casseau, E. and B. L. Gal, “Design of Multi-Mode Application-Specific Cores
based on High-Level Synthesis,” Journal of Integration, the VLSI Journal, vol.
45, no. 1, pp. 9-21, 2012.
26. Castillo-Atoche, A., D. Torres-Roman and Y. Shkvarko, “Towards Real Time
Implementation of Reconstructive Signal Processing Algorithms using Systolic
Arrays Coprocessors,” Journal of Systems Architecture, vol. 56, no. 8,
pp. 327-339, 2010.
27. Catapult C synthesis, Mentor Graphics Corp.: http://www.mentor.com/.
28. Celoxica DK4–DK Design Suite User Guide, Celoxica Ltd., 2005.
29. Chandrakasan, A. P., M. Potkonjak, J. Rabaey and R. W. Brodersen, “Hyper-LP:
A System for Power Minimization using Architectural Transformations,” in
IEEE/ACM International Conference on Computer-Aided Design, (ICCAD-92),
Digest of Technical Papers, pp. 300-303, November 1992.
30. Chao, D. Y. and D. T. Wang, “Iteration bounds of single-rate data flow graphs for
concurrent processing,” IEEE Transactions on Circuits and Systems I: Fundamental
Theory and Applications, vol. 40, pp. 629–634, 1993.
31. Chao, L. F. and E. H.-M. Sha, “Scheduling data-flow graphs via retiming and
unfolding,” IEEE Transactions on Parallel and Distributed Systems, vol. 8,
pp. 1259-1267, 1997.
32. Chavet, C., C. Andriamisaina, P. Coussy, E. Casseau, E. Juin, P. Urard and E.
Martin, “A design flow dedicated to multimode architectures for DSP
applications,” Proceedings of IEEE/ACM ICCAD, pp. 604-611, 2007.
33. Chen, J. and C. H. Chang, “High-Level Synthesis Algorithm for the Design of
Reconfigurable Constant Multiplier,” IEEE Transactions on CAD of Integrated
Circuits and Systems, vol. 28, no. 12, pp. 1844-1856, 2009.
34. Clauss, Ph., C. Mongenet and G. R. Perrin, “Synthesis of Size-Optimal Toroïdal
Arrays Space Optimal for the Algebraic Path Problem,” Journal of Parallel
Computing, vol. 18, no. 2, pp. 185–194, 1992.
35. Compton, K. and S. Hauck, “Reconfigurable computing: A Survey of Systems and
Software,” ACM Computing Surveys, vol. 34, no. 2, pp. 171-210, 2002.
36. Benedetti, A. and P. Perona, “Real-Time 2-D Feature Detection on a
Reconfigurable Computer,” IEEE Proceedings of Conference Computer
Vision and Pattern Recognition, pp. 586-593, 1998.
37. Cong, J. and J. Xu, “Simultaneous FU and Register Binding Based on Network
Flow Method,” Proceedings of Design Automation and Test, pp. 1057–1062,
2008.
38. Cong, J., Y. Fan, G. Han and Z. Zhang, “Application Specific Instruction
Generation for Configurable Processor Architectures,” in Proceedings of the
ACM/SIGDA 12th International Symposium on Field Programmable Gate Arrays,
(FPGA’04), pp. 183-189, 2004.
39. Cormen, T. H., C. E. Leiserson, R. L. Rivest and C. Stein, “Single-Source
Shortest Path,” Introduction to Algorithms, 2nd ed. Cambridge, MA: MIT Press,
pp. 580–619, 2001.
40. Coussy, P. and A. Morawiec (eds.), “High-Level Synthesis: From Algorithm to
Digital Circuit,” Springer, 2008.
41. Coussy, P., C. Chavet, P. Bomel, D. Heller, E. Senn and E. Martin, “GAUT: A
High-Level Synthesis Tool for DSP Applications,” Springer, 2008.
42. Coussy, P., and A. Takach, “Raising the Abstraction Level of Hardware Design,”
IEEE Design and Test of Computers, vol. 26, no. 4, pp. 4-6, 2009.
43. Darte, A. and C. Quinson, “Scheduling Register-Allocated Codes in
User-Guided High-Level Synthesis,” Proceedings of IEEE International Conference on
Application-Specific Systems, Architectures and Processors, pp. 140-147, 2007.
44. DeHon, A. and J. Wawrzynek, “Reconfigurable Computing: What, Why and
Implications for Design Automation,” Proceedings of 36th Design Automation
Conference, pp. 610-615, 1999.
45. DeHon, A., “The Density Advantage of Reconfigurable Computing,” IEEE
Computer, vol. 33, no. 4, pp. 41-49, 2000.
46. Densmore, D., A. S. Vincentelli and R. Passerone, “A Platform Based
Taxonomy for ESL design,” IEEE Design and Test of Computers, vol. 23, no. 5,
pp. 359–374, 2006.
47. Densmore, D., A. Simalatsar, A. Davare, R. Passerone and A. S. Vincentelli,
“UMTS MPSoC Design Evaluation Using a System Level Design Framework
Design, A Platform Based Taxonomy for ESL Design,” IEEE Design and Test of
Computers, IEEE Circuits and Systems Society, vol. 23, no. 5, pp. 359-374, 2006.
48. Derrien, S. and S.V. Rajopadhye, “Loop Tiling for Reconfigurable Accelerators,”
Proceedings of the 11th International Conference on Field Programmable Logic
and Applications, (FPLA’01), pp. 398-408, 2001.
49. Dimitroulakos, G., M. D. Galanis and C. E. Goutis, “Design Space Exploration of
an Optimized Compiler Approach for a Generic Reconfigurable Array
Architecture,” Springer Journal of Supercomputing, vol. 40, pp. 127–157, 2007.
50. Djamegni, C. T., “Synthesis Of Space-Time Optimal Systolic Algorithms For The
Cholesky Factorization,” Discrete Mathematics and Theoretical Computer Science,
vol. 5, pp. 109–120, 2002.
51. Djamegni, C. T., “Complexity of Matrix Product on Modular Linear Systolic
Arrays for Algorithms with Affine Schedules,” Journal of Parallel and Distributed
Computing, vol. 66, no. 3, pp. 323-333, 2006.
52. Djamegni, C.T., “Contribution to the Synthesis of Optimal Algorithms for regular
arrays,” Thesis, Department of Computer Science, University of Yaoundé I,
Cameroon, 1997.
53. Djamegni, C.T., “Mapping rectangular mesh algorithms onto asymptotically
space-optimal arrays,” Journal of Parallel and Distributed Computers, vol. 64, no. 3,
pp. 345-359, 2004.
54. Draper, B. A., J. R. Beveridge, A. P. Willem Böhm, C. Ross and M. Chawathe,
“Accelerated Image Processing on FPGAs,” IEEE Transactions on Image
Processing, vol. 12, no. 12, pp. 1543-1551, 2003.
55. Dutt, N. and C. Ramchandran, “Benchmarks for the 1992 high level synthesis
workshop,” University of California, Irvine, Tech. Rep. 92-107, 1992.
56. Dutta, H., D. Kissler, F. Hannig, A. Kupriyanov, J. Teich and B. Pottier, “A
holistic approach for tightly coupled reconfigurable parallel processors,” Journal of
Microprocessors and Microsystems, vol. 33, no. 1, pp. 53-62, 2009.
57. Dutta, H., F. Hannig, H. Ruckdeschel and J. Teich, “Efficient Control Generation
for Mapping Nested Loop Programs onto Processor Arrays,” Journal of Systems
Architecture, vol. 53, no. 5-6, pp. 300-309, 2007.
58. Ebeling, C., D. Cronquist, P. Franklin and C. Fisher, “RaPiD - A Configurable
Computing Architecture for Compute Intensive Applications,” Technical Report,
TR-96-11-03, 1996.
59. Ebeling, C., D. C. Cronquist, P. Franklin, J. Secosky and S. G. Berg, “Mapping
Applications to the RaPiD Configurable Architecture,” Proceedings of 5th Annual
IEEE Symposium on FPGA for Custom Computing Machines, pp. 106-115,
1997.
60. Fan, K., M. Kudlur, H. Park, and S. Mahlke, “Increasing Hardware Efficiency
with Multifunction Loop Accelerators,” Proceedings of International Conference
on Hardware/Software Codesign and System Synthesis (CODES+ISSS), pp.
276-281, 2006.
61. Gajski, D. D., J. Zhu, R. Dömer, A. Gerstlauer and S. Zhao, “SpecC:
Specification Language and Methodology,” Kluwer Academic Publishers,
2000.
62. Gajski, D. D., N. D. Dutt, A. C. H. Wu and S. Y. L. Lin, (Editors), “Architectural
models in synthesis - High-Level Synthesis: Introduction to Chip and System
Design,” Kluwer Publications, pp. 27-61, 1992.
63. Gajski, D. D., N. D. Dutt, A. C. H. Wu and S. Y. L. Lin, “High Level
Synthesis: Introduction to Chip and System Design,” Kluwer Academic Press,
1992.
64. Galanis, M., G. Theodoridis, S. Tragoudas and C. Goutis, “A High Performance
Data-Path for Synthesizing DSP Kernels,” IEEE Transactions on Computer-Aided
Design of Integrated Circuits and Systems, vol. 25, no. 6, pp. 1154-1163, 2006.
65. Ganapathy, K.N. and B.W. Wah, “Optimal synthesis of algorithm-specific lower
dimensional processor arrays,” IEEE Transactions on Parallel and Distributed
Systems, vol. 7, no. 3, pp. 274-287, 1996.
66. Gokhale, M., “Stream Oriented FPGA Computing in Streams-C,” Proceedings of
IEEE Symposium on Field-Programmable Custom Computing Machines, 2000.
67. Goldstein, S. C., H. Schmit, M. Budiu, S. Cadambi, M. Moe and R. Taylor,
“PipeRench: A Reconfigurable Architecture and Compiler,” IEEE Computer, vol.
33, no. 4, pp. 70-77, 2000.
68. Goldstein, S. C., H. Schmit, M. Moe, M. Budiu, S. Cadambi, R. R. Taylor, and
R. Laufer, “PipeRench: A Coprocessor for Streaming Multimedia Acceleration,”
Proceedings of International Symposium on Computer Architecture, 1999.
69. Govindaraju, N. K. and D. Manocha, “Cache-Efficient Numerical Algorithms
using Graphics Hardware,” Elsevier Journal of Parallel Computing, vol. 33,
no. 10-11, pp. 663-684, 2007.
70. Gupta, S., N. Dutt, R. Gupta and A. Nicolau, “SPARK: A High-Level Synthesis
Framework for Applying Parallelizing Compiler Transformations,” Proceedings of
International Conference on Very-Large-Scale Integration Design, pp. 461-466,
2003.
71. Gupta, S., N. Dutt, R. Gupta and A. Nicolau, “Spark: a High-Level Synthesis
Framework for Applying Parallelizing Compiler Transformations,” VLSID’03 –
The Proceedings of 16th International Conference on VLSI Design, pp. 461–466,
2003.
72. Hannig, F., H. Dutta and J. Teich, “PARO-A Design Tool for the Automatic
Generation of Hardware Accelerators,” Proceedings of 20th IEEE International
Conference on Application-Specific Systems, Architectures and Processors
(ASAP), 2009.
73. Hartenstein, R., “Coarse Grain Reconfigurable Architectures,” Proceedings of 6th
Asia South Pacific Design Automation Conference, pp. 564-570, 2001.
74. Hartenstein, R., M. Herz, T. Hoffmann and U. Nageldinger, “Using the Kress
Array for Reconfigurable Computing,” Proceedings of SPIE, pp. 150-161, 1998.
75. Hartenstein, R.W., J. Becker, R. Kress, H. Reinig and K. Schmidt, “A
Reconfigurable Machine for Applications in Image and Video Compression,”
Proceedings of Conference of Compression Technologies and Standards for Image
and Video Compression, 1995.
76. Heysters, P. and G. Smit, “Mapping of DSP Algorithms on the MONTIUM
Architecture,” Proceedings of International Parallel and Distributed Processing
Symposium, pp. 180-185, 2003.
77. Hourani, R., R. Jenkal, W. R. Davis and A. Winser, “Automated Design Space
Exploration for DSP Applications,” Springer Journal of Signal Processing
Systems, vol. 56, no. 2-3, pp. 199–216, 2009.
78. Wikipedia, “Kolmogorov complexity,” 2007; http://en.wikipedia.org/wiki/search:
Kolmogorov complexity; last accessed 2007.
79. Huang, C., Y. Chen, Y. Lin, and Y. Hsu, “Data Path Allocation Based on
Bipartite Weighted Matching,” Proceedings of Design Automation Conference,
pp. 499-504, 1990.
80. Huang, Z., S. Malik, N. Moreano, and G. Araujo, “The design of dynamically
reconfigurable datapath coprocessors,” ACM Transactions on Embedded
Computing Systems, vol. 3, no. 2, pp. 361–384, 2004.
81. Hwang, C. T., J. H. Lee and Y. C. Hsu, “A Formal Approach to the Scheduling
Problem in High Level Synthesis,” IEEE Transactions on Computer Aided Design,
vol. 10, no. 4, pp. 464-475, 1991.
82. Impulse Co-Developer, Impulse Accelerated Technologies:
http://www.impulsec.com/products.html.
83. Jóźwiak, L. and A. Douglas, “Hardware Synthesis for Reconfigurable
Heterogeneous Pipelined Accelerators,” The Proceedings of 5th International
Conference on Information Technology: New Generations, pp. 1123-1130, 2008.
84. Jain, R., K. Somalwar, J. Werth and J.C. Browne, “Heuristics for Scheduling I/O
Operations,” IEEE Transactions on Parallel and Distributed Systems, vol. 8,
no. 3, pp. 310-320, 2002.
85. Jóźwiak, L., N. Nedjah and M. Figueroa, “Modern Development Methods and
Tools for Embedded Reconfigurable Systems: A Survey,” Integration, the VLSI
Journal, vol. 43, no. 1, pp. 1-33, 2010.
86. Karfa, C., D. Sarkar and C. Mandal, “Verification of Data-path and Controller
Generation Phase in High-Level Synthesis of Digital Circuits,” IEEE Transactions
on Computer Aided Design of Integrated Circuits and Systems, vol. 29, no. 3,
pp. 479-492, 2010.
87. Karp, R.M., R.E. Miller, S. Winograd, “The Organization of Computations for
Uniform Recurrence Equations,” Journal of ACM, vol. 14, no. 3, pp. 563-590,
1967.
88. Karuri, K., A. Chattopadhyay, C. Xiaolin, D. Kammler, L. Hao, R. Leupers, G.
Ascheid and H. Meyr, “A Design Flow for Architecture Exploration and
Implementation of Partially Reconfigurable Processors,” IEEE Transactions on
VLSI systems, vol. 16, no. 10, pp. 1281-1294, 2008.
89. Karypis, G., R. Aggarwal, V. Kumar and S. Shekhar, “Multilevel Hyper-graph
Partitioning: Applications in VLSI Domain,” IEEE Transactions on Very Large
Scale Integration (VLSI) Systems, vol. 7, no. 1, pp. 69-79, 1999.
90. Keutzer, K., A.R. Newton, J.M. Rabaey and A.S. Vincentelli, “System-level
design: orthogonalization of concerns and platform-based design,” IEEE
Transactions on CAD, vol. 19, no. 12, pp. 1523-1543, 2000.
91. Khalili, A.J. Al, “Synthesis of Systolic Arrays from Single Assignment
Algorithm,” IEEE Transactions on Signal Processing, vol. 1, pp. 3-11, 1995.
92. Kim, T. and X. Liu, “A Functional Unit and Register Binding Algorithm for
Interconnect Reduction,” IEEE Transactions on Computer Aided Design of
Integrated Circuits and Systems, vol. 29, no. 4, pp. 641-646, 2010.
93. Kittitornkun, S. and Y. H. Hu, “Mapping Deep Nested Do-Loop DSP Algorithms
to Large Scale FPGA Array Structures,” IEEE Transactions on Very Large Scale
Integration (VLSI) Systems, vol. 11, no. 2, pp. 208-217, 2003.
94. Komarek, T. and P. Pirsch, “Array Architectures for Block Matching
Algorithms,” IEEE Transactions on Circuits and Systems, vol. 36, no. 10,
pp. 1301-1308, 1989.
95. Kornaros, G., “A Soft Multi-Core Architecture for Edge Detection and Data
Analysis of Microarray Images,” Journal of Systems Architecture, vol. 56, no. 1,
pp. 48-62, 2010.
96. Kumar, V. V. and J. Lach, “Highly Flexible Multi-mode System Synthesis,”
Proceedings of International Conference on Hardware/Software Codesign
and System Synthesis (CODES+ISSS), pp. 27–32, 2005.
97. Kung, H.T. and C. E. Leiserson, “Systolic arrays for VLSI,” Proceedings of Sparse
Matrix, SIAM, pp. 256-282, 1979.
98. Kung, S. Y., “VLSI Array Processors,” IEEE ASSP Magazine, vol. 2, no. 3,
pp. 4-22, 1985.
99. Lamport, L., “The parallel Execution of Do loops,” Communications of the
ACM, vol. 17, no. 2, pp. 83-93, 1974.
100. Lee, C., S. Kim and S. Ha, “A Systematic Design Space Exploration of MPSoC
Based on Synchronous Data Flow Specification,” Journal of Signal Processing
Systems, Springer Science Journal, vol. 58, no. 2, pp. 193-213, 2010.
101. Lee, P. Z., and Z. M. Kedem, “Synthesizing Linear Array Algorithms from
Nested for Loop Algorithms,” IEEE Transactions on Computers, vol. 37, no. 12,
pp. 1578–1598, 1988.
102. Liang, X. and J. Jean, “Mapping of Generalized Template Matching onto
Reconfigurable Computers,” IEEE Transactions on Very Large Scale Integration
(VLSI) Systems, vol. 11, no. 3, pp. 485-498, 2003.
103. Louka, B. and M. Tchuenté, “Triangular Matrix Inversion on Systolic Arrays,”
Journal of Parallel Computers, vol. 14, no. 2, pp. 223-228, 1990.
104. Louka, B. and M. Tchuenté, “Triangular matrix inversion on systolic arrays,”
Journal of Parallel Computers, vol. 14, no. 2, pp. 223-228, 1990.
105. Maheshwari, N. and S. Sapatnekar, “Efficient Retiming of Large Circuits,” IEEE
Transactions on Very Large Scale Integration (VLSI) Systems, vol. 6, no. 1,
pp. 74-83, 1998.
106. Maheshwari, N. and S. Sapatnekar, “Efficient Retiming of Large Circuits,” IEEE
Transactions on Very Large Scale Integration (VLSI) Systems, vol. 6, no. 1,
pp. 74-83, 1998.
107. Marshall, A., T. Stansfield, I. Kostarnov, J. Vuillemin and B. Hutchings, “A
Reconfigurable Arithmetic Array for Multimedia Applications,” Proceedings of 7th
ACM/SIGDA International Symposium on Field Programmable Gate Arrays,
pp. 135-143, 1999.
108. Areno, M., B. Eames and J. Templin, “A Force Directed
Scheduling based Architecture Generation Algorithm and Design Tool for
FPGAs,” Journal of Systems Architecture, vol. 56, no. 2-3, pp. 124-135, 2010.
109. McFarland, M. C., A. C. Parker and R. Camposano, “Tutorial on High-Level
Synthesis,” Proceedings of the 25th ACM/IEEE Design Automation
Conference, pp. 330-336, 1988.
110. Meher, P. K., S. Chandrasekhar and A. Amira, “FPGA Realization of FIR
Filters by Efficient and Flexible Systolization using Distributed Arithmetic,”
IEEE Transactions on Signal Processing, vol. 56, no. 7, pp. 3009-3017, 2008.
111. Metropolis, Gigascale Systems Research Center: www.gigascale.org/metropolis/.
112. Micheli, G. D., “Resource Sharing and Binding,” Synthesis and Optimization of
Digital Circuits, pp. 229–266, McGraw-Hill, 1994.
113. Micheli, G. D., “Synthesis and Optimization of Digital Circuits,” McGraw-Hill
Higher Education, 1994.
114. Milovanovic, E.I., T.R. Nikolic, M.K. Stojčev and I.Z. Milovanovic,
“Multi-Functional Systolic Array with Reconfigurable Micro-Power Processing
Elements,” Microelectronics Reliability, vol. 49, no. 7, pp. 813-820, 2009.
115. Milovanovic, I.Z., E.I. Milovanovic, M.K. Stojčev and M.P. Bekakos,
“Orthogonal Fault-Tolerant Systolic Arrays for Matrix Multiplication,”
Microelectronics Reliability, vol. 51, no. 3, pp. 711-725, 2011.
116. Moldovan, D.I., “On the Analysis and Synthesis of VLSI Algorithms,” IEEE
Transactions on Computers, vol. 31, no. 11, pp. 1121-1126, 1982.
117. Molina, M.C., R. R. Sautua, P. G. Repetto and J.M. Mendías, “Performance
Driven Scheduling of Behavioural Specifications,” Integration, the VLSI Journal,
vol. 42, pp. 294-303, 2009.
118. Moreano, N., E. Borin, C. de Souza and G. Araujo, “Efficient Datapath Merging
for Partially Reconfigurable Architectures,” IEEE Transactions on Computer
Aided Design of Integrated Circuits and Systems, vol. 24, no. 7, pp. 969–980,
2005.
119. Myjak, M. J. and D. Frias, “A Medium-Grain Reconfigurable Architecture for
DSP: VLSI Design, Benchmark Mapping and Performance,” IEEE Transactions on
Very Large Scale Integration (VLSI) Systems, vol. 16, no. 1, pp. 14-23, 2008.
120. Ong, S. A., “System-Level Design Decision-Making for Real-Time Embedded
Systems,” Ph.D. Dissertation, Faculty of Electrical Engineering, Eindhoven
University of Technology, pp. 1-221, 2004.
121. Ouni, B., R. Ayadi and A. Mtibaa, “Temporal Partitioning of Data Flow Graph
for Dynamically Reconfigurable Architecture,” Journal of Systems Architecture,
vol. 57, no. 8, pp. 790-798, 2011.
122. Ouni, B., R. Ayadi and A. Mtibaa, “Temporal Partitioning of Data Flow Graph
for Dynamically Reconfigurable Architecture,” Journal of Systems
Architecture, vol. 57, no. 8, pp. 790-798, 2011.
123. PACT Informationstechnologie GmbH, “The XPP White Paper,” vol. 2, no. 1,
2002.
124. Palkovic, M., F. C. Imec and H. Corporaal, “Trade-offs in Loop
Transformations,” ACM Transactions on Design Automation of Electronic
Systems, vol. 14, no. 2, Article 22, pp. 22-1-230, 2009.
125. Panda, P. and N. Dutt, “1995 High Level Synthesis Design Repository,”
Proceedings of International Symposium of System and Synthesis, pp. 170-174,
1995.
126. Parhi, K. K. and D. G. Messerschmitt, “Static rate-optimal scheduling of
iterative data-flow programs via optimum unfolding,” IEEE Transactions on
Computers, vol. 40, pp. 178–195, 1991.
127. Parhi, K. K., “VLSI Signal Processing,” John Wiley, 1991.
128. Passos, N. L. and E. H. Sha, “Achieving Full Parallelism Using
Multidimensional Retiming,” IEEE Transactions on Parallel and Distributed
Systems, vol. 7, no. 11, pp. 1150-1163, 1996.
129. Passos, N. L. and E. H. Sha, “Achieving Full Parallelism Using Multidimensional
Retiming,” IEEE Transactions on Parallel and Distributed Systems, vol. 7, no. 11,
1996.
130. Passos, N. L., E. H. Sha and S. C. Bass, “Optimizing DSP Flow Graphs via
Schedule-Based Multidimensional Retiming,” IEEE Transactions on Signal
Processing, vol. 44, no. 1, 1996.
131. Paulin, P. G. and J. P. Knight, “Force Directed Scheduling for Behavioral
Synthesis of ASICs,” IEEE Transactions on Computer-Aided Design of Integrated
Circuits and Systems, vol. 8, no. 6, pp. 661-679, 1989.
132. Pavlatos, C., A. C. Dimopoulos, A. Koulouris, T. Andronikos, I. Panagopoulos
and G. Papakonstantinou, “Efficient Reconfigurable Embedded Parsers,” Journal
of Computer Languages, Systems and Structures, vol. 35, no. 2, pp. 196-215,
2009.
133. Peng, D. and M. Lu, “On Exploring Inter Iteration Parallelism Within
Rate-Balanced Multi-rate Multidimensional DSP Algorithms,” IEEE Transactions on
Very Large Scale Integration (VLSI) Systems, vol. 13, no. 1, pp. 106-125, 2005.
134. Philippe, C. and M. Adam (Editors), “High-level synthesis from algorithm to
digital circuit,” Springer, 2008.
135. Philippidis, C. J. and W. Shang, “On Minimizing Register Usage of
Linearly Scheduled Algorithms with Uniform Dependencies,” Computer
Languages, Systems and Structures, vol. 36, no. 3, pp. 250-267, 2010.
136. Philippidis, C. J. and W. Shang, “On Minimizing Register Usage of Linearly
Scheduled Algorithms with Uniform Dependencies,” Journal of Computer
Languages, Systems and Structures, vol. 36, no. 3, pp. 250-267, 2010.
137. Platform Architect, CoWare: http://www.coware.com/products/.
138. Plunkett, B. B. and J. Watson, “Adapt 2400 ACM architecture overview,”
QuickSilver Technology, online available:
http://www.qstech.com/pdfs/Adapt2000_Overview.pdf.
139. Prasanna, V. K. and C. T. Yu, “On Synthesizing Optimal Family of Linear
Systolic Arrays for Matrix Multiplication,” IEEE Transactions on Computers, vol.
40, no. 6, pp. 770-774, 1991.
140. Ptolemy, University of California at Berkeley: http://ptolemy.eecs.berkeley.edu/.
141. Quinton, P., C.T. Djamegni, S. Rajopadhye, T. Risset and M. Tchuente, “A
Reindexing Based Approach Towards Mapping Affine Schedules Onto Parallel
Embedded Systems,” Journal of Parallel and Distributed Computing, vol. 69,
pp. 1-11, 2009.
142. Quinton, P. and V.V. Dongen, “The Mapping of Linear Equations on Regular
Arrays,” Journal of VLSI Signal Processing, vol. 1, no. 2, pp. 95-113, 1989.
143. Quinton, P., S. Rajopadhye and T. Risset, “Extension of the ALPHA language to
Recurrences on Sparse Periodic Domains,” Proceedings of IEEE International
Conference on Application Specific Array Processors, pp. 391-401, 1996.
144. Rajopadhye, S. V., “Synthesizing Systolic Arrays with Control Signal from
Recurrence Equations,” Journal of Distributed Computers, vol. 3, pp. 88-105,
1989.
145. Rajopadhye, S.V., S. Purushothaman and R.M. Fujimoto, “Synthesizing Systolic
Arrays from Recurrences Equations with Linear Dependencies,” Proceedings of
Sixth Conference on Foundations of Software Technology and Theoretical
Computer Science, LNCS, Springer Verlag, vol. 241, pp. 488-503, 1986.
146. Ramesh, T. and J. Meier, “A Multi-FPGA High Performance Computing Platform
for Network Centric Applications,” Web Proceedings of International High
Performance Computing Conference, 2006.
147. Rao, S. and T. Kailath, “Regular Iterative Algorithms and their Implementation
on Processors Arrays,” Proceedings of IEEE, vol. 76, no. 4, pp. 259-269, 1988.
148. Richardson, I. E. G., “H.264 and MPEG-4 Video compression and video coding
for next-generation multimedia,” John Wiley publications, 2003.
149. Ruiz, G. A. and J. A. Michell, “An Efficient VLSI Architecture of Fractional
Motion Estimation in H.264 for HDTV,” Journal of Signal Processing Systems,
Springer Science, vol. 62, no. 3, pp. 443-457, 2010.
150. Palnitkar, S., “Verilog HDL, A Guide to Digital Design and Synthesis,”
Second Edition, Prentice Hall, 2003.
151. Say, F. and C. F. Bazlamaçcı, “A Reconfigurable Computing Platform for Real
Time Embedded Applications,” Journal of Microprocessors and Microsystems, vol.
36, no. 1, pp. 13-32, 2012.
152. Schreiber, R., S. Aditya, B. Ramakrishna Rau, V. Kathail, S. Mahlke, S.
Abraham and G. Snider, “High-level synthesis of non-programmable hardware
accelerators,” Proceedings of ASAP’00 – the IEEE International Conference on
Application Specific Systems, Architectures and Processors, pp. 113-124, 2000.
153. Sengupta, A., R. Sedaghat and Z. Zeng, “A High Level Synthesis Design Flow
with a Novel Approach for Efficient DSE in Case of Multi-Parametric
Optimization Objective,” Journal of Microelectronics Reliability, vol. 50, pp.
424-437, 2010.
154. Shang, L., R.P. Dick and N.K. Jha, “SLOPES: Hardware–Software Co-Synthesis
of Low-Power Real-Time Distributed Embedded Systems with Dynamically
Reconfigurable FPGAs,” IEEE Transactions on CAD, vol. 26, no. 3, pp. 508–526,
2007.
155. Shang, W. and J.A.B. Fortes, “On Mapping of Uniform Dependence Algorithms
into Lower Dimensional Processors Arrays,” IEEE Transactions on Parallel and
Distributed Systems, vol. 3, pp. 350-363, 1992.
156. Shang, W. and J.A.B. Fortes, “Time Optimal Linear Schedules for Algorithms
with Uniform Dependencies,” IEEE Transactions on Computers, vol. 40, pp.
723-742, 1991.
157. Simulink HDL Coder, MathWorks Inc.: http://www.mathworks.com/products/hdlcoder/.
158. SpecC, University of California, Irvine: http://www.cecs.uci.edu/~specc/
159. Srinivasan, V., S. Govindarajan and R. Vemuri, “Fine-Grained and
Coarse-Grained Behavioral Partitioning with Effective Utilization of Memory and
Design Space Exploration for Multi-FPGA Architectures,” IEEE Transactions on
VLSI Systems, vol. 9, no. 1, pp. 140-158, 2001.
160. Srivastava, M. B. and M. Potkonjak, “Optimum and Heuristic Transformation
Techniques for Simultaneous Optimization of Latency and Throughput,” IEEE
Transactions on Very Large Scale Integration Systems, vol. 3, no. 1, pp. 2-19,
1995.
161. Stojcev, M.K., I.Z. Milovanovic, E.I. Milovanovic and T.R. Nikolic, “Address
Generators for Linear Systolic Array,” Journal of Microelectronics Reliability,
vol. 50, no. 2, pp. 292-303, 2010.
162. Stone, A. and E. S. Manalokos, “DG2VHDL: To Facilitate the High Level
Synthesis of Parallel Processing Array Architectures,” Journal of VLSI Signal
Processing, vol. 24, no. 1, pp. 99-120, 2000.
163. Sun, F., S. Ravi, A. Raghunathan and N. K. Jha, “Synthesis of
Application-Specific Heterogeneous Multiprocessor Architectures using Extensible
Processors,” Proceedings of 18th International Conference on VLSI Design,
(VLSID ’05), pp. 551-556, 2005.
164. Sun, F., S. Ravi, A. Raghunathan and N.K. Jha, “A Scalable Application-Specific
Processor Synthesis Methodology,” Proceedings of International Conference on
Computer-Aided Design, (IEEE/ACM ICCAD’03), pp. 283-290, 2003.
165. Sun, F., S. Ravi, A. Raghunathan and N.K. Jha, “Custom Instruction Synthesis for
Extensible Processor Platforms,” IEEE Transactions on Computer Aided Design,
vol. 23, no. 2, pp. 216-228, 2004.
166. Sun, F., S. Ravi, A. Raghunathan and N.K. Jha, “Synthesis of Custom Processors
Based on Extensible Platforms,” Proceedings of International Conference on
Computer-Aided Design, (IEEE/ACM ICCAD’02), pp. 641-648, 2002.
167. Synopsys, Describing Synthesizable RTL in SystemC. http://www.synopsys.com,
2002.
168. Synplify DSP, Synplicity: http://www.synplicity.com/products/dsp_solutions.html.
169. SystemC: SystemC version 2.0 user’s guide. http://www.systemc.org, 2002.
170. Taylor, M. B., W. Lee, S. Amarasinghe and A. Agarwal, “Scalar Operand
Networks: On-chip Interconnect for ILP in Partitioned Architectures,” Proceedings
of International Symposium on High Performance Computer Architecture, 2003.
171. Teich, J. and L. Thiele, “Partitioning of Processor Arrays: a Piecewise Regular
Approach,” Integration, The VLSI Journal, vol. 14, no. 3, pp. 297-332, 1993.
172. Tensilica, The Xtensa 7 Processor for SOC Design, Accessed on 2007/4/7:
http://www.tensilica.com/.
173. Tessier, R. and W. Burleson, Y. Hu, Ed., “Reconfigurable Computing for Digital
Signal Processing: a Survey,” Proceedings of Programmable Digital Signal
Processors, 2001.
174. The International Technology Roadmap for Semiconductors, 2002 Update.
175. Todman, T. J., G. A. Constantinides, S. J. E. Wilton, O. Mencer, W. Luk and
P.Y.K. Cheung, “Reconfigurable Computing Architectures and Design Methods,”
IEEE Proceedings of Computer and Digital Techniques, vol. 152, no. 2,
pp. 193-207, 2005.
176. “Trimaran - An Infrastructure for Research in Backend Compilation and
Architecture Exploration,” http://www.trimaran.org/S.
177. Vos, L. D. and M. Stegherr, “Parameterizable VLSI architectures for the Full
Search Block Matching Algorithm,” IEEE Transactions on Circuits and Systems, vol.
36, no. 10, pp. 1309-1316, 1999.
178. Weinhardt, M. and W. Luk, “Pipeline Vectorization,” IEEE Transactions on
Computer Aided Design, vol. 20, no. 2, pp. 234-248, 2000.
179. West, D. B., “Introduction to Graph Theory,” Pearson Education, 2005.
180. Wong, Y. and J.M. Delosme, “Transformation of broadcasts into propagations in
systolic algorithms,” Journal of Parallel and Distributed Computing, vol. 14, no. 2,
pp. 121-145, 1992.
181. Woodfill, J. and B. V. Herzen, “Real-Time Stereo Vision on the ‘PARTS’
Reconfigurable Computer,” IEEE Proceedings of the Symposium of Field
Programmable Custom Computing Machines, pp. 201-210, 1997.
182. www.Xilinx.com; last accessed 2007.
183. www-labsticc.univ-ubs.fr/GAUT; GAUT website 2012.
184. Xilinx, “Virtex-4 family overview,” Literature Number DS112, vol. 15, 2007.
185. Xilinx, “Xilinx Virtex-II Pro FPGAs,”
186. Xuejie, Z. and W. N. Kam, “A review of high-level synthesis for dynamically
reconfigurable FPGAs,” Journal of Microprocessors and Microsystems, vol. 24,
no. 4, pp. 199-211, 2000.
187. Xydis, S., G. Economakos, D. Soudris and K. Pekmestzi, “High Performance
and Area Efficient Flexible DSP Data-path Synthesis,” IEEE Transactions on
VLSI Systems, vol. 19, no. 3, pp. 429-442, 2011.
188. Yeo, H. and Y. H. Hu, “A Novel Modular Systolic Array Architecture for Full-
Search Block Matching Motion Estimation,” IEEE Transactions on Circuits and
Systems for Video Technology, vol. 5, no. 5, pp. 407–416, 1995.
189. Yong-Kyu Jung, “Hardware/Software Co-reconfigurable Instruction Decoder for
Adaptive Multi-core DSP Architectures,” Journal of Signal Processing Systems,
vol. 62, pp. 273-285, 2011, DOI 10.1007/s11265-010-0461-1.
190. Zhang, H., V. Prabhu, V. George, M. Wan, M. Benes, A. Abnous and J.M.
Rabaey, “A 1-V heterogeneous reconfigurable DSP IC for Wireless baseband
Digital Signal processing,” IEEE Journal of Solid-State Circuits, vol. 35, no. 11,
pp. 1697-1704, 2000.
191. Zhang, J., Z. Zhang, S. Zhou, M. Tan, X. Liu, X. Cheng and J. Cong, “Bit-Level
Optimization for High-Level Synthesis and FPGA-Based Acceleration,” FPGA '10,
2010.
192. Zhang, X. and K. K. Parhi, “High Speed VLSI Architectures for the AES
Algorithm,” IEEE Transactions on VLSI, vol. 12, no. 9, pp. 957-967, 2004.
193. Zhu, J. and D.D. Gajski, “Soft scheduling in High Level Synthesis,” Proceedings
of the Design Automation Conference, 1999.
194. Zhuo, L. and V. K. Prasanna, “High Performance Designs for Linear Algebra
Operations on Reconfigurable Hardware,” IEEE Transactions on Computers, vol.
57, no. 8, pp. 1057-1071, 2008.
PUBLICATIONS BASED ON THE RESEARCH WORK
International Journal Publications
1. B. Bala Tripura Sundari, “Design Space Exploration of Deeply Nested Loop 2-D
Filtering and 6-level FSBM Algorithms Mapped onto Systolic Array”, The VLSI
Design Journal, Hindawi Publishing, vol. 2012, Article ID 268402, 15 pages,
doi:10.1155/2012/268402.
2. B. Bala Tripura Sundari, T R Padmanabhan, “A Direct Method for Optimal VLSI
Realization of Deeply Nested n-D loop Problems”, Journal of Microprocessors and
Microsystems, Embedded Hardware Design, Affiliated with Euromicro, Copyright ©
Elsevier B.V; (Accepted). DOI: 10.1016/j.micpro.2013.04.003.
International Conference Publications
3. B. Bala Tripura Sundari, “Design Space Exploration of Deeply Nested Loop Motion
Estimation Algorithm Mapped onto Systolic Array”, Proceedings of International
Conference on Communication and Computational Intelligence 2010,
ISBN-978-81-8371-369-6, pp. 46-52, 2010.
4. B. Bala Tripura Sundari, “Dependence Vectors and Fast Search of Systolic Mapping
for Computationally Intensive Image Processing Algorithms”, in the Proceedings of
International Multi-Conference for Engineers and Computer Scientists (IMECS) 2011,
ISSN: 2078-0958; ISBN: 978-988-18210-3-4, pp. 555-565, 2011.