DOI: 10.1145/354880.354886

A preprocessing step for global loop transformations for data transfer optimization

Published: 01 November 2000

Published In

CASES '00: Proceedings of the 2000 international conference on Compilers, architecture, and synthesis for embedded systems
November 2000
200 pages
ISBN: 1581133383
DOI: 10.1145/354880
Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Article

Acceptance Rates

Overall Acceptance Rate 52 of 230 submissions, 23%

Cited By

  • (2019) Context-based image acquisition from memory in digital systems. Journal of Real-Time Image Processing 16:4 (1057-1076). DOI: 10.1007/s11554-016-0591-1. Online publication date: 1-Aug-2019
  • (2009) Trade-offs in loop transformations. ACM Transactions on Design Automation of Electronic Systems 14:2 (1-30). DOI: 10.1145/1497561.1497565. Online publication date: 7-Apr-2009
  • (2007) Bit-Width Constrained Memory Hierarchy Optimization for Real-Time Video Systems. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 26:4 (781-800). DOI: 10.1109/TCAD.2006.884569. Online publication date: 1-Apr-2007
  • (2006) Data dependency size estimation for use in memory optimization. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 22:7 (908-921). DOI: 10.1109/TCAD.2003.814257. Online publication date: 1-Nov-2006
  • (2006) Polyhedral space generation and memory estimation from interface and memory models of real-time video systems. Journal of Systems and Software 79:2 (231-245). DOI: 10.1016/j.jss.2005.04.034. Online publication date: 1-Feb-2006
  • (2005) Low power engineering. Embedded Systems Design (450-478). DOI: 10.5555/2137690.2137724. Online publication date: 1-Jan-2005
  • (2005) Low Power Engineering. Embedded Systems Design (450-478). DOI: 10.1007/978-3-540-31973-3_30. Online publication date: 2005
  • (2001) Detection of partially simultaneously alive signals in storage requirement estimation for data intensive applications. Proceedings of the 38th annual Design Automation Conference (365-370). DOI: 10.1145/378239.378525. Online publication date: 22-Jun-2001
  • (2001) Code Transformations for Data Transfer and Storage Exploration Preprocessing in Multimedia Processors. IEEE Design & Test 18:3 (70-82). DOI: 10.1109/54.922804. Online publication date: 1-May-2001
