research-article
Open access

Automatic Storage Optimization for Arrays

Published: 08 April 2016

Abstract

Efficient memory allocation is crucial for data-intensive applications, as a smaller memory footprint ensures better cache performance and allows one to run a larger problem size given a fixed amount of main memory. In this article, we describe a new automatic storage optimization technique to minimize the dimensionality and storage requirements of arrays used in sequences of loop nests with a predetermined schedule. We formulate the problem of intra-array storage optimization as one of finding the right storage partitioning hyperplanes: each storage partition corresponds to a single storage location. Our heuristic is driven by a dual-objective function that minimizes both the dimensionality of the mapping and the extents along those dimensions. The technique is dimension optimal for most codes encountered in practice. The storage requirements of the mappings obtained are also asymptotically better than those obtained by any existing schedule-dependent technique. Storage reduction factors and other results that we report from an implementation of our technique demonstrate its effectiveness on several real-world examples drawn from the domains of image processing, stencil computations, high-performance computing, and the class of tiled codes in general.
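
To make the idea of an intra-array storage mapping concrete, here is a minimal sketch under stated assumptions. It is not the article's heuristic: the modulo mapping (t, i) -> (t mod 2, i) is picked by hand for a 1-D Jacobi-style stencil, and the function names `jacobi_full` and `jacobi_contracted` are hypothetical. It only shows that, for a fixed schedule, many array elements can share one storage location without changing the result.

```python
def jacobi_full(init, steps):
    """Reference version: one row per time step, (steps + 1) * n cells."""
    n = len(init)
    a = [[0.0] * n for _ in range(steps + 1)]
    a[0] = list(init)
    for t in range(1, steps + 1):
        # Boundary cells are simply carried forward.
        a[t][0] = a[t - 1][0]
        a[t][n - 1] = a[t - 1][n - 1]
        for i in range(1, n - 1):
            a[t][i] = (a[t - 1][i - 1] + a[t - 1][i] + a[t - 1][i + 1]) / 3.0
    return a[steps], (steps + 1) * n


def jacobi_contracted(init, steps):
    """Same schedule, but element (t, i) lives at (t % 2, i): 2 * n cells.

    Assumption: the time loop runs sequentially, so row t - 2 is dead
    by the time row t is written and its storage can be reused.
    """
    n = len(init)
    b = [list(init), [0.0] * n]
    for t in range(1, steps + 1):
        cur, prev = t % 2, (t - 1) % 2
        b[cur][0] = b[prev][0]
        b[cur][n - 1] = b[prev][n - 1]
        for i in range(1, n - 1):
            b[cur][i] = (b[prev][i - 1] + b[prev][i] + b[prev][i + 1]) / 3.0
    return b[steps % 2], 2 * n


if __name__ == "__main__":
    init = [0.0, 3.0, 6.0, 9.0, 12.0]
    full, full_cells = jacobi_full(init, 4)
    small, small_cells = jacobi_contracted(init, 4)
    print(full == small, full_cells, small_cells)
```

Both versions perform the same arithmetic in the same order, so the final rows match exactly while the contracted version allocates 2 * n cells instead of (steps + 1) * n. The reuse is only valid for this particular schedule, which is why such mappings are called schedule-dependent.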


Cited By

  • (2022) Lightweight Array Contraction by Trace-Based Polyhedral Analysis. High Performance Computing. ISC High Performance 2022 International Workshops, 20-32. DOI: 10.1007/978-3-031-23220-6_2. Online publication date: 29-May-2022
  • (2019) Polyhedral Compilation for Multi-dimensional Stream Processing. ACM Transactions on Architecture and Code Optimization 16:3, 1-26. DOI: 10.1145/3330999. Online publication date: 18-Jul-2019
  • (2018) DeLICM: scalar dependence removal at zero memory cost. Proceedings of the 2018 International Symposium on Code Generation and Optimization - CGO 2018, 241-253. DOI: 10.1145/3179541.3168815. Online publication date: 2018

Published In

ACM Transactions on Programming Languages and Systems  Volume 38, Issue 3
May 2016
209 pages
ISSN:0164-0925
EISSN:1558-4593
DOI:10.1145/2914585
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 April 2016
Accepted: 01 November 2015
Revised: 01 August 2015
Received: 01 February 2015
Published in TOPLAS Volume 38, Issue 3

Author Tags

  1. Compilers
  2. array contraction
  3. memory optimization
  4. polyhedral framework
  5. storage mapping optimization

Qualifiers

  • Research-article
  • Research
  • Refereed
