Towards metaprogramming for parallel systems on a chip

Authors: Lee Howes, Anton Lokhmotov, Alastair F. Donaldson, and Paul H. J. Kelly

Euro-Par'09: Proceedings of the 2009 international conference on Parallel processing

August 2009

Pages 36 - 45

Published: 25 August 2009

Abstract

We demonstrate that the performance of commodity parallel systems depends significantly on low-level details, such as storage layout and iteration space mapping. This motivates tools and techniques that separate a high-level algorithm description from low-level mapping and tuning. We propose to build a tool based on the concept of decoupled Access/Execute metadata, which allow the programmer to specify both the execution constraints and the memory access pattern of a computation kernel.
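To make the idea of separating the algorithm from its mapping concrete, here is a minimal sketch in Python. It is not the authors' tool or notation: `Layout`, `pack`, `run_kernel`, and `row_access` are hypothetical names chosen for illustration. The sketch assumes that "execute" metadata can be modelled as an iteration space and "access" metadata as a function from an iteration point to the logical elements it reads, so that the storage layout can change without touching the kernel.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence, Tuple

Coord = Tuple[int, int]


@dataclass(frozen=True)
class Layout:
    """Hypothetical storage layout: maps a logical (row, col) to a flat offset."""
    rows: int
    cols: int
    row_major: bool = True

    def offset(self, r: int, c: int) -> int:
        if self.row_major:
            return r * self.cols + c
        return c * self.rows + r


def pack(matrix: Sequence[Sequence[float]], layout: Layout) -> List[float]:
    """Store a logical 2D matrix in flat memory according to the layout."""
    flat = [0.0] * (layout.rows * layout.cols)
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            flat[layout.offset(r, c)] = value
    return flat


def run_kernel(
    kernel: Callable[[List[float]], float],
    iteration_space: Iterable[int],            # "execute" description (assumed form)
    access: Callable[[int], List[Coord]],      # "access" description (assumed form)
    flat: Sequence[float],
    layout: Layout,
) -> List[float]:
    """Apply the kernel at each iteration point, gathering operands via the layout."""
    results = []
    for point in iteration_space:
        operands = [flat[layout.offset(r, c)] for r, c in access(point)]
        results.append(kernel(operands))
    return results


matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]


def row_access(point: int) -> List[Coord]:
    """Iteration point i reads every element of logical row i."""
    return [(point, c) for c in range(3)]


row_major = Layout(2, 3, row_major=True)
col_major = Layout(2, 3, row_major=False)

# The same kernel (sum) and the same access description run on both layouts.
sums_rm = run_kernel(sum, range(2), row_access, pack(matrix, row_major), row_major)
sums_cm = run_kernel(sum, range(2), row_access, pack(matrix, col_major), col_major)
print(sums_rm, sums_cm)
```

In this sketch the kernel sees only logical operands. Swapping the layout or reordering the iteration space changes where and in what order memory is touched, but not the result for each iteration point, and that is the kind of freedom a mapping tool could exploit.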

Published In

Euro-Par'09: Proceedings of the 2009 international conference on Parallel processing

August 2009, 468 pages

ISBN: 3642141218

Editors:
Hai-Xiang Lin, Delft University of Technology, Delft, The Netherlands
Michael Alexander, Scaledinfra Technologies, Vienna, Austria
Martti Forsell, VTT, Oulu, Finland
Andreas Knüpfer, Technische Universität Dresden, Dresden, Germany
Radu Prodan, Technical University of Innsbruck, Innsbruck, Austria
Leonel Sousa, Instituto Superior Tecnico/INESC-ID, Lisbon, Portugal
Achim Streit, Julich Supercomputing Centre, Germany

Publisher: Springer-Verlag, Berlin, Heidelberg
