research-article

Language virtualization for heterogeneous parallel computing

Published: 17 October 2010

Abstract

As heterogeneous parallel systems become dominant, application developers are being forced to turn to an incompatible mix of low-level programming models (e.g. OpenMP, MPI, CUDA, OpenCL). However, these models do little to shield developers from the difficult problems of parallelization, data decomposition and machine-specific details. Most programmers are having a difficult time using these programming models effectively. To provide a programming model that addresses the productivity and performance requirements for the average programmer, we explore a domain-specific approach to heterogeneous parallel programming.
We propose language virtualization as a new principle that enables the construction of highly efficient parallel domain-specific languages that are embedded in a common host language. We define criteria for language virtualization and present techniques to achieve them. We present two concrete case studies of domain-specific languages that are implemented using our virtualization approach.

Information & Contributors

Information

Published In

ACM SIGPLAN Notices, Volume 45, Issue 10
OOPSLA '10
October 2010
957 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/1932682
  • OOPSLA '10: Proceedings of the ACM international conference on Object oriented programming systems languages and applications
    October 2010
    984 pages
    ISBN: 9781450302036
    DOI: 10.1145/1869459
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. domain specific languages
  2. dynamic optimizations
  3. parallel programming

Qualifiers

  • Research-article

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 11
  • Downloads (Last 6 weeks): 4
Reflects downloads up to 13 Jan 2025

Cited By

  • (2025) Prediction of psychological intervention for college students in digital entertainment media environment based on artificial intelligence and parallel computing algorithms. Entertainment Computing 52 (100858). DOI: 10.1016/j.entcom.2024.100858. Online publication date: Jan-2025
  • (2022) Halide Code Generation Framework in Phylanx. Euro-Par 2022: Parallel Processing Workshops (32-45). DOI: 10.1007/978-3-031-31209-0_3. Online publication date: 22-Aug-2022
  • (2021) A Study on Contributor to Sports Development Big Data Research Using Oral Records. Journal of Multimedia Information System 8:4 (301-308). DOI: 10.33851/JMIS.2021.8.4.301. Online publication date: 31-Dec-2021
  • (2021) Handling Iterations in Distributed Dataflow Systems. ACM Computing Surveys 54:9 (1-38). DOI: 10.1145/3477602. Online publication date: 8-Oct-2021
  • (2014) Delite. ACM Transactions on Embedded Computing Systems 13:4s (1-25). DOI: 10.1145/2584665. Online publication date: 1-Apr-2014
  • (2014) The Stratosphere platform for big data analytics. The VLDB Journal — The International Journal on Very Large Data Bases 23:6 (939-964). DOI: 10.1007/s00778-014-0357-y. Online publication date: 1-Dec-2014
  • (2024) HiPy: Extracting High-Level Semantics from Python Code for Data Processing. Proceedings of the ACM on Programming Languages 8:OOPSLA2 (736-762). DOI: 10.1145/3689737. Online publication date: 8-Oct-2024
  • (2018) AnyDSL: a partial evaluation framework for programming high-performance libraries. Proceedings of the ACM on Programming Languages 2:OOPSLA (1-30). DOI: 10.1145/3276489. Online publication date: 24-Oct-2018
  • (2018) SIMD intrinsics on managed language runtimes. Proceedings of the 2018 International Symposium on Code Generation and Optimization - CGO 2018 (2-15). DOI: 10.1145/3179541.3168810. Online publication date: 2018
  • (2018) SIMD intrinsics on managed language runtimes. Proceedings of the 2018 International Symposium on Code Generation and Optimization (2-15). DOI: 10.1145/3168810. Online publication date: 24-Feb-2018
