DOI: 10.1145/2737924.2737994
research-article

LaminarIR: compile-time queues for structured streams

Published: 03 June 2015

Abstract

Stream programming languages employ FIFO (first-in, first-out) semantics to model data channels between producers and consumers. A FIFO data channel stores tokens in a buffer that is accessed indirectly via read- and write-pointers. This indirect token-access decouples a producer’s write-operations from the read-operations of the consumer, thereby making dataflow implicit. For a compiler, indirect token-access obscures data-dependencies, which renders standard optimizations ineffective and impacts stream program performance negatively. In this paper we propose a transformation for structured stream programming languages such as StreamIt that shifts FIFO buffer management from run-time to compile-time and eliminates splitters and joiners, whose task is to distribute and merge streams. To show the effectiveness of our lowering transformation, we have implemented a StreamIt to C compilation framework. We have developed our own intermediate representation (IR) called LaminarIR, which facilitates the transformation. We report on the enabling effect of the LaminarIR on LLVM’s optimizations, which required the conversion of several standard StreamIt benchmarks from static to randomized input, to prevent computation of partial results at compile-time. We conducted our experimental evaluation on the Intel i7-2600K, AMD Opteron 6378, Intel Xeon Phi 3120A and ARM Cortex-A15 platforms. Our LaminarIR reduces data-communication on average by 35.9% and achieves platform-specific speedups between 3.73x and 4.98x over StreamIt. We reduce memory accesses by more than 60% and achieve energy savings of up to 93.6% on the Intel i7-2600K.
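
The difference between indirect and direct token access described in the abstract can be illustrated with a small C sketch. This is not output of the LaminarIR compiler; the names `fifo_t`, `push`, `pop`, `steady_state_fifo`, `steady_state_direct`, and `variants_agree`, the capacity `CAP`, and the two-token producer/consumer schedule are all assumptions chosen for illustration. The first variant reaches tokens through run-time read and write pointers; the second names each token as a scalar, which is roughly what resolving the queue at compile time would make visible to an optimizer.

```c
#define CAP 8 /* hypothetical buffer capacity, chosen for this sketch */

/* Run-time FIFO: tokens are reached indirectly through head/tail indices. */
typedef struct {
    int buf[CAP];
    unsigned head; /* read pointer  */
    unsigned tail; /* write pointer */
} fifo_t;

static void push(fifo_t *q, int v) { q->buf[q->tail++ % CAP] = v; }
static int pop(fifo_t *q) { return q->buf[q->head++ % CAP]; }

/* One steady-state iteration: the producer pushes two tokens,
 * the consumer pops both and adds them. */
int steady_state_fifo(fifo_t *q, int x) {
    push(q, x);     /* producer firing */
    push(q, x + 1);
    int a = pop(q); /* consumer firing */
    int b = pop(q);
    return a + b;
}

/* Same schedule with the queue positions fixed ahead of time:
 * each token is a named scalar, so the data dependencies are explicit. */
int steady_state_direct(int x) {
    int t0 = x;     /* token 0 on the channel */
    int t1 = x + 1; /* token 1 on the channel */
    return t0 + t1;
}

/* Returns 1 if both variants produce the same result for inputs 0..n-1. */
int variants_agree(int n) {
    fifo_t q = { {0}, 0, 0 };
    for (int x = 0; x < n; ++x) {
        if (steady_state_fifo(&q, x) != steady_state_direct(x)) {
            return 0;
        }
    }
    return 1;
}
```

In the direct variant a compiler can fold or forward `t0` and `t1` without reasoning about buffer aliasing, whereas the FIFO variant requires it to see through the pointer arithmetic first.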

  • Introduction
  • Motivating Example
  • LaminarIR
  • Local Direct Access Transformation
  • Global Direct Access Transformation
  • Background and Notation
  • Concrete SDF Semantics
  • Auxiliary Semantics
  • Experimental Results
  • Performance
  • Communication Elimination
  • LLVM Optimization Statistics
  • Related Work
  • Conclusion

Cited By

  • (2018) Efficient Algorithm for the Iteration Period Computation of Unfolded Synchronous Dataflow Graphs. 2018 International Symposium on Theoretical Aspects of Software Engineering (TASE), pages 36-43. DOI: 10.1109/TASE.2018.00013. Online publication date: Aug-2018
  • (2018) Towards Memory-Optimal Schedules for SDF. Euro-Par 2017: Parallel Processing Workshops, pages 94-105. DOI: 10.1007/978-3-319-75178-8_8. Online publication date: 8-Feb-2018
  • (2016) Mapping stream programs onto multicore platforms by local search and genetic algorithm. Computer Languages, Systems and Structures 46:C, pages 182-205. DOI: 10.1016/j.cl.2016.08.007. Online publication date: 1-Nov-2016

    Published In

    PLDI '15: Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation
    June 2015
    630 pages
    ISBN: 9781450334686
    DOI: 10.1145/2737924

    ACM SIGPLAN Notices, Volume 50, Issue 6 (PLDI '15)
    June 2015
    630 pages
    ISSN: 0362-1340
    EISSN: 1558-1160
    DOI: 10.1145/2813885
    Editor: Andy Gill

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. compiler optimization
    2. performance analysis
    3. program transformation
    4. stream programming
    5. synchronous data flow

    Qualifiers

    • Research-article

    Conference

    PLDI '15
    Acceptance Rates

    Overall Acceptance Rate 406 of 2,067 submissions, 20%

    Article Metrics

    • Downloads (last 12 months): 13
    • Downloads (last 6 weeks): 2
    Reflects downloads up to 14 Jan 2025

    Citations

    • (2020) A Survey on Parallel Architectures and Programming Models. 2020 43rd International Convention on Information, Communication and Electronic Technology (MIPRO), pages 999-1005. DOI: 10.23919/MIPRO48935.2020.9245341. Online publication date: 28-Sep-2020
    • (2019) SoCodeCNN: Program Source Code for Visual CNN Classification Using Computer Vision Methodology. IEEE Access 7, pages 157158-157172. DOI: 10.1109/ACCESS.2019.2949483. Online publication date: 2019
