Compiler techniques for scalable performance of stream programs on multicore architectures
Publisher:
  • Massachusetts Institute of Technology
  • 201 Vassar Street, W59-200 Cambridge, MA
  • United States
Order Number: AAI0822890
Pages:
1
Abstract

Given the ubiquity of multicore processors, there is an acute need to enable the development of scalable parallel applications without unduly burdening programmers. Currently, programmers are asked not only to expose parallelism explicitly but also to concern themselves with issues of granularity, load balancing, synchronization, and communication. This thesis demonstrates that when algorithmic parallelism is expressed in the form of a stream program, a compiler can effectively and automatically manage the parallelism. Our compiler assumes responsibility for low-level architectural details, transforming implicit algorithmic parallelism into a mapping that achieves scalable parallel performance for a given multicore target.

Stream programming is characterized by regular processing of sequences of data, and it is a natural expression of algorithms in the areas of audio, video, digital signal processing, networking, and encryption. Streaming computation is represented as a graph of independent computation nodes that communicate explicitly over data channels. Our techniques operate on contiguous regions of the stream graph where the input and output rates of the nodes are statically determinable. Within a static region, the compiler first automatically adjusts the granularity and then exploits data, task, and pipeline parallelism in a holistic fashion. We introduce techniques that data-parallelize nodes that operate on overlapping sliding windows of their input, translating serializing state into minimal and parametrized inter-core communication. Finally, for nodes that cannot be data-parallelized due to state, we are the first to apply software-pipelining techniques at a coarse granularity to exploit pipeline parallelism between stateful nodes.
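
The idea of data-parallelizing a sliding-window node can be illustrated with a small sketch. This is not the thesis's algorithm or the StreamIt compiler's implementation; it only shows, under simplifying assumptions, why overlapping windows force neighboring copies of a node to share a few input items. The function names (`fir`, `split_with_overlap`, `run_data_parallel`) are hypothetical, the node is assumed to pop one item and push one item per firing, and Python threads merely stand in for cores.

```python
from concurrent.futures import ThreadPoolExecutor


def fir(window, weights):
    """One firing of a sliding-window node: read len(weights) items, produce one."""
    return sum(w * x for w, x in zip(weights, window))


def run_sequential(data, weights):
    """Reference execution: slide the window one item at a time."""
    n = len(weights)
    return [fir(data[i:i + n], weights) for i in range(len(data) - n + 1)]


def split_with_overlap(data, peek, copies):
    """Give each copy a contiguous share of the outputs.

    Each chunk carries `peek - 1` extra items that also appear at the start
    of the next chunk; those duplicated items are the inter-copy
    communication. Assumes len(data) >= peek.
    """
    total_out = len(data) - peek + 1
    base, extra = divmod(total_out, copies)
    chunks, start = [], 0
    for c in range(copies):
        count = base + (1 if c < extra else 0)
        chunks.append(data[start:start + count + peek - 1])
        start += count
    return chunks


def run_data_parallel(data, weights, copies):
    """Run independent copies of the node on overlapping chunks, then join in order."""
    chunks = split_with_overlap(data, len(weights), copies)
    with ThreadPoolExecutor(max_workers=copies) as pool:
        parts = pool.map(lambda chunk: run_sequential(chunk, weights), chunks)
    return [y for part in parts for y in part]


if __name__ == "__main__":
    data = list(range(20))
    weights = [1, 2, 3]
    assert run_data_parallel(data, weights, 4) == run_sequential(data, weights)
```

In this sketch the number of duplicated items is `(copies - 1) * (peek - 1)`, so the sharing cost grows with the window size and the number of copies rather than with the length of the stream.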

Our framework is evaluated in the context of the StreamIt programming language. StreamIt is a high-level stream programming language that has been shown to improve programmer productivity in implementing streaming algorithms. We employ the StreamIt Core benchmark suite of 12 real-world applications to demonstrate the effectiveness of our techniques across a range of multicore architectures. For a 16-core distributed-memory multicore, we achieve a 14.9x mean speedup. For benchmarks that include sliding-window computation, our sliding-window data-parallelization techniques are required to enable scalable performance for a 16-core SMP multicore (14x mean speedup) and a 64-core distributed shared memory multicore (52x mean speedup). (Copies available exclusively from MIT Libraries, Rm. 14-0551, Cambridge, MA 02139-4307. Ph. 617-253-5668; Fax 617-253-1690.)
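
One way to read the reported speedups is as parallel efficiency, meaning speedup divided by core count. The abstract does not state this metric; the short sketch below only derives it from the three figures quoted above, and the helper name `parallel_efficiency` and the dictionary labels are illustrative.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear scaling achieved (1.0 would be perfect)."""
    return speedup / cores


# (mean speedup, core count) pairs as quoted in the abstract.
reported = {
    "16-core distributed memory": (14.9, 16),
    "16-core SMP": (14.0, 16),
    "64-core distributed shared memory": (52.0, 64),
}

if __name__ == "__main__":
    for target, (speedup, cores) in reported.items():
        print(f"{target}: {parallel_efficiency(speedup, cores):.1%}")
```

Under this definition the three configurations reach roughly 93%, 88%, and 81% of linear scaling, respectively.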

Cited By

  1. Gonnord L, Henrio L, Morel L and Radanne G (2023). A Survey on Parallelism and Determinism, ACM Computing Surveys, 55:10, (1-28), Online publication date: 31-Oct-2023.
  2. Mansouri F, Huet S and Houzet D (2016). A domain-specific high-level programming model, Concurrency and Computation: Practice & Experience, 28:3, (750-767), Online publication date: 10-Mar-2016.
  3. Millo J, Kofman E and Simone R (2015). Modeling and Analyzing Dataflow Applications on NoC-Based Many-Core Architectures, ACM Transactions on Embedded Computing Systems, 14:3, (1-25), Online publication date: 21-May-2015.
  4. Ko Y, Burgstaller B and Scholz B (2015). LaminarIR: compile-time queues for structured streams, ACM SIGPLAN Notices, 50:6, (121-130), Online publication date: 7-Aug-2015.
  5. Ko Y, Burgstaller B and Scholz B. LaminarIR: compile-time queues for structured streams, Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, (121-130).
  6. Wang Z, Tournavitis G, Franke B and O'Boyle M (2014). Integrating profile-driven parallelism detection and machine-learning-based mapping, ACM Transactions on Architecture and Code Optimization, 11:1, (1-26), Online publication date: 1-Feb-2014.
  7. Bartenstein T and Liu Y. Rate types for stream programs, Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, (213-232).
  8. Bosboom J, Rajadurai S, Wong W and Amarasinghe S. StreamJIT, Proceedings of the 2014 ACM International Conference on Object Oriented Programming Systems Languages & Applications, (177-195).
  9. Bartenstein T and Liu Y (2014). Rate types for stream programs, ACM SIGPLAN Notices, 49:10, (213-232), Online publication date: 31-Dec-2015.
  10. Bosboom J, Rajadurai S, Wong W and Amarasinghe S (2014). StreamJIT, ACM SIGPLAN Notices, 49:10, (177-195), Online publication date: 31-Dec-2015.
  11. Yusuf I and Schmidt H. Parameterised architectural patterns for providing cloud service fault tolerance with accurate costings, Proceedings of the 16th International ACM Sigsoft symposium on Component-based software engineering, (121-130).
  12. Bartenstein T and Liu Y. Green streams for data-intensive software, Proceedings of the 2013 International Conference on Software Engineering, (532-541).
  13. Min C and Eom Y. DANBI, Proceedings of the 22nd international conference on Parallel architectures and compilation techniques, (189-200).
  14. Bui D and Lee E. StreaMorph, Proceedings of the Eleventh ACM International Conference on Embedded Software, (1-10).
  15. Hashemi M, Foroozannejad M, Ghiasi S and Etzel C. FORMLESS, Proceedings of the 13th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, Tools and Theory for Embedded Systems, (71-78).
  16. Hashemi M, Foroozannejad M, Ghiasi S and Etzel C (2012). FORMLESS, ACM SIGPLAN Notices, 47:5, (71-78), Online publication date: 18-May-2012.
  17. Cohen A, Gérard L and Pouzet M. Programming parallelism with futures in lustre, Proceedings of the tenth ACM international conference on Embedded software, (197-206).
  18. Thies W and Amarasinghe S. An empirical characterization of stream programs and its implications for language and compiler design, Proceedings of the 19th international conference on Parallel architectures and compilation techniques, (365-376).
Contributors
  • Massachusetts Institute of Technology
