research-article

Open access

PARSECSs: Evaluating the Impact of Task Parallelism in the PARSEC Benchmark Suite

Authors:

Dimitrios Chasapis,

Miquel Moretó,

Eduard Ayguadé,

Jesús Labarta,

Mateo ValeroAuthors Info & Claims

ACM Transactions on Architecture and Code Optimization (TACO), Volume 12, Issue 4

Article No.: 41, Pages 1 - 22

https://doi.org/10.1145/2829952

Published: 22 December 2015 Publication History

Abstract

In this work, we show how parallel applications can be implemented efficiently using task parallelism. We also evaluate the benefits of such parallel paradigm with respect to other approaches. We use the PARSEC benchmark suite as our test bed, which includes applications representative of a wide range of domains from HPC to desktop and server applications. We adopt different parallelization techniques, tailored to the needs of each application, to fully exploit the task-based model. Our evaluation shows that task parallelism achieves better performance than thread-based parallelization models, such as Pthreads. Our experimental results show that we can obtain scalability improvements up to 42% on a 16-core system and code size reductions up to 81%. Such reductions are achieved by removing from the source code application specific schedulers or thread pooling systems and transferring these responsibilities to the runtime system software.

Supplementary Material

TACO1204-41 (taco1204-41.pdf)

Slide deck associated with this paper

Download
618.46 KB

References

[1]

Aaron B. Adcock, Blair D. Sullivan, Oscar R. Hernandez, and Michael W. Mahoney. 2013. Evaluating OpenMP tasking at scale for the computation of graph hyperbolicity. In OpenMP in the Era of Low Power Devices and Accelerators. Lecture Notes in Computer Science, Vol. 8122. Springer, 71--83.

[2]

Malte Appeltauer, Robert Hirschfeld, Michael Haupt, Jens Lincke, and Michael Perscheid. 2009. A comparison of context-oriented programming languages. In Proceedings of the International Workshop on Context-Oriented Programming (COP’09). ACM, New York, NY, Article No. 6.

Digital Library

[3]

Eduard Ayguadé, Nawal Copty, Alejandro Duran, Jay Hoeflinger, Yuan Lin, Federico Massaioli, Xavier Teruel, Priya Unnikrishnan, and Guansong Zhang. 2009. The design of OpenMP tasks. IEEE Transactions on Parallel and Distributed Systems 20, 3, 404--418.

Digital Library

[4]

Eduard Ayguadé, Alejandro Duran, Jay Hoeflinger, Federico Massaioli, and Xavier Teruel. 2008. An experimental evaluation of the new OpenMP tasking model. In Languages and Compilers for Parallel Computing. Springer-Verlag, 63--77.

Digital Library

[5]

Prithviraj Banerjee. 1994. Parallel Algorithms for VLSI Computer-Aided Design. Prentice Hall.

Digital Library

[6]

Pieter Bellens, Josep M. Perez, Rosa M. Badia, and Jesus Labarta. 2006. CellSs: A programming model for the cell BE architecture. In Proceedings of the ACM/IEEE Conference on Supercomputing (SC’06). Article No. 86. http://doi.acm.org/10.1145/1188455.1188546

Digital Library

[7]

Christian Bienia. 2011. Benchmarking Modern Multiprocessors. Ph.D. Dissertation. Princeton University, Princeton, NJ.

Digital Library

[8]

Christian Bienia, Sanjeev Kumar, Jaswinder Pal Singh, and Kai Li. 2008. The PARSEC benchmark suite: Characterization and architectural implications. In Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques (PACT’08). ACM, New York, NY 72--81.

Digital Library

[9]

Fischer Black and Myron S. Scholes. 1973. The pricing of options and corporate liabilities. Journal of Political Economy 81, 3, 637--654. http://ideas.repec.org/a/ucp/jpolec/v81y1973i3p637-54.html.

[10]

Robert D. Blumofe, Christopher F. Joerg, Bradley C. Kuszmaul, Charles E. Leiserson, Keith H. Randall, and Yuli Zhou. 1995. Cilk: An efficient multithreaded runtime system. ACM SIGPLAN Notices 30, 8, 207--216.

Digital Library

[11]

David R. Butenhof. 1997. Programming with POSIX Threads. Addison Wesley Longman.

Digital Library

[12]

Barbara Chapman. 2007. The multicore programming challenge. In Advanced Parallel Processing Technologies. Springer.

Digital Library

[13]

Barbara Chapman, Gabriele Jost, and Ruud van der Pas. 2007. Using OpenMP: Portable Shared Memory Parallel Programming (Scientific and Engineering Computation). MIT Press, Cambridge, MA.

Digital Library

[14]

Cristian Coarfa, Yuri Dotsenko, John Mellor-Crummey, François Cantonnet, Tarek El-Ghazawi, Ashrujit Mohanti, Yiyi Yao, and Daniel Chavarría-Miranda. 2005. An evaluation of global address space languages: Co-array fortran and unified parallel C. In Proceedings of the 10th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP’05). ACM, New York, NY, 36--47.

Digital Library

[15]

Henry Cook, Miquel Moreto, Sarah Bird, Khanh Dao, David A. Patterson, and Krste Asanovic. 2013. A hardware evaluation of cache partitioning to improve utilization and energy-efficiency while preserving responsiveness. ACM SIGARCH Compututer Architecture News 41, 3, 308--319.

Digital Library

[16]

Jack Dongarra, Robert Graybill, William Harrod, Robert F. Lucas, Ewing L. Lusk, Piotr Luszczek, Janice McMahon, Allan Snavely, Jeffrey S. Vetter, Katherine A. Yelick, Sadaf R. Alam, Roy L. Campbell, Laura Carrington, Tzu-Yi Chen, Omid Khalili, Jeremy S. Meredith, and Mustafa M. Tikir. 2008. DARPA’s HPCS program—history, models, tools, languages. Advances in Computers 72, 1--100.

[17]

Alejandro Duran, Eduard Ayguadé, Rosa M. Badia, Jesús Labarta, Luis Martinell, Xavier Martorell, and Judit Planas. 2011. OmpSs: A proposal for programming heterogeneous multi-core architectures. Parallel Processing Letters 21, 2, 173--193.

[18]

Alejandro Duran, Roger Ferrer, Eduard Ayguadé, Rosa M. Badia, and Jesus Labarta. 2009. A proposal to extend the OpenMP tasking model with dependent tasks. International Journal of Parallel Programming 37, 3, 292--305.

Digital Library

[19]

Kayvon Fatahalian, Daniel Reiter Horn, Timothy J. Knight, Larkhoon Leem, Mike Houston, Ji Young Park, Mattan Erez, Manman Ren, Alex Aiken, William J. Dally, and Pat Hanrahan. 2006. Sequoia: Programming the memory hierarchy. In Proceedings of the 2006 ACM/IEEE Conference on Supercomputing (SC’06). ACM, New York, NY, Article 83.

Digital Library

[20]

Gosta Grahne and Jianfei Zhu. 2003. Efficiently using prefix-trees in mining frequent itemsets. In Proceedings of the Workshop on Frequent Itemset Mining Implementations (FIMI’03). http://dblp.uni-trier.de/db/conf/fimi/fimi2003.html#GrahneZ03.

[21]

Jiawei Han, Jian Pei, and Yiwen Yin. 2000. Mining frequent patterns without candidate generation. SIGMOD Record 29, 2, 1--12.

Digital Library

[22]

David Heath, Robert Jarrow, and Andrew Morton. 1992. Bond pricing and the term structure of interest rates: A new methodology for contingent claims valuation. Econometrica 60, 1, 77--105. http://ideas.repec.org/a/ecm/emetrp/v60y1992i1p77-105.html.

[23]

James Christopher Jenista, Yong Hun Eom, and Brian Charles Demsky. 2011. OoOJava: Software out-of-order execution. ACM SIGPLAN Notices 46, 8, 57--68.

Digital Library

[24]

Ian Karlin, Abhinav Bhatele, Jeff Keasler, Bradford L. Chamberlain, Jonathan Cohen, Zachary Devito, Riyaz Haque, Dan Laney, Edward Luke, Felix Wang, David Richards, Martin Schulz, and Charles H. Still. 2013. Exploring traditional and emerging parallel programming models using a proxy application. In Proceedings of the 2013 IEEE 27th International Symposium on Parallel and Distributed Processing (IPDPS’13). IEEE, Los Alamitos, CA, 919--932.

Digital Library

[25]

I-Ting Angelina Lee, Charles E. Leiserson, Tao B. Schardl, Jim Sukha, and Zhunping Zhang. 2013. On-the-fly pipeline parallelism. In Proceedings of the 25th Annual ACM Symposium on Parallelism in Algorithms and Architectures (SPAA’13). ACM, New York, NY, 140--151. http://doi.acm.org/10.1145/2486159.2486174

Digital Library

[26]

Qin Lv, William Josephson, Zhe Wang, Moses Charikar, and Kai Li. 2006. Ferret: A toolkit for content-based similarity search of feature-rich data. SIGOPS Operating Systems Review 40, 4, 317--330.

Digital Library

[27]

Matthias Müller, David Charypar, and Markus Gross. 2003. Particle-based fluid simulation for interactive applications. In Proceedings of the 2003 ACM SIGGRAPH/Eurographics Symposium on Computer Animation (SCA’03). 154--159. http://dl.acm.org/citation.cfm?id=846276.846298

Digital Library

[28]

Dan Nagle. 2005. MPI—The Complete Reference, vol. 1, The MPI Core, 2nd ed., Scientific and Engineering Computation Series, by Marc Snir, Steve Otto, Steven Huss-Lederman, David Walker and Jack Dongarra. Scientific Programming 13, 1, 57--63.

Digital Library

[29]

OpenMP Architecture Review Board. 2013. OpenMP Application Program Interface Version 3.0. http://www.openmp.org/mp-documents/openmp-4.5.pdf.

[30]

Judit Planas, Rosa M. Badia, Eduard Ayguadé, and Jesus Labarta. 2009. Hierarchical task-based programming with StarSs. International Journal of High Performance Computing Applications 23, 3, 284--299.

Digital Library

[31]

Artur Podobas and Mats Brorsson. 2010. A comparison of some recent task-based parallel programming models. In Proceedings of the 3rd Workshop on Programmability Issues for Multi-Core Computers (MULTIPROG’10).

[32]

Sean Quinlan and Sean Dorward. 2002. Awarded best paper&excl; Venti: A new approach to archival data storage. In Proceedings of the 1st USENIX Conference on File and Storage Technologies (FAST’02). Article No. 7. http://dl.acm.org/citation.cfm?id=1083323.1083333.

Digital Library

[33]

Eftychios Sifakis, Igor Neverov, and Ronald Fedkiw. 2005. Automatic determination of facial muscle activations from sparse motion capture marker data. ACM Transactions on Graphics 24, 3, 417--425.

Digital Library

[34]

Christi Symeonidou, Polyvios Pratikakis, Angelos Bilas, and Dimitrios S. Nikolopoulos. 2013. DRASync: Distributed region-based memory allocation and synchronization. In Proceedings of the 20th European MPI Users’ Group Meeting (EuroMPI’13). ACM, New York, NY, 49--54.

Digital Library

[35]

George Tzenakis, Angelos Papatriantafyllou, John Kesapides, Polyvios Pratikakis, Hans Vandierendonck, and Dimitrios S. Nikolopoulos. 2012. BDDT: Block-level dynamic dependence analysis for deterministic task-based parallelism. ACM SIGPLAN Notices 47, 8, 301--302.

Digital Library

[36]

Hans Vandierendonck, Kallia Chronaki, and Dimitrios S. Nikolopoulos. 2013. Deterministic scale-free pipeline parallelism with hyperqueues. In Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC’13). ACM, New York, NY, Article No. 32. http://doi.acm.org/10.1145/2503210.2503233

Digital Library

[37]

Hans Vandierendonck, Polyvios Pratikakis, and Dimitrios S. Nikolopoulos. 2011. Parallel programming of general-purpose programs using task-based programming models. In Proceedings of the 3rd USENIX Conference on Hot Topic in Parallelism (HotPar’11). 13. http://dl.acm.org/citation.cfm?id=2001252.2001265.

Digital Library

[38]

Stéphane Zuckerman, Joshua Suetterlein, Rob Knauerhase, and Guang R. Gao. 2011. Using a “Codelet” program execution model for exascale machines: Position paper. In Proceedings of the 1st International Workshop on Adaptive Self-Tuning Computing Systems for the Exaflop Era (EXADAPT’11). ACM, New York, NY, 64--69. http://doi.acm.org/10.1145/2000417.2000424.

Digital Library

Cited By

Mastoras AYzelman AChen QHuang ZSi M(2023)Studying the expressiveness and performance of parallelization abstractions for linear pipelinesProceedings of the 14th International Workshop on Programming Models and Applications for Multicores and Manycores10.1145/3582514.3582522(29-38)Online publication date: 25-Feb-2023
https://dl.acm.org/doi/10.1145/3582514.3582522
Sharma YGupta SMoulik S(2023)An Interplay of Energy and Temperature Minimization Techniques for Heterogeneous Multiprocessor SystemsTENCON 2023 - 2023 IEEE Region 10 Conference (TENCON)10.1109/TENCON58879.2023.10322452(629-634)Online publication date: 31-Oct-2023
https://doi.org/10.1109/TENCON58879.2023.10322452
Costero LIgual FOlcoz K(2023)Dynamic power budget redistribution under a power cap on multi-application environmentsSustainable Computing: Informatics and Systems10.1016/j.suscom.2023.10086538(100865)Online publication date: Apr-2023
https://doi.org/10.1016/j.suscom.2023.100865
Show More Cited By

Index Terms

PARSECSs: Evaluating the Impact of Task Parallelism in the PARSEC Benchmark Suite
1. Computing methodologies
  1. Concurrent computing methodologies
    1. Concurrent programming languages
2. Software and its engineering
  1. Software notations and tools
    1. General programming languages
      1. Language types
        Concurrent programming languages

Recommendations

ParSim: A Tool for Workload Modeling and Reproduction of Parallel Applications
MASCOTS '14: Proceedings of the 2014 IEEE 22nd International Symposium on Modelling, Analysis & Simulation of Computer and Telecommunication Systems

In this paper, we present ParSim, a software tool designed for modeling parallel applications. The tool reproduces a workload, including its parallel characteristics specified by the users, on the target multicore or multiCPU system, and analyzes the ...
Application of the ParalleX execution model to stencil-based problems

In the prospect of the upcoming exa-scale era with millions of execution units, the question of how to deal with this level of parallelism efficiently is of time-critical relevance. State-of-the-Art parallelization techniques such as OpenMP and MPI are ...
Using HPX and LibGeoDecomp for scaling HPC applications on heterogeneous supercomputers
ScalA '13: Proceedings of the Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems

With the general availability of PetaFLOP clusters and the advent of heterogeneous machines equipped with special accelerator cards such as the Xeon Phi[2], computer scientist face the difficult task of improving application scalability beyond what is ...

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Architecture and Code Optimization

ACM Transactions on Architecture and Code Optimization Volume 12, Issue 4

January 2016

848 pages

ISSN:1544-3566

EISSN:1544-3973

DOI:10.1145/2836331

Editor:
Koen De Bosschere
Ghent University

Issue’s Table of Contents

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 22 December 2015

Accepted: 01 September 2015

Revised: 01 September 2015

Received: 01 March 2015

Published in TACO Volume 12, Issue 4

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article
Research
Refereed

Funding Sources

Spanish Miinistry of Science and Innovation
Eropean Unions 7th FP, ERC
Spanish Goverment Severo Ochoa
Secretary for Universities and Research of the Ministry of Economy and Knowledge of the Goverment of Catalonia
Juan de la Cierva post-doctoral fellowship
Marie Curie Actions of the 7th R&D Framework Programme of teh European Union

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

34
Total Citations
View Citations
1,296
Total Downloads

Downloads (Last 12 months)140
Downloads (Last 6 weeks)22

Reflects downloads up to 11 Aug 2024

Other Metrics

View Author Metrics

Citations

Cited By

Mastoras AYzelman AChen QHuang ZSi M(2023)Studying the expressiveness and performance of parallelization abstractions for linear pipelinesProceedings of the 14th International Workshop on Programming Models and Applications for Multicores and Manycores10.1145/3582514.3582522(29-38)Online publication date: 25-Feb-2023
https://dl.acm.org/doi/10.1145/3582514.3582522
Sharma YGupta SMoulik S(2023)An Interplay of Energy and Temperature Minimization Techniques for Heterogeneous Multiprocessor SystemsTENCON 2023 - 2023 IEEE Region 10 Conference (TENCON)10.1109/TENCON58879.2023.10322452(629-634)Online publication date: 31-Oct-2023
https://doi.org/10.1109/TENCON58879.2023.10322452
Costero LIgual FOlcoz K(2023)Dynamic power budget redistribution under a power cap on multi-application environmentsSustainable Computing: Informatics and Systems10.1016/j.suscom.2023.10086538(100865)Online publication date: Apr-2023
https://doi.org/10.1016/j.suscom.2023.100865
Kagaris DDutta SEyerman S(2022)Execution Time Estimation of Multithreaded Programs With Critical SectionsIEEE Transactions on Parallel and Distributed Systems10.1109/TPDS.2022.314345533:10(2470-2481)Online publication date: 1-Oct-2022
https://doi.org/10.1109/TPDS.2022.3143455
Mustafa D(2022)A Survey of Performance Tuning Techniques and Tools for Parallel ApplicationsIEEE Access10.1109/ACCESS.2022.314784610(15036-15055)Online publication date: 2022
https://doi.org/10.1109/ACCESS.2022.3147846
Poole DGalvez Vallejo JGordon M(2022)A Task-Based Approach to Parallel Restricted Hartree–Fock CalculationsJournal of Chemical Theory and Computation10.1021/acs.jctc.1c0082018:4(2144-2161)Online publication date: 4-Apr-2022
https://doi.org/10.1021/acs.jctc.1c00820
Hunold SKraßnitzer K(2022)A Quantitative Analysis of OpenMP Task Runtime SystemsBenchmarking, Measuring, and Optimizing10.1007/978-3-031-31180-2_1(3-18)Online publication date: 7-Nov-2022
https://dl.acm.org/doi/10.1007/978-3-031-31180-2_1
Guidotti NCeyrat PBarreto JMonteiro JRodrigues RFonseca RMartorell XPeña A(2021)Particle-In-Cell Simulation Using Asynchronous TaskingEuro-Par 2021: Parallel Processing10.1007/978-3-030-85665-6_30(482-498)Online publication date: 25-Aug-2021
https://doi.org/10.1007/978-3-030-85665-6_30
Mendonça GLiao CPereira FAyguadé EHwu WBadia RHofstee H(2020)AutoParBenchProceedings of the 34th ACM International Conference on Supercomputing10.1145/3392717.3392744(1-10)Online publication date: 29-Jun-2020
https://dl.acm.org/doi/10.1145/3392717.3392744
Liu NZang BChen HGupta RShen X(2020)No barrier in the roadProceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming10.1145/3332466.3374535(348-361)Online publication date: 19-Feb-2020
https://dl.acm.org/doi/10.1145/3332466.3374535
Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Get Access

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Media

Figures

Other

Tables

View Issue’s Table of Contents