Research article
DOI: 10.1145/3323165.3323209

Data Races and the Discrete Resource-time Tradeoff Problem with Resource Reuse over Paths

Published: 17 June 2019

    Abstract

    A determinacy race occurs if two or more logically parallel instructions access the same memory location and at least one of them tries to modify its content. Races are often undesirable as they can lead to nondeterministic and incorrect program behavior. A data race is a special case of a determinacy race that can be eliminated by associating a mutual-exclusion lock with the memory location in question or by allowing only atomic accesses to it. However, such solutions can reduce parallelism by serializing all accesses to that location. For associative and commutative updates to a memory cell, one can instead use a reducer, which allows parallel race-free updates at the expense of some extra space. More extra space usually allows more parallel updates, which in turn can lower the overall execution time of the program. We start by asking the following question: given a fixed budget of extra space for mitigating the cost of races in a parallel program, which memory locations should be assigned reducers, and how should the space be distributed among those reducers in order to minimize the overall running time? We argue that under reasonable conditions the races of a program can be captured by a directed acyclic graph (DAG), with nodes representing memory cells and arcs representing read-write dependencies between cells. We then formulate our original question as an optimization problem on this DAG. We concentrate on a variation of this problem where space reuse among reducers is allowed by routing every unit of extra space along a (possibly different) source-to-sink path of the DAG and using it in the construction of multiple (possibly zero) reducers along the path. We consider two different ways of constructing a reducer and the corresponding duration functions (i.e., reduction time as a function of space budget).
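    The reducer idea described above can be illustrated with a minimal Python sketch. It is not the construction analyzed in the paper: the function names `reduce_with_k_cells` and `duration`, the round-robin split of updates, and the cost model "ceil(n / j) + j time steps when j cells are used" are all assumptions chosen for illustration. The sketch only shows how extra cells let associative, commutative updates proceed without touching a shared location, and how a non-increasing duration function can arise from a space budget.

```python
from concurrent.futures import ThreadPoolExecutor
from math import ceil


def reduce_with_k_cells(updates, k, op=lambda a, b: a + b, identity=0):
    """Apply associative, commutative updates using k private cells.

    Each worker folds its own share of the updates into a private
    accumulator, so no two workers write the same location. The k
    partial results are combined serially at the end.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Round-robin split: chunk i receives updates i, i + k, i + 2k, ...
    chunks = [updates[i::k] for i in range(k)]

    def fold(chunk):
        acc = identity
        for u in chunk:
            acc = op(acc, u)
        return acc

    with ThreadPoolExecutor(max_workers=k) as pool:
        partials = list(pool.map(fold, chunks))

    result = identity
    for p in partials:
        result = op(result, p)
    return result


def duration(n, k):
    """Hypothetical duration function for n updates and a budget of k cells.

    Assumed cost model: using j cells takes ceil(n / j) parallel update
    steps plus j serial combine steps. Taking the best j <= k makes the
    function non-increasing in the budget k.
    """
    return min(ceil(n / j) + j for j in range(1, k + 1))


total = reduce_with_k_cells(list(range(1, 101)), k=4)  # sum of 1..100
```

    Under this assumed cost model, a larger budget never hurts but eventually stops helping: for n = 100 the duration drops from 101 with one cell to 20 with ten cells and stays at 20 for any larger budget.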
    We generalize our race-avoiding space-time tradeoff problem to a discrete resource-time tradeoff problem with general non-increasing duration functions and resource reuse over paths of the given DAG. For general DAGs, we show that, even if the entire DAG is available offline, the problem is strongly NP-hard under all three duration functions (the two reducer-based functions and the general non-increasing one), and we give approximation algorithms for solving the corresponding optimization problems. We also prove hardness of approximation for the general resource-time tradeoff problem and give a pseudo-polynomial time algorithm for series-parallel DAGs.
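    To make the optimization problem concrete, here is a toy brute-force sketch in Python. It is not one of the paper's algorithms: it simply enumerates every way of routing a small integer budget along source-to-sink paths and keeps the routing with the smallest makespan, which takes exponential time in general. The helper names (`all_paths`, `makespan`, `best_routing`), the example DAG, and the duration functions of the form ceil(w / (r + 1)) are hypothetical. The sketch assumes each node's duration depends only on the number of resource units whose path passes through it, and that every node reaches the sink.

```python
from functools import lru_cache
from itertools import combinations_with_replacement
from math import ceil


def all_paths(dag, source, sink):
    """Enumerate every source-to-sink path of a small DAG (adjacency dict)."""
    if source == sink:
        return [[sink]]
    return [[source] + rest
            for nxt in dag[source]
            for rest in all_paths(dag, nxt, sink)]


def makespan(dag, duration, units, source):
    """Longest path from source when node v costs duration[v](units at v)."""
    @lru_cache(maxsize=None)
    def finish(v):
        cost = duration[v](units.get(v, 0))
        return cost + max((finish(w) for w in dag[v]), default=0)
    return finish(source)


def best_routing(dag, duration, budget, source, sink):
    """Try every multiset of `budget` paths; return (makespan, chosen paths)."""
    paths = all_paths(dag, source, sink)
    best = None
    for routing in combinations_with_replacement(range(len(paths)), budget):
        units = {}
        for p in routing:
            for v in paths[p]:
                # One unit is reused by every node on its path.
                units[v] = units.get(v, 0) + 1
        t = makespan(dag, duration, units, source)
        if best is None or t < best[0]:
            best = (t, [paths[p] for p in routing])
    return best


# Hypothetical instance: two branches, one of them with two costly nodes.
dag = {"s": ["a", "b"], "a": ["c"], "c": ["t"], "b": ["t"], "t": []}
duration = {
    "s": lambda r: 0,
    "t": lambda r: 0,
    "a": lambda r: ceil(8 / (r + 1)),
    "c": lambda r: ceil(4 / (r + 1)),
    "b": lambda r: ceil(6 / (r + 1)),
}
best_time, best_paths = best_routing(dag, duration, budget=3, source="s", sink="t")
```

    With a budget of 3 on this instance, the best routing sends two units along s, a, c, t and one along s, b, t, for a makespan of 5. The two units on the longer branch speed up both a and c, which is the effect of resource reuse over a path.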

    Cited By

    • (2022) Automatic HBM Management. Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures, pp. 147-159. DOI: 10.1145/3490148.3538570. Online publication date: 11-Jul-2022
    • (2021) Low-Span Parallel Algorithms for the Binary-Forking Model. Proceedings of the 33rd ACM Symposium on Parallelism in Algorithms and Architectures, pp. 22-34. DOI: 10.1145/3409964.3461802. Online publication date: 6-Jul-2021
    • (2020) How to Manage High-Bandwidth Memory Automatically. Proceedings of the 32nd ACM Symposium on Parallelism in Algorithms and Architectures, pp. 187-199. DOI: 10.1145/3350755.3400233. Online publication date: 6-Jul-2020

    Published In

    SPAA '19: The 31st ACM Symposium on Parallelism in Algorithms and Architectures
    June 2019
    410 pages
    ISBN:9781450361842
    DOI:10.1145/3323165

    In-Cooperation

    • EATCS: European Association for Theoretical Computer Science

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. data races
    2. discrete resource-time tradeoff
    3. makespan
    4. reducers
    5. resource reuse
    6. scheduling
    7. space-time tradeoff

    Qualifiers

    • Research-article

    Conference

    SPAA '19

    Acceptance Rates

    SPAA '19 Paper Acceptance Rate 34 of 109 submissions, 31%;
    Overall Acceptance Rate 447 of 1,461 submissions, 31%
