Research article · Open access
DOI: 10.1145/3490148.3538574

Many Sequential Iterative Algorithms Can Be Parallel and (Nearly) Work-efficient

Published: 11 July 2022

Abstract

Some recent papers have shown that many sequential iterative algorithms can be directly parallelized by identifying the dependences between the input objects. This approach yields many simple and practical parallel algorithms, but achieving work-efficiency and high parallelism remains challenging. Work-efficiency means that the number of operations is asymptotically the same as in the best sequential solution. This can be hard for problems where the number of dependences between objects is asymptotically larger than the optimal sequential work, so we cannot even afford the cost of generating them. To achieve high parallelism, we want to process as many objects as possible in parallel; the goal is O(D) span for a problem with deepest dependence length D, a property we refer to as round-efficiency. This paper presents work-efficient and round-efficient algorithms for a variety of classic problems and proposes general approaches to achieve them.
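
The following minimal, self-contained sketch (illustrative only, not taken from the paper) makes the notions of rank and dependence depth D concrete for LIS: the rank of an element is the length of the longest increasing subsequence ending at it, elements of equal rank are mutually independent, and the deepest dependence chain D equals the LIS length. The quadratic rank computation below is for illustration only; the paper's algorithms avoid touching all dependences.

    # Illustrative sketch only: compute LIS ranks and group elements by rank.
    # Elements in the same group have no dependences among themselves, so a
    # round-efficient algorithm could settle each group in one parallel round,
    # giving O(D) rounds where D is the LIS length.

    def lis_ranks(a):
        n = len(a)
        rank = [1] * n
        for j in range(n):             # O(n^2) reference computation, not work-efficient
            for i in range(j):
                if a[i] < a[j]:        # j depends on every earlier, smaller element i
                    rank[j] = max(rank[j], rank[i] + 1)
        return rank

    a = [3, 1, 4, 1, 5, 9, 2, 6]
    rank = lis_ranks(a)
    D = max(rank)                      # dependence depth = LIS length (4 here)
    rounds = [[j for j in range(len(a)) if rank[j] == r] for r in range(1, D + 1)]
    print("D =", D)
    print("rounds:", rounds)           # [[0, 1, 3], [2, 6], [4], [5, 7]]
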
To efficiently parallelize many sequential iterative algorithms, we propose the phase-parallel framework. The framework assigns a rank to each object and processes the objects in order of their ranks; all objects with the same rank can be processed in parallel. To enable work-efficiency and high parallelism, we use two types of general techniques. Type 1 algorithms use range queries to extract all objects with the same rank, which avoids evaluating all the dependences. We discuss activity selection and Dijkstra's algorithm using the Type 1 framework. Type 2 algorithms wake up an object when the last object it depends on is finished. We discuss activity selection, longest increasing subsequence (LIS), greedy maximal independent set (MIS), and many other algorithms using the Type 2 framework.
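
As a concrete illustration of the Type 2 wake-up idea, the sketch below runs a sequential simulation of greedy MIS under a fixed priority order (again illustrative only, not the paper's parallel implementation). Each vertex keeps a counter of unfinished earlier-priority neighbors; when the counter reaches zero the vertex wakes up, decides whether it joins the MIS, and notifies its later-priority neighbors. All vertices woken in the same round are independent and could be processed in parallel. For simplicity, a vertex here waits for all earlier-priority neighbors, a coarser dependence structure than the one the paper analyzes.

    from collections import defaultdict

    def greedy_mis_rounds(n, edges, priority):
        # Build adjacency lists.
        adj = defaultdict(list)
        for u, v in edges:
            adj[u].append(v)
            adj[v].append(u)

        # deps[v] = number of earlier-priority neighbors not yet finished.
        deps = {v: sum(1 for u in adj[v] if priority[u] < priority[v]) for v in range(n)}
        in_mis = {}
        frontier = [v for v in range(n) if deps[v] == 0]   # rank-1 objects
        rounds = []
        while frontier:
            rounds.append(frontier)
            next_frontier = []
            for v in frontier:                             # conceptually a parallel loop
                # v joins the MIS iff no earlier-priority neighbor already joined.
                in_mis[v] = not any(in_mis.get(u, False)
                                    for u in adj[v] if priority[u] < priority[v])
                for u in adj[v]:                           # wake up later-priority neighbors
                    if priority[u] > priority[v]:
                        deps[u] -= 1
                        if deps[u] == 0:
                            next_frontier.append(u)
            frontier = next_frontier
        return rounds, sorted(v for v in range(n) if in_mis[v])

    # Path graph 0-1-2-3 with identity priorities:
    # rounds = [[0], [1], [2], [3]], MIS = [0, 2]
    print(greedy_mis_rounds(4, [(0, 1), (1, 2), (2, 3)], {v: v for v in range(4)}))
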
All of our algorithms are (nearly) work-efficient and round-efficient, and some of them (e.g., LIS) are the first to achieve both. Many of them improve the previous best bounds. Moreover, we implement many of them for experimental studies. On inputs with reasonable dependence depth, our algorithms achieve high parallelism and significantly outperform their sequential counterparts.

Cited By

  • (2024) Analysis and Construction of Hardware Accelerators for Calculating the Shortest Path in Real-Time Robot Route Planning. Electronics, 13(11):2167. https://doi.org/10.3390/electronics13112167. Online publication date: 2 June 2024.

Published In

SPAA '22: Proceedings of the 34th ACM Symposium on Parallelism in Algorithms and Architectures
July 2022
464 pages
ISBN:9781450391467
DOI:10.1145/3490148
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. activity selection
  2. independence system
  3. longest increasing subsequence
  4. maximal independent set
  5. parallel algorithms
  6. parallel programming
  7. phase-parallel framework
  8. sequential iterative algorithms

Qualifiers

  • Research-article

Funding Sources

  • National Science Foundation

Conference

SPAA '22

Acceptance Rates

Overall Acceptance Rate 447 of 1,461 submissions, 31%
