Implementing Arbitrary/Common Concurrent Writes of CRCW PRAM

Authors:

Wael R Elwasif,

David E Bernholdt

ICPP Workshops '21: 50th International Conference on Parallel Processing Workshop

Article No.: 37, Pages 1 - 9

https://doi.org/10.1145/3458744.3474041

Published: 23 September 2021

Abstract

The Parallel Random Access Machine (PRAM) abstraction is the simplest and most elegant algorithmic model for the design and analysis of parallel algorithms. It comprises several models, categorized by the memory access mode they permit, the most powerful of which is the Concurrent Read Concurrent Write (CRCW) model. A PRAM algorithm is described as a series of rounds, each consisting of a collection of operations that can be executed concurrently within the same time step. However, the lack of support for concurrent memory accesses and the prevalence of asynchronous programming models have led to the belief that implementing CRCW PRAM algorithms is unattainable, and have prompted many to avoid this model except in theoretical studies of optimal performance.

In this work, we study arbitrary and common concurrent writes in the CRCW PRAM model and explore the challenges of implementing them on general-purpose systems. We examine current practices for implementing common and arbitrary concurrent writes and propose a new efficient, lightweight, and thread-safe method that implements concurrent writes by leveraging atomic instructions. To demonstrate the efficacy of our method, we developed OpenMP kernels for classical CRCW PRAM algorithms and report experimental results and run-time comparisons measured on an x86 multicore architecture. Across all our benchmarks, our method achieves a speedup of up to 4.5x over current practices.
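The abstract does not spell out the proposed method, so the following is only a minimal sketch of one well-known way to emulate an arbitrary concurrent write with an atomic compare-and-swap. It is not presented as the authors' implementation. The names `crcw_cell`, `crcw_write_arbitrary`, and `find_any_index` are hypothetical, and the sketch assumes a compiler that accepts C11 `<stdatomic.h>` operations inside an OpenMP parallel region (common in practice, though the OpenMP specification does not guarantee it). Each cell carries a round tag; the first writer to advance the tag to the current round wins, and all other writers in that round back off.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical shared cell that accepts at most one write per PRAM round. */
typedef struct {
    atomic_int last_round; /* round in which the cell was last claimed */
    int value;             /* payload; stored only by the winning writer */
} crcw_cell;

void crcw_cell_init(crcw_cell *c) {
    atomic_init(&c->last_round, 0); /* rounds are numbered from 1 */
    c->value = 0;
}

/* Attempt an ARBITRARY concurrent write during `round`.
 * Returns 1 if this caller won the cell, 0 otherwise. */
int crcw_write_arbitrary(crcw_cell *c, int round, int v) {
    assert(round > 0);
    int seen = atomic_load(&c->last_round);
    if (seen >= round) {
        return 0; /* another writer already claimed this round */
    }
    /* Exactly one compare-and-swap from an older tag to `round` can succeed. */
    if (atomic_compare_exchange_strong(&c->last_round, &seen, round)) {
        c->value = v; /* only the winner reaches this store */
        return 1;
    }
    return 0;
}

/* One example round: every iteration whose element equals `key` tries to
 * write its index into a single cell. Returns any matching index, or -1.
 * Without OpenMP enabled the pragma is ignored and the loop runs serially. */
int find_any_index(const int *a, int n, int key) {
    crcw_cell hit;
    crcw_cell_init(&hit);
    hit.value = -1;
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        if (a[i] == key) {
            crcw_write_arbitrary(&hit, 1, i);
        }
    }
    /* The implicit barrier at the end of the loop orders the winner's
     * store before this read. */
    return hit.value;
}
```

Calling `find_any_index(a, n, key)` on an array with several matches may return any one of the matching indices, which is exactly the freedom the arbitrary-write rule allows. A common write could use the same claim step, since all writers are required to supply the same value.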

Information

Published In

ICPP Workshops '21: 50th International Conference on Parallel Processing Workshop

August 2021

314 pages

ISBN:9781450384414

DOI:10.1145/3458744

Copyright © 2021 ACM.

Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

U.S. Department of Energy

Conference

ICPP 2021

ICPP 2021: 50th International Conference on Parallel Processing

August 9 - 12, 2021

Lemont, IL, USA

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%
