OMPify: Automated Conversion from Serial to Shared-Memory Parallelization

The full OMPify paper can be found here.

There is an ever-present need for shared-memory parallelization schemes to exploit the full potential of multi-core architectures. The most common parallelization API addressing this need today is OpenMP. Nevertheless, writing parallel code manually is complex and effort-intensive. Thus, many deterministic source-to-source (S2S) compilers have emerged, intending to automate the process of translating serial code to parallel code. However, recent studies have shown that these compilers are impractical in many scenarios. In this work, we combine the latest advancements in AI and natural language processing (NLP) with the vast amount of open-source code to address the problem of automatic parallelization. Specifically, we propose a novel approach, called OMPify, to detect and predict the OpenMP pragmas and shared-memory attributes in parallel code, given its serial version. OMPify is based on a Transformer model that leverages a graph-based representation of source code to exploit the inherent structure of code. We evaluated our tool by predicting the parallelization pragmas and attributes of a large corpus (Open-OMP-Plus) of over 54,000 snippets of serial code written in C and C++. Our results demonstrate that OMPify outperforms existing approaches, namely the popular general-purpose ChatGPT and the targeted PragFormer models, in terms of F1 score and accuracy. Specifically, OMPify achieves up to 90% accuracy on commonly used OpenMP benchmark suites such as NAS, SPEC, and PolyBench. Additionally, we performed an ablation study to assess the impact of different model components and present insights derived from the study. Lastly, we explored the potential of data augmentation and curriculum learning techniques to improve the model's robustness and generalization capabilities.
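To make the task concrete, here is a minimal, hand-written sketch of the kind of serial-to-parallel mapping described above. It is not taken from the paper, the dataset, or OMPify's output. The function names are hypothetical, and the `private` and `reduction` clauses are used here only as familiar examples of shared-memory attributes.

```c
/* Illustrative only: hand-written examples, not OMPify output. */

/* Serial input: a dot product with a loop-carried accumulation. */
double dot_serial(const double *a, const double *b, int n) {
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* Parallel counterpart: the loop needs an OpenMP pragma, and `sum`
 * must be combined across threads, hence the reduction clause. */
double dot_parallel(const double *a, const double *b, int n) {
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    return sum;
}

/* A loop with a scratch variable declared outside the loop body:
 * each thread needs its own copy, hence the private clause. */
void scale_shift(double *out, const double *in, int n, double s) {
    double tmp;
    #pragma omp parallel for private(tmp)
    for (int i = 0; i < n; i++) {
        tmp = in[i] * s;
        out[i] = tmp + 1.0;
    }
}
```

Compiled with an OpenMP-enabled compiler (for example `gcc -fopenmp`), the annotated loops run in parallel. Without OpenMP support the pragmas are ignored and the functions run serially, producing the same results.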

In this repository, you will find the dataset and source code required to reproduce the results we obtained.
