DOI: 10.1145/3332466.3374507
Poster

Optimizing GPU programs by partial evaluation

Published: 19 February 2020

Abstract

Although GPUs can speed up computations by orders of magnitude, memory management remains a bottleneck that often makes it hard to achieve the desired performance, so a range of memory optimizations is used to make memory accesses more efficient. We propose an approach that automates memory management by means of partial evaluation, a program transformation technique that allows data accesses to be pre-computed, optimized, and embedded into the code, saving memory transactions. An empirical evaluation of our approach shows that the transformed program can be up to 8 times as efficient as the original one in the case of a naïve string pattern matching algorithm implemented in CUDA C.
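
To make the idea concrete, here is a minimal host-side C sketch of what specializing a naïve matcher on a known pattern might look like. It is not the authors' tool or its output: the function names, the pattern "GPU", and the hand-written residual function are all hypothetical and serve only as illustration. The generic version loads each pattern byte from memory on every comparison; the specialized version has the pattern bytes embedded as constants, which is the kind of saved memory access the abstract refers to.

```c
#include <assert.h>
#include <stddef.h>

/* Generic naive matcher: the pattern is a run-time input, so every
   comparison reads pattern[j] from memory. */
size_t count_generic(const unsigned char *text, size_t n,
                     const unsigned char *pattern, size_t m)
{
    size_t hits = 0;
    if (m == 0 || m > n)
        return 0;
    for (size_t i = 0; i + m <= n; ++i) {
        size_t j = 0;
        while (j < m && text[i + j] == pattern[j])
            ++j;
        if (j == m)
            ++hits;
    }
    return hits;
}

/* Hand-written residual program for the hypothetical static pattern "GPU".
   The inner loop is unrolled and the pattern bytes appear as immediate
   constants, so no pattern loads remain. A partial evaluator would derive
   a function of this shape automatically; this one is written by hand. */
size_t count_specialized_gpu(const unsigned char *text, size_t n)
{
    size_t hits = 0;
    if (n < 3)
        return 0;
    for (size_t i = 0; i + 3 <= n; ++i) {
        if (text[i] == 'G' && text[i + 1] == 'P' && text[i + 2] == 'U')
            ++hits;
    }
    return hits;
}

/* Sanity check: both versions must agree on the same input. */
void specialization_self_check(void)
{
    const unsigned char text[] = "GPU and CPU, GPGPU";
    const unsigned char pat[] = { 'G', 'P', 'U' };
    size_t n = sizeof text - 1; /* exclude the trailing NUL */
    assert(count_generic(text, n, pat, 3) == count_specialized_gpu(text, n));
}
```

In a CUDA C kernel the same contrast would apply per thread: the generic kernel would read the pattern from device memory, while a specialized kernel would carry it in the instruction stream. How much that helps depends on the hardware and the access pattern; this sketch makes no performance claim of its own.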

Cited By

  • (2024) ClangOz: Parallel constant evaluation of C++ map and reduce operations. Journal of Computer Languages 81 (101298). DOI: 10.1016/j.cola.2024.101298. Online publication date: Nov-2024
  • (2024) Performance Analysis of Compiler Support for Parallel Evaluation of C++ Constant Expressions. Software, System, and Service Engineering (129-152). DOI: 10.1007/978-3-031-51075-5_6. Online publication date: 3-Jan-2024
  • (2021) The JaSpe specializer: BT-objects and the interprocedural aspect of the binding-time analysis algorithm. Program Systems: Theory and Applications (Программные системы: теория и приложения) 12:4 (3-32). DOI: 10.25209/2079-3316-2021-12-4-3-32. Online publication date: 2021

Published In

PPoPP '20: Proceedings of the 25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
February 2020
454 pages
ISBN:9781450368186
DOI:10.1145/3332466
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. CUDA
  2. GPU
  3. partial evaluation

Qualifiers

  • Poster

Funding Sources

  • Russian Foundation for Basic Research (RFBR)

Conference

PPoPP '20

Acceptance Rates

PPoPP '20 Paper Acceptance Rate 28 of 121 submissions, 23%;
Overall Acceptance Rate 230 of 1,014 submissions, 23%

Article Metrics

  • Downloads (Last 12 months): 13
  • Downloads (Last 6 weeks): 0

Reflects downloads up to 16 Oct 2024
