research-article
Open access

Guided linking: dynamic linking without the costs

Published: 13 November 2020

Abstract

Dynamic linking is extremely common in modern software systems, thanks to the flexibility and space savings it offers. However, this flexibility comes at a cost: it’s impossible to perform interprocedural optimizations that involve calls to a dynamic library. The basic problem is that the run-time behavior of the dynamic linker can’t be predicted at compile time, so the compiler can make no assumptions about how such calls will behave.
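The unpredictability described above can be illustrated with a toy model. The sketch below is not how any real dynamic linker is implemented; it only assumes that a symbol resolves to the first definition found in load order, and the library names (`libmath`, `liboverride`) and the `scale` function are hypothetical. It shows that the same call site can reach different definitions depending on what is loaded at run time, in a way similar in spirit to preloading a library with `LD_PRELOAD`.

```python
# Toy model of dynamic symbol resolution. This is an illustrative
# assumption, not the behavior of any specific dynamic linker.
from typing import Callable, Dict, List

Library = Dict[str, Callable[[int], int]]


def resolve(symbol: str, load_order: List[Library]) -> Callable[[int], int]:
    """Return the first definition of `symbol` found in load order."""
    for library in load_order:
        if symbol in library:
            return library[symbol]
    raise LookupError(f"undefined symbol: {symbol}")


# Hypothetical library that defines `scale`.
libmath: Library = {"scale": lambda x: x * 2}

# Hypothetical library loaded ahead of libmath that overrides `scale`
# with a different definition.
liboverride: Library = {"scale": lambda x: x * 10}


def call_scale(load_order: List[Library], x: int) -> int:
    """A caller in another module; the callee is only known at run time."""
    return resolve("scale", load_order)(x)


normal_result = call_scale([libmath], 3)
overridden_result = call_scale([liboverride, libmath], 3)
```

Because `call_scale` cannot know which definition it will reach, a compiler that inlined the `libmath` body into it would change the program's behavior in the second case.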
This paper introduces guided linking, a technique for optimizing dynamically linked software when some information about the dynamic linker’s behavior is known in advance. The developer provides an arbitrary set of programs, libraries, and plugins to our tool, along with constraints that limit the possible dynamic linking behavior of the software. By taking advantage of the constraints, our tool enables any existing optimization to be applied across dynamic linking boundaries. For example, the NoOverride constraint can be applied to a function when the developer knows it will never be overridden with a different definition at run time; guided linking then enables the function to be inlined into its callers in other libraries. We also introduce a novel code size optimization that deduplicates identical functions even across different parts of the software set.
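As a rough sketch of how a constraint such as NoOverride could gate cross-library inlining, the following toy model lists the calls that become inlining candidates. Only the constraint name comes from the abstract; the `Module` data model, the `inlinable_calls` helper, and the module names are hypothetical, and the tool's actual input format and analysis may differ.

```python
# Toy model: decide which cross-library calls could be inlined once the
# developer asserts NoOverride for some symbols. All names other than
# "NoOverride" are hypothetical and chosen for illustration.
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Module:
    name: str
    defines: Set[str]
    calls: Set[str] = field(default_factory=set)


def inlinable_calls(
    modules: List[Module], no_override: Set[str]
) -> List[Tuple[str, str, str]]:
    """Return (caller, symbol, definer) for calls that cross a module
    boundary and whose target is unique and marked NoOverride."""
    candidates = []
    for caller in modules:
        for symbol in sorted(caller.calls):
            definers = [m.name for m in modules if symbol in m.defines]
            unique_target = len(definers) == 1
            crosses_boundary = unique_target and definers[0] != caller.name
            if symbol in no_override and crosses_boundary:
                candidates.append((caller.name, symbol, definers[0]))
    return candidates


modules = [
    Module("app", defines={"main"}, calls={"parse", "log"}),
    Module("libparse", defines={"parse"}, calls={"log"}),
    Module("liblog", defines={"log"}),
]

# The developer asserts that `parse` is never overridden at run time,
# but makes no such promise about `log`.
candidates = inlinable_calls(modules, no_override={"parse"})
```

In this model the call from `app` to `parse` becomes a candidate, while both calls to `log` stay opaque because nothing rules out a different definition appearing at run time.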
By applying guided linking to the Python interpreter and its dynamically loaded modules, supplying the constraint that no other programs or modules will be used, we increase speed by an average of 9%. By applying guided linking to a dynamically linked distribution of Clang and LLVM, and using the constraint that no other software will use the LLVM libraries, we can increase speed by 5% and reduce file size by 13%. If we relax the constraint to allow other software to use the LLVM libraries, we can still increase speed by 5% and reduce file size by 5%. If we use guided linking to combine 11 different versions of the Boost library, using minimal constraints, we can reduce the total library size by 57%.
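The size reductions reported for combined libraries depend on storing identical functions only once. The sketch below illustrates that idea under a strong simplifying assumption: functions are treated as plain text and compared by a SHA-256 hash of their bodies. This is an illustrative stand-in, not the paper's mechanism, and the library names and function bodies are hypothetical.

```python
# Toy model of cross-library function deduplication. Comparing functions
# by a hash of their body text is an assumption made for illustration.
import hashlib
from typing import Dict, Tuple

Libraries = Dict[str, Dict[str, str]]  # library -> {function -> body}


def deduplicate(
    libraries: Libraries,
) -> Tuple[Dict[str, str], Dict[Tuple[str, str], str]]:
    """Store each distinct body once and map every function to it."""
    shared: Dict[str, str] = {}              # digest -> body, stored once
    index: Dict[Tuple[str, str], str] = {}   # (library, function) -> digest
    for library, functions in libraries.items():
        for name, body in functions.items():
            digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
            shared.setdefault(digest, body)
            index[(library, name)] = digest
    return shared, index


# Two hypothetical versions of a library; `add` is identical in both,
# while `double` changed between versions.
libraries: Libraries = {
    "libfoo_v1": {"add": "return a + b;", "double": "return a * 2;"},
    "libfoo_v2": {"add": "return a + b;", "double": "return a << 1;"},
}

shared, index = deduplicate(libraries)
size_before = sum(len(b) for fns in libraries.values() for b in fns.values())
size_after = sum(len(body) for body in shared.values())
```

Here both versions of `add` map to the same stored body, so the combined set is smaller than the sum of its parts while every original function remains reachable through `index`.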

Supplementary Material

Auxiliary Presentation Video (oopsla20main-p80-p-video.mp4)
This is the presentation video of our talk at OOPSLA 2020 on Guided Linking. Guided Linking is a technique for optimizing dynamically linked software when some information about the dynamic linker’s behavior is known in advance. It enables any existing optimization to be applied across dynamic linking boundaries. We also introduce a novel code size optimization that deduplicates identical functions even across different parts of the software set. By applying guided linking to the Python interpreter and its dynamically loaded modules, we increase speed by an average of 9%. By applying guided linking to Clang and LLVM, we can increase speed by 5% and reduce file size by 13%. If we use guided linking to combine 11 different versions of the Boost library, we can reduce the total library size by 57%.


Published In

Proceedings of the ACM on Programming Languages, Volume 4, Issue OOPSLA
November 2020
3108 pages
EISSN: 2475-1421
DOI: 10.1145/3436718
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Code deduplication
  2. Dynamic Linking
  3. IR
  4. LLVM
  5. LTO
  6. Link-Time Optimization
  7. Plugins
  8. Shared Libraries


