DOI: 10.1145/3475738.3480939

Low-overhead multi-language dynamic taint analysis on managed runtimes through speculative optimization

Published: 29 September 2021

    Abstract

    Dynamic taint analysis (DTA) is a popular program analysis technique with applications to diverse fields such as software vulnerability detection and reverse engineering. It consists of marking sensitive data as tainted and tracking its propagation at runtime. While DTA has been implemented on top of many different analysis platforms, these implementations generally incur significant slowdown from taint propagation. Since a purely dynamic analysis cannot predict which instructions will operate on tainted values at runtime, programs have to be fully instrumented for taint propagation even when they never actually observe tainted values. We propose leveraging speculative optimizations to reduce the peak-performance slowdown of programs instrumented for DTA on a managed runtime capable of dynamic compilation.
    In this paper, we investigate how speculative optimizations can reduce the peak performance impact of taint propagation on programs executed on a managed runtime. We also explain how a managed runtime can implement DTA to be amenable to such optimizations. We implemented our ideas in TruffleTaint, a DTA platform which supports both dynamic languages like JavaScript and languages such as C and C++, which are typically compiled statically. We evaluated TruffleTaint on several benchmarks from the popular Computer Language Benchmarks Game and SPECint 2017 benchmark suites. Our evaluation shows that TruffleTaint is often able to avoid slowdown entirely when programs do not operate on tainted data, and that it exhibits an average slowdown of ∼2.10x and a maximum of ∼5.52x when they do, which is comparable to state-of-the-art taint analysis platforms optimized for performance.
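
    The general idea described in the abstract can be illustrated with a minimal sketch. It is not TruffleTaint's implementation: the names `Tainted`, `add_generic`, and `SpeculativeAdd` are hypothetical, and plain Python stands in for a managed runtime with a dynamic compiler. An operation speculates that its operands are never tainted and takes a fast path without taint bookkeeping. The first tainted operand invalidates that assumption, after which the operation permanently falls back to full taint propagation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    """A value paired with a taint flag (hypothetical representation)."""
    value: int
    tainted: bool = False


def add_generic(a: Tainted, b: Tainted) -> Tainted:
    # Fully instrumented addition: the result is tainted
    # if either operand is tainted.
    return Tainted(a.value + b.value, a.tainted or b.tainted)


class SpeculativeAdd:
    """Addition that assumes untainted operands until proven wrong."""

    def __init__(self) -> None:
        self.assume_untainted = True  # the speculation
        self.deopt_count = 0          # how often the speculation failed

    def execute(self, a: Tainted, b: Tainted) -> Tainted:
        if self.assume_untainted:
            if not (a.tainted or b.tainted):
                # Fast path: no taint propagation is performed.
                return Tainted(a.value + b.value)
            # Speculation failed: give it up for good and fall back.
            self.assume_untainted = False
            self.deopt_count += 1
        return add_generic(a, b)


node = SpeculativeAdd()
clean = node.execute(Tainted(1), Tainted(2))         # stays on the fast path
dirty = node.execute(Tainted(1, True), Tainted(2))   # triggers the fallback
```

    In this sketch the guard is an explicit check on every call. A compiling runtime could presumably make the untainted case cheaper than that, but how it does so is outside what this example shows.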

    Published In

    MPLR 2021: Proceedings of the 18th ACM SIGPLAN International Conference on Managed Programming Languages and Runtimes
    September 2021
    135 pages
    ISBN:9781450386753
    DOI:10.1145/3475738
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. C/C++
    2. Dynamic Taint Analysis
    3. GraalVM
    4. JavaScript
    5. Performance

    Qualifiers

    • Research-article

    Conference

    MPLR '21