DOI: 10.1145/3546918.3546927
research-article
Open access

Dynamic Taint Analysis with Label-Defined Semantics

Published: 30 November 2022

    Abstract

    Dynamic taint analysis is a popular analysis technique which tracks the propagation of specific values while a program executes. To this end, a taint label is attached to these values and is dynamically propagated to any values derived from them. Frequent application of this analysis technique in many fields has led to the development of general-purpose analysis platforms with taint propagation capabilities. However, these platforms generally limit analysis developers to a specific implementation language, to specific propagation semantics or to specific taint label representations.
    In this paper we present label-defined dynamic taint analysis, a language-agnostic approach for specifying the properties of a dynamic taint analysis in terms of propagated taint labels. This approach enables analysis platforms to support arbitrary adaptations to these properties by delegating propagation decisions to propagated taint labels and thus to provide more flexibility to analysis developers than other analysis platforms. We implemented this approach in TruffleTaint, a GraalVM-based taint analysis platform, and integrated it with GraalVM’s language interoperability and tooling support. We further integrated our approach with GraalVM’s performance optimizations. Our performance evaluation shows that label-defined taint analysis can reach peak performance similar to that of equivalent engine-integrated taint analyses. In addition to supporting the convenient reimplementation of existing dynamic taint analyses, our approach enables new capabilities for these analyses. It also enabled us to implement a novel tooling infrastructure for analysis developers as well as tooling support for end users.
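The core idea, that the engine computes a result and then lets the operand labels decide what taint the result carries, can be illustrated with a minimal sketch. Everything below is hypothetical and written only for illustration: the names `Label`, `SourceSet`, `ArithmeticOnly`, `Tainted`, and `apply` are assumptions and are not TruffleTaint's API, and the rule that the left operand's label decides first is an arbitrary simplification.

```python
from dataclasses import dataclass
from typing import Optional


class Label:
    """Hypothetical base taint label: the label decides how it propagates."""

    def propagate(self, op: str, other: Optional["Label"]) -> Optional["Label"]:
        # Default semantics (assumption): any derived value stays tainted.
        return self


@dataclass(frozen=True)
class SourceSet(Label):
    """Label that records the named sources a value was derived from."""
    sources: frozenset

    def propagate(self, op, other):
        # Merge source sets when both operands carry this kind of label.
        if isinstance(other, SourceSet):
            return SourceSet(self.sources | other.sources)
        return self


class ArithmeticOnly(Label):
    """Label that survives addition but is dropped by every other operation."""

    def propagate(self, op, other):
        return self if op == "add" else None


@dataclass(frozen=True)
class Tainted:
    """A program value paired with an optional taint label."""
    value: object
    label: Optional[Label] = None


# The only operations this toy engine knows about.
OPS = {"add": lambda x, y: x + y, "lt": lambda x, y: x < y}


def apply(op: str, a: Tainted, b: Tainted) -> Tainted:
    """Toy engine hook: compute the result, then delegate to the labels."""
    result = OPS[op](a.value, b.value)
    if a.label is not None:
        # Simplification: the left operand's label decides first.
        label = a.label.propagate(op, b.label)
    elif b.label is not None:
        label = b.label.propagate(op, None)
    else:
        label = None
    return Tainted(result, label)


x = Tainted(2, SourceSet(frozenset({"stdin"})))
y = Tainted(3, SourceSet(frozenset({"net"})))
total = apply("add", x, y)      # label now names both sources
z = Tainted(1, ArithmeticOnly())
kept = apply("add", z, Tainted(1))     # label survives addition
dropped = apply("lt", z, Tainted(5))   # label is dropped by comparison
```

In this sketch, changing the propagation semantics or the label representation means writing a new `Label` subclass; the `apply` function that stands in for the engine is left untouched.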

    Published In

    MPLR '22: Proceedings of the 19th International Conference on Managed Programming Languages and Runtimes
    September 2022
    161 pages
    ISBN:9781450396967
    DOI:10.1145/3546918
    This work is licensed under a Creative Commons Attribution-ShareAlike International 4.0 License.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Dynamic Taint Analysis
    2. GraalVM
    3. TruffleTaint

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    MPLR '22
