DOI: 10.1145/3663529.3663824
research-article
Open access

Automated End-to-End Dynamic Taint Analysis for WhatsApp

Published: 10 July 2024

Abstract

Taint analysis tracks data flows through a system, with potential use cases in security, privacy, and performance. This paper describes an end-to-end dynamic taint analysis solution for WhatsApp. We use exploratory UI testing to generate realistic interactions and inputs, which serve as data sources on the clients, and then track data propagation towards sinks on both the client and server sides. Finally, a reporting pipeline localizes tainted flows in the source code, deduplicates them, filters false positives based on production call sites, and files tasks to code owners. Applied to WhatsApp, our approach found 89 flows that were fixed by engineers and caught 50% of all privacy-related flows that required escalation, including instances that would have been difficult to uncover with conventional testing.
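
The abstract names the main stages (sources, propagation, sinks, deduplication, call-site filtering) but this page gives no implementation details. The following is only a minimal single-process sketch of those general ideas, not the authors' system: every name in it (`Tainted`, `source`, `concat`, `Recorder`, `Flow`, `report`) is hypothetical, and the filter assumes that "production call sites" can be represented as a plain set of known call-site strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tainted:
    """A value plus the set of source labels it was derived from (hypothetical)."""
    value: str
    labels: frozenset


@dataclass(frozen=True)
class Flow:
    """One observed source-to-sink flow (hypothetical record shape)."""
    label: str
    sink: str
    call_site: str


def source(value, label):
    """Mark a value as originating from a labeled data source."""
    return Tainted(value, frozenset({label}))


def concat(*parts):
    """Propagation: the result carries the union of the labels of all inputs."""
    text = "".join(p.value if isinstance(p, Tainted) else str(p) for p in parts)
    labels = frozenset().union(*(p.labels for p in parts if isinstance(p, Tainted)))
    return Tainted(text, labels) if labels else text


class Recorder:
    """Collects a Flow every time tainted data reaches a sink."""

    def __init__(self):
        self.flows = []

    def sink(self, name, data, call_site):
        if isinstance(data, Tainted):
            for label in sorted(data.labels):
                self.flows.append(Flow(label, name, call_site))


def report(flows, production_call_sites):
    """Keep flows at known production call sites (assumption) and drop duplicates."""
    seen = set()
    kept = []
    for flow in flows:
        if flow.call_site not in production_call_sites or flow in seen:
            continue
        seen.add(flow)
        kept.append(flow)
    return kept
```

In this sketch, a value created with `source("...", "phone_number")` and passed through `concat` keeps its label, so a later `Recorder.sink` call records a `Flow`; `report` then discards repeated flows and flows seen only at call sites outside the given set. A real end-to-end system would additionally have to carry labels across the client-server boundary, which this example does not attempt.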

Published In

FSE 2024: Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering
July 2024
715 pages
ISBN: 9798400706585
DOI: 10.1145/3663529
This work is licensed under a Creative Commons Attribution-NoDerivs International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Taint analysis
  2. simulation
  3. testing

Qualifiers

  • Research-article

Conference

FSE '24

Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%
