DOI: 10.1145/3324884.3416582
research-article

Automated third-party library detection for Android applications: are we there yet?

Published: 27 January 2021

    Abstract

    Third-party libraries (TPLs) have become a significant part of the Android ecosystem. Developers can employ TPLs with a wide range of functionality to facilitate app development. Unfortunately, the popularity of TPLs also brings new challenges and even threats. TPLs may carry malicious or vulnerable code, which can spread into popular apps and threaten mobile users. In addition, TPL code can act as noise in some downstream tasks (e.g., malware detection and repackaged app detection). Researchers have therefore developed various tools to identify TPLs. However, no existing work has studied these TPL detection tools in detail; the tools target different applications and differ in performance, yet little is known about how they compare.
    To better understand existing TPL detection tools and dissect TPL detection techniques, we conduct a comprehensive empirical study that evaluates and compares all publicly available TPL detection tools on four criteria: effectiveness, efficiency, resilience to code obfuscation, and ease of use. This systematic evaluation reveals the advantages and disadvantages of each tool. We also conduct a user study to evaluate the usability of each tool. The results show that LibScout outperforms the others in effectiveness, LibRadar takes less time than the others and is also regarded as the easiest to use, and LibPecker is the most resilient to code obfuscation. We further summarize the lessons learned from the perspectives of users, tool implementation, and researchers. In addition, we enhance these open-source tools by fixing their limitations to improve their detection ability. We also build an extensible framework that integrates all available TPL detection tools and provides an online service for the research community. We make the evaluation dataset and the enhanced tools publicly available. We believe our work provides a clear picture of existing TPL detection techniques and gives a roadmap for future directions.
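    The abstract does not state how effectiveness is scored. A common choice for detection tasks, assumed here purely for illustration, is precision, recall, and F1, computed by comparing the libraries a tool reports for an app against a ground-truth list. The function name and library sets in this sketch are hypothetical and are not taken from the study.

```python
def score_detection(detected, ground_truth):
    """Return (precision, recall, f1) for one app's reported library set.

    Assumption: each library is identified by a plain string name, and a
    reported library counts as correct only if it appears in ground_truth.
    """
    detected, ground_truth = set(detected), set(ground_truth)
    true_positives = len(detected & ground_truth)

    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0

    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical app: four libraries are really present, the tool reports three.
truth = {"okhttp", "gson", "glide", "retrofit"}
reported = {"okhttp", "gson", "picasso"}

precision, recall, f1 = score_detection(reported, truth)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

    Under this assumption, a tool that reports few libraries can reach high precision with low recall, which is why a combined measure such as F1 is often reported alongside the two.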

        Published In

        ASE '20: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering
        December 2020
        1449 pages
        ISBN:9781450367684
        DOI:10.1145/3324884
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

        Sponsors

        In-Cooperation

        • IEEE CS

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. Android
        2. empirical study
        3. library detection
        4. third-party library

        Qualifiers

        • Research-article

        Conference

        ASE '20

        Acceptance Rates

        Overall Acceptance Rate 82 of 337 submissions, 24%

        Citations

        Cited By

        • (2024) Global Prosperity or Local Monopoly? Understanding the Geography of App Popularity. Proceedings of the 21st International Conference on Mining Software Repositories, 322-334. DOI: 10.1145/3643991.3644935. Online publication date: 15-Apr-2024
        • (2024) AndroLibZoo: A Reliable Dataset of Libraries Based on Software Dependency Analysis. Proceedings of the 21st International Conference on Mining Software Repositories, 32-36. DOI: 10.1145/3643991.3644866. Online publication date: 15-Apr-2024
        • (2024) A Comprehensive Study of Learning-based Android Malware Detectors under Challenging Environments. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, 1-13. DOI: 10.1145/3597503.3623320. Online publication date: 20-May-2024
        • (2024) Android malware detection method based on graph attention networks and deep fusion of multimodal features. Expert Systems with Applications: An International Journal 237:PC. DOI: 10.1016/j.eswa.2023.121617. Online publication date: 1-Mar-2024
        • (2023) LibAM: An Area Matching Framework for Detecting Third-Party Libraries in Binaries. ACM Transactions on Software Engineering and Methodology 33:2, 1-35. DOI: 10.1145/3625294. Online publication date: 23-Dec-2023
        • (2023) Software Composition Analysis for Vulnerability Detection: An Empirical Study on Java Projects. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 960-972. DOI: 10.1145/3611643.3616299. Online publication date: 30-Nov-2023
        • (2023) Third-Party Library Dependency for Large-Scale SCA in the C/C++ Ecosystem: How Far Are We? Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, 1383-1395. DOI: 10.1145/3597926.3598143. Online publication date: 12-Jul-2023
        • (2023) Precise and Efficient Patch Presence Test for Android Applications against Code Obfuscation. Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, 347-359. DOI: 10.1145/3597926.3598061. Online publication date: 12-Jul-2023
        • (2023) Are Mobile Advertisements in Compliance with App’s Age Group? Proceedings of the ACM Web Conference 2023, 3132-3141. DOI: 10.1145/3543507.3583534. Online publication date: 30-Apr-2023
        • (2023) Scalably Detecting Third-Party Android Libraries With Two-Stage Bloom Filtering. IEEE Transactions on Software Engineering 49:4, 2272-2284. DOI: 10.1109/TSE.2022.3215628. Online publication date: 1-Apr-2023
