DOI: 10.1145/3355369.3355585

Opening the Blackbox of VirusTotal: Analyzing Online Phishing Scan Engines

Published: 21 October 2019

    Abstract

    Online scan engines such as VirusTotal are heavily used by researchers to label malicious URLs and files. Unfortunately, it is not well understood how the labels are generated or how reliable the scanning results are. In this paper, we focus on VirusTotal and its 68 third-party vendors to examine their labeling process for phishing URLs. We perform a series of measurements by setting up our own phishing websites (mimicking PayPal and the IRS) and submitting the URLs for scanning. By analyzing the incoming network traffic and the dynamic label changes at VirusTotal, we reveal new insights into how VirusTotal works and the quality of its labels. Among other things, we show that vendors have trouble flagging all phishing sites, and even the best vendors missed 30% of our phishing sites. In addition, scanning results are not immediately updated on VirusTotal after a scan, and the results of VirusTotal scans are inconsistent with those of some vendors' own scanners. Our results reveal the need for more rigorous methodologies to assess and make use of the labels obtained from VirusTotal.
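The per-vendor figures quoted in the abstract (for example, a vendor missing 30% of the phishing sites) are ratios over a set of scanned URLs. A minimal Python sketch of that bookkeeping follows. It is not the authors' code: the function name `vendor_detection_rates`, the report layout (URL mapped to vendor mapped to a boolean flag), the vendor names, and the example URLs are all assumptions for illustration, and the toy labels are made up rather than taken from the paper.

```python
from collections import defaultdict


def vendor_detection_rates(reports):
    """Return, per vendor, the fraction of scanned URLs it flagged.

    `reports` is a hypothetical layout: {url: {vendor: flagged_bool}}.
    Vendors that did not return a label for a URL are simply absent
    from that URL's inner mapping and are not counted for it.
    """
    flagged = defaultdict(int)  # URLs each vendor flagged as phishing
    scanned = defaultdict(int)  # URLs each vendor returned a label for
    for vendor_labels in reports.values():
        for vendor, is_flagged in vendor_labels.items():
            scanned[vendor] += 1
            if is_flagged:
                flagged[vendor] += 1
    return {vendor: flagged[vendor] / scanned[vendor] for vendor in scanned}


# Toy, made-up labels for four phishing URLs known to be malicious.
reports = {
    "http://phish-a.example/": {"VendorA": True, "VendorB": False},
    "http://phish-b.example/": {"VendorA": True, "VendorB": False},
    "http://phish-c.example/": {"VendorA": False, "VendorB": True},
    "http://phish-d.example/": {"VendorA": True, "VendorB": False},
}

rates = vendor_detection_rates(reports)
# Miss rate is the complement of the detection rate on known-phishing URLs.
miss_rates = {vendor: 1.0 - rate for vendor, rate in rates.items()}
```

Because every URL in such an experiment is known to be phishing, each unflagged URL counts as a miss for that vendor, which is why the miss rate is just one minus the detection rate.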

    Supplementary Material

    peng (peng.zip)
    Supplemental movie, appendix, image and software files for Opening the Blackbox of VirusTotal: Analyzing Online Phishing Scan Engines

      Published In

      IMC '19: Proceedings of the Internet Measurement Conference
      October 2019
      497 pages
      ISBN: 9781450369480
      DOI: 10.1145/3355369
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      IMC '19
      IMC '19: ACM Internet Measurement Conference
      October 21 - 23, 2019
      Amsterdam, Netherlands

      Acceptance Rates

      IMC '19 Paper Acceptance Rate 39 of 197 submissions, 20%;
      Overall Acceptance Rate 277 of 1,083 submissions, 26%

      Cited By

      • (2024) Securing the Web. Journal of Information Security and Cybercrimes Research 7:1 (05-28). DOI: 10.26735/UGSQ6620. Online publication date: 2-Jun-2024
      • (2024) Understanding Characteristics of Phishing Reports from Experts and Non-Experts on Twitter. IEICE Transactions on Information and Systems E107.D:7 (807-824). DOI: 10.1587/transinf.2023EDP7221. Online publication date: 1-Jul-2024
      • (2024) A Large Scale Study and Classification of VirusTotal Reports on Phishing and Malware URLs. ACM SIGMETRICS Performance Evaluation Review 52:1 (55-56). DOI: 10.1145/3673660.3655042. Online publication date: 13-Jun-2024
      • (2024) A Large Scale Study and Classification of VirusTotal Reports on Phishing and Malware URLs. Abstracts of the 2024 ACM SIGMETRICS/IFIP PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems (55-56). DOI: 10.1145/3652963.3655042. Online publication date: 10-Jun-2024
      • (2024) TIPCE: A Longitudinal Threat Intelligence Platform Comprehensiveness Analysis. Proceedings of the Fourteenth ACM Conference on Data and Application Security and Privacy (349-360). DOI: 10.1145/3626232.3653278. Online publication date: 19-Jun-2024
      • (2024) VeriSMS: A Message Verification System for Inclusive Patient Outreach against Phishing Attacks. Proceedings of the CHI Conference on Human Factors in Computing Systems (1-17). DOI: 10.1145/3613904.3642027. Online publication date: 11-May-2024
      • (2024) VT-SOS: A Cost-effective URL Warning utilizing VirusTotal as a Second Opinion Service. NOMS 2024-2024 IEEE Network Operations and Management Symposium (1-5). DOI: 10.1109/NOMS59830.2024.10575506. Online publication date: 6-May-2024
      • (2024) An Exploration of shared code execution for malware analysis. 2024 International Conference on Artificial Intelligence, Computer, Data Sciences and Applications (ACDSA) (1-9). DOI: 10.1109/ACDSA59508.2024.10467679. Online publication date: 1-Feb-2024
      • (2024) C2-Eye: framework for detecting command and control (C2) connection of supply chain attacks. International Journal of Information Security 23:4 (2531-2545). DOI: 10.1007/s10207-024-00850-y. Online publication date: 29-Apr-2024
      • (2024) Crawling to the Top: An Empirical Evaluation of Top List Use. Passive and Active Measurement (277-306). DOI: 10.1007/978-3-031-56249-5_12. Online publication date: 20-Mar-2024
