Abstract
The ever-increasing number of malware families and polymorphic variants creates a pressing need for automatic tools to cluster the collected malware into families and generate behavioral signatures for their detection. Among these, network traffic is a powerful behavioral signature and network signatures are widely used by network administrators. In this paper we present FIRMA, a tool that given a large pool of network traffic obtained by executing unlabeled malware binaries, generates a clustering of the malware binaries into families and a set of network signatures for each family. Compared with prior tools, FIRMA produces network signatures for each of the network behaviors of a family, regardless of the type of traffic the malware uses (e.g., HTTP, IRC, SMTP, TCP, UDP). We have implemented FIRMA and evaluated it on two recent datasets comprising nearly 16,000 unique malware binaries. Our results show that FIRMA’s clustering has very high precision (100% on a labeled dataset) and recall (97.7%). We compare FIRMA’s signatures with manually generated ones, showing that they are as good (often better), while generated in a fraction of the time.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Abouelhoda, M.I., Kurtz, S., Ohlebusch, E.: Replacing suffix trees with enhanced suffix arrays. Journal of Discrete Algorithms 2(1) (2004)
Anubis: Analyzing unknown binaries, http://anubis.iseclab.org/
Bailey, M., Oberheide, J., Andersen, J., Mao, Z.M., Jahanian, F., Nazario, J.: Automated classification and analysis of internet malware. In: Kruegel, C., Lippmann, R., Clark, A. (eds.) RAID 2007. LNCS, vol. 4637, pp. 178–197. Springer, Heidelberg (2007)
Bayer, U., Comparetti, P.M., Hlauschek, C., Kruegel, C., Kirda, E.: Scalable, behavior-based malware clustering. In: NDSS (2009)
Caballero, J., Grier, C., Kreibich, C., Paxson, V.: Measuring pay-per-install: The commoditization of malware distribution. In: Usenixsecurity (2011)
Caballero, J., Johnson, N.M., McCamant, S., Song, D.: Binary code extraction and interface identification for security applications. In: NDSS (2010)
Caballero, J., Yin, H., Liang, Z., Song, D.: Polyglot: Automatic extraction of protocol message format using dynamic binary analysis. In: CCS (2007)
Chvatal, V.: A greedy heuristic for the set-covering problem. Mathematics of Operations Research 4(3) (1979)
Cui, W., Peinado, M., Wang, H.J., Locasto, M.: shieldgen: Automatic data patch generation for unknown vulnerabilities with informed probing, Oakland (2007)
Dreger, H., Feldmann, A., Mai, M., Paxson, V., Sommer, R.: Dynamic application-layer protocol analysis for network intrusion detection. In: Usenixsecurity (2006)
Graziano, M., Leita, C., Balzarotti, D.: Towards network containment in malware analysis systems. In: ACSAC (2012)
Grier, C., et al.: Manufacturing compromise: The emergence of exploit-as-a-service. In: CCS (2012)
Gu, G., Perdisci, R., Zhang, J., Lee, W.: Botminer: Clustering analysis of network traffic for protocol and structure independent botnet detection. In: Usenixsecurity (2008)
Guo, F., Ferrie, P., Chiueh, T.-C.: A study of the packer problem and its solutions. In: Lippmann, R., Kirda, E., Trachtenberg, A. (eds.) RAID 2008. LNCS, vol. 5230, pp. 98–115. Springer, Heidelberg (2008)
Haffner, P., Sen, S., Spatscheck, O., Wang, D.: acas: Automated construction of application signatures. In: Minenet (2005)
Jang, J., Brumley, D., Venkataraman, S.: Bitshred: Feature hashing malware for scalable triage and semantic analysis. In: CCS (2011)
John, J.P., Moshchuk, A., Gribble, S.D., Krishnamurthy, A.: Studying spamming botnets using botlab. In: NSDI (2009)
Kim, H.-A., Karp, B.: Autograph: Toward automated, distributed worm signature detection. In: Usenixsecurity (2004)
Kirda, E., Kruegel, C., Banks, G., Vigna, G., Kemmerer, R.A.: Behavior-based spyware detection. In: Usenixsecurity (2006)
Kreibich, C., Crowcroft, J.: Honeycomb - creating intrusion detection signatures using honeypots. In: Hotnets (2003)
Kreibich, C., Weaver, N., Kanich, C., Cui, W., Paxson, V.: gq: Practical containment for measuring modern malware systems. In: IMC (2011)
Li, Z., Sanghi, M., Chavez, B., Chen, Y., Kao, M.-Y.: Hamsa: Fast signature generation for zero-day polymorphic worms with provable attack resilience, Oakland (2006)
The malicia project, http://malicia-project.com/ .
Nappa, A., Rafique, M.Z., Caballero, J.: Driving in the cloud: An analysis of drive-by download operations and abuse reporting. In: Rieck, K., Stewin, P., Seifert, J.-P. (eds.) DIMVA 2013. LNCS, vol. 7967, pp. 1–20. Springer, Heidelberg (2013)
Newsome, J., Karp, B., Song, D.: Polygraph: Automatically generating signatures for polymorphic worms, Oakland (2005)
Perdisci, R., Lee, W., Feamster, N.: Behavioral clustering of http-based malware and signature generation using malicious network traces. In: NSDI (2010)
Perdisci, R., Vamo, M.U.: Towards a fully automated malware clustering validity analysis. In: ACSAC (2012)
Rieck, K., Holz, T., Willems, C., Düssel, P., Laskov, P.: Learning and classification of malware behavior. In: Zamboni, D. (ed.) DIMVA 2008. LNCS, vol. 5137, pp. 108–125. Springer, Heidelberg (2008)
Rieck, K., Schwenk, G., Limmer, T., Holz, T., Laskov, P.: Botzilla: Detecting the phoning home of malicious software. In: ACM Symposium on Applied Computing (2010)
Rossow, C., Dietrich, C.J.: proVeX: Detecting botnets with encrypted command and control channels. In: Rieck, K., Stewin, P., Seifert, J.-P. (eds.) DIMVA 2013. LNCS, vol. 7967, pp. 21–40. Springer, Heidelberg (2013)
Rossow, C., Dietrich, C.J., Bos, H., Cavallaro, L., van Steen, M., Freiling, F.C., Pohlmann, N.: Sandnet: Network traffic analysis of malicious software. In: Badgers (2011)
Singh, S., Estan, C., Varghese, G., Savage, S.: Automated worm fingerprinting. In: Osdi (2004)
Snort, http://www.snort.org/ .
Suricata, http://suricata-ids.org/ .
Tuck, N., Sherwood, T., Calder, B., Varghese, G.: Deterministic memory-efficient string matching algorithms for intrusion detection. In: Infocom (2004)
Vrable, M., Ma, J., Chen, J., Moore, D., Vandekieft, E., Snoeren, A.C., Voelker, G.M., Savage, S.: Scalability, fidelity, and containment in the potemkin virtual honeyfarm. In: SOSP (2005)
Wang, K., Cretu, G.F., Stolfo, S.J.: Anomalous payload-based worm detection and signature generation. In: Valdes, A., Zamboni, D. (eds.) RAID 2005. LNCS, vol. 3858, pp. 227–246. Springer, Heidelberg (2006)
Wurzinger, P., Bilge, L., Holz, T., Goebel, J., Kruegel, C., Kirda, E.: Automatically generating models for botnet detection. In: Backes, M., Ning, P. (eds.) ESORICS 2009. LNCS, vol. 5789, pp. 232–249. Springer, Heidelberg (2009)
Wyke, J.: The zeroaccess botnet (2012), http://www.sophos.com/en-us/why-sophos/our-people/technical-papers/zeroaccess-botnet.aspx
Xie, Y., Yu, F., Achan, K., Panigrahy, R., Hulten, G., Osipkov, I.: Spamming botnets: Signatures and characteristics. In: Sigcomm (2008)
Yegneswaran, V., Giffin, J.T., Barford, P., Jha, S.: An architecture for generating semantics-aware signatures. In: Usenixsecurity (2005)
Yin, H., Song, D., Manuel, E., Kruegel, C., Kirda, E.: Panorama: Capturing system-wide information flow for malware detection and analysis. In: CCS (2007)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rafique, M.Z., Caballero, J. (2013). FIRMA: Malware Clustering and Network Signature Generation with Mixed Network Behaviors. In: Stolfo, S.J., Stavrou, A., Wright, C.V. (eds) Research in Attacks, Intrusions, and Defenses. RAID 2013. Lecture Notes in Computer Science, vol 8145. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41284-4_8
Download citation
DOI: https://doi.org/10.1007/978-3-642-41284-4_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41283-7
Online ISBN: 978-3-642-41284-4
eBook Packages: Computer ScienceComputer Science (R0)