Graph-Based Fraud Detection in the Face of Camouflage

Published: 29 June 2017

Abstract

Given a bipartite graph of users and the products that they review, or followers and followees, how can we detect fake reviews or follows? Existing fraud detection methods (spectral, etc.) try to identify dense subgraphs of nodes that are sparsely connected to the remaining graph. Fraudsters can evade these methods using camouflage, by adding reviews or follows with honest targets so that they look “normal.” Even worse, some fraudsters use hijacked accounts from honest users, and then the camouflage is indeed organic.
Our focus is to spot fraudsters in the presence of camouflage or hijacked accounts. We propose FRAUDAR, an algorithm that (a) is camouflage resistant, (b) provides upper bounds on the effectiveness of fraudsters, and (c) is effective in real-world data. Experimental results under various attacks show that FRAUDAR outperforms the top competitor in accuracy of detecting both camouflaged and non-camouflaged fraud. Additionally, in real-world experiments with a Twitter follower-followee graph of 1.47 billion edges, FRAUDAR successfully detected a subgraph of more than 4,000 accounts, of which a majority had tweets showing that they used follower-buying services.
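
The dense-subgraph idea that the abstract attributes to existing methods can be illustrated with a generic greedy peeling heuristic: repeatedly remove the lowest-degree node and remember the densest intermediate subgraph. The sketch below is an illustration under stated assumptions, not the FRAUDAR algorithm itself. The function name `greedy_densest_subgraph`, the density measure (edges divided by nodes), and the toy review data are all hypothetical choices made for this example.

```python
from collections import defaultdict
import heapq


def greedy_densest_subgraph(edges):
    """Greedy peeling on a bipartite user-product graph (illustrative only).

    edges: iterable of (user, product) pairs.
    Returns (nodes, density), where density = edges / nodes of the
    densest induced subgraph seen while peeling.
    """
    # Tag each side so a user id and a product id can never collide.
    adj = defaultdict(set)
    for user, product in edges:
        adj[("u", user)].add(("p", product))
        adj[("p", product)].add(("u", user))

    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    alive = set(adj)
    n_edges = sum(degree.values()) // 2

    # Min-heap of (degree, node); outdated entries are skipped lazily.
    heap = [(d, node) for node, d in degree.items()]
    heapq.heapify(heap)

    best_density = n_edges / len(alive) if alive else 0.0
    removed = []
    best_cut = 0  # number of removals performed when the best density was seen

    while alive:
        d, node = heapq.heappop(heap)
        if node not in alive or d != degree[node]:
            continue  # stale heap entry
        alive.remove(node)
        removed.append(node)
        n_edges -= degree[node]  # edges to still-alive neighbours disappear
        for nbr in adj[node]:
            if nbr in alive:
                degree[nbr] -= 1
                heapq.heappush(heap, (degree[nbr], nbr))
        if alive:
            density = n_edges / len(alive)
            if density > best_density:
                best_density = density
                best_cut = len(removed)

    return set(adj) - set(removed[:best_cut]), best_density


# Toy data (hypothetical): three accounts reviewing the same three products,
# plus four ordinary users with one review each.
block = [(f, t) for f in ("f1", "f2", "f3") for t in ("t1", "t2", "t3")]
sparse = [("h1", "a"), ("h2", "b"), ("h3", "c"), ("h4", "t1")]
nodes, density = greedy_densest_subgraph(block + sparse)
print(sorted(nodes), density)
```

A plain edge-count density like this one is exactly what camouflage targets: extra edges to popular honest products raise a fraudster's degree without changing how the block itself looks, which is the weakness the abstract says FRAUDAR is designed to resist.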

Published In

ACM Transactions on Knowledge Discovery from Data  Volume 11, Issue 4
Special Issue on KDD 2016 and Regular Papers
November 2017
419 pages
ISSN: 1556-4681
EISSN: 1556-472X
DOI: 10.1145/3119906
Editor: Jie Tang

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 June 2017
Accepted: 01 February 2017
Revised: 01 January 2017
Received: 01 November 2016
Published in TKDD Volume 11, Issue 4

Author Tags

  1. Fraud detection
  2. link analysis
  3. spam detection

Qualifiers

  • Research-article
  • Research
  • Refereed
