xFraud: explainable fraud transaction detection

Authors:

Ce Zhang

Proceedings of the VLDB Endowment, Volume 15, Issue 3

Pages 427 - 436

https://doi.org/10.14778/3494124.3494128

Published: 01 November 2021

Abstract

On online retail platforms, it is crucial to actively detect risky transactions to improve customer experience and minimize financial loss. In this work, we propose xFraud, an explainable fraud transaction prediction framework composed mainly of a detector and an explainer. The xFraud detector predicts the legitimacy of incoming transactions effectively and efficiently. Specifically, it uses a heterogeneous graph neural network to learn expressive representations from the informative, heterogeneously typed entities in the transaction logs. The xFraud explainer generates meaningful, human-understandable explanations from graphs to facilitate further processes in the business unit. In our experiments on real transaction networks with up to 1.1 billion nodes and 3.7 billion edges, xFraud outperforms various baseline models on many evaluation metrics while remaining scalable in distributed settings. In addition, we show through both quantitative and qualitative evaluations that the xFraud explainer generates reasonable explanations that significantly assist business analysis.
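To make the idea of a graph over "heterogeneously typed entities in the transaction logs" concrete, the following is a minimal sketch and not the xFraud detector itself. It assumes a hypothetical log format of `(txn_id, entity_type, entity_id)` rows and hypothetical entity types (`buyer`, `email`, `address`). In place of a trained heterogeneous graph neural network, it runs a single round of type-weighted mean aggregation of known labels along transaction → entity → transaction paths. The names `LOG`, `build_hetero_graph`, `propagate`, and `type_weight` are illustrative only.

```python
from collections import defaultdict

# Hypothetical transaction log rows: (txn_id, entity_type, entity_id).
LOG = [
    ("t1", "buyer", "b1"), ("t1", "email", "e1"), ("t1", "address", "a1"),
    ("t2", "buyer", "b2"), ("t2", "email", "e1"), ("t2", "address", "a2"),
    ("t3", "buyer", "b3"), ("t3", "email", "e3"), ("t3", "address", "a2"),
]


def build_hetero_graph(log):
    """Build two adjacency maps linking transactions and typed entities."""
    txn_to_entities = defaultdict(set)
    entity_to_txns = defaultdict(set)
    for txn, etype, eid in log:
        txn_to_entities[txn].add((etype, eid))
        entity_to_txns[(etype, eid)].add(txn)
    return txn_to_entities, entity_to_txns


def propagate(txn, known_labels, txn_to_entities, entity_to_txns, type_weight):
    """One round of type-weighted mean aggregation over shared entities.

    Returns None when no labeled transaction shares an entity with `txn`.
    The per-type weights are fixed by hand here (an assumption); a
    heterogeneous GNN would learn type-specific transformations instead.
    """
    num, den = 0.0, 0.0
    for etype, eid in txn_to_entities[txn]:
        for other in entity_to_txns[(etype, eid)]:
            if other != txn and other in known_labels:
                w = type_weight.get(etype, 1.0)
                num += w * known_labels[other]
                den += w
    return num / den if den else None


txn_to_entities, entity_to_txns = build_hetero_graph(LOG)
labels = {"t1": 1.0, "t3": 0.0}              # 1.0 = fraud, 0.0 = legitimate
weights = {"email": 2.0, "address": 1.0}     # hand-picked, illustrative
score = propagate("t2", labels, txn_to_entities, entity_to_txns, weights)
print(score)
```

In this toy setup, `t2` shares an email with the fraudulent `t1` and an address with the legitimate `t3`, so its score falls between the two labels and is pulled toward `t1` by the larger email weight. The shared entities that contributed to the score also hint at what a graph-based explanation could point to.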

Index Terms

xFraud: explainable fraud transaction detection
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
    2. Machine learning approaches
2. Information systems
  1. Information systems applications
    1. Data mining

Index terms have been assigned to the content through auto-classification.


Published In

Proceedings of the VLDB Endowment, Volume 15, Issue 3

November 2021

364 pages

ISSN: 2150-8097

Editors: Juliana Freire (New York University), Xuemin Lin (University of New South Wales)

Publisher

VLDB Endowment

Badges

Artifacts Available / v1.1

Qualifiers

Research-article
