DOI: 10.1145/3366423.3380213

Modeling Heterogeneous Statistical Patterns in High-dimensional Data by Adversarial Distributions: An Unsupervised Generative Framework

Published: 20 April 2020

Abstract

Because collecting labels is prohibitively expensive and time-consuming, unsupervised methods are preferred in applications such as fraud detection. Such applications also require modeling the intrinsic clusters in high-dimensional data, which often exhibit heterogeneous statistical patterns: the patterns of different clusters may appear in different dimensions. Existing methods model the data clusters on selected dimensions, yet globally omitting any dimension may damage the pattern of certain clusters. To address these issues, we propose FIRD, a novel unsupervised generative framework that uses adversarial distributions to fit and disentangle the heterogeneous statistical patterns. When applied to discrete spaces, FIRD effectively distinguishes synchronized fraudsters from normal users. FIRD also outperforms state-of-the-art anomaly detection methods on anomaly detection datasets (over 5% average AUC improvement). Experiments on various datasets verify that the proposed method better models the heterogeneous statistical patterns in high-dimensional data and benefits downstream applications.
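The abstract stays at a high level and does not give FIRD's formulation. As a rough, purely illustrative sketch of the setting it describes — clusters whose informative patterns live in different subsets of dimensions, so that globally dropping any dimension would damage some cluster — the following toy EM fits a plain Bernoulli mixture (not FIRD itself, and without its adversarial component) to such data without labels. All names, sizes, and parameters here are hypothetical choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary data with heterogeneous patterns: cluster 0 is structured in
# dimensions 0-4, cluster 1 in dimensions 5-9; remaining dimensions are noise.
n, d, K = 400, 12, 2
z_true = rng.integers(0, K, n)
X = (rng.random((n, d)) < 0.5).astype(float)
X[z_true == 0, :5] = (rng.random(((z_true == 0).sum(), 5)) < 0.9)
X[z_true == 1, 5:10] = (rng.random(((z_true == 1).sum(), 5)) < 0.9)

pi = np.full(K, 1.0 / K)               # mixing weights
theta = rng.uniform(0.3, 0.7, (K, d))  # per-cluster Bernoulli parameters

for _ in range(50):
    # E-step: responsibilities r[i, k] proportional to pi_k * prod_d Bern(x_id; theta_kd)
    log_r = np.log(pi) + X @ np.log(theta).T + (1 - X) @ np.log1p(-theta).T
    log_r -= log_r.max(axis=1, keepdims=True)
    r = np.exp(log_r)
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate mixing weights and Bernoulli parameters
    nk = r.sum(axis=0) + 1e-9
    pi = nk / nk.sum()
    theta = np.clip((r.T @ X) / nk[:, None], 1e-3, 1 - 1e-3)

z_hat = r.argmax(axis=1)
# Accuracy up to label permutation (cluster IDs are arbitrary)
acc = max((z_hat == z_true).mean(), (z_hat != z_true).mean())
print(round(acc, 2))
```

A vanilla mixture like this fits every dimension for every cluster; FIRD's contribution, per the abstract, is to additionally disentangle which dimensions carry each cluster's pattern via adversarial distributions, which this sketch does not attempt.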


Cited By

  • (2024) Collaborative Fraud Detection on Large Scale Graph Using Secure Multi-Party Computation. In Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 1473–1482. https://doi.org/10.1145/3627673.3679863. Online publication date: 21-Oct-2024.

        Published In

        WWW '20: Proceedings of The Web Conference 2020
        April 2020
        3143 pages
        ISBN:9781450370233
        DOI:10.1145/3366423

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. adversarial distributions
        2. heterogeneous statistical patterns
        3. high-dimensional data
        4. prior knowledge
        5. unsupervised learning

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        WWW '20: The Web Conference 2020
        April 20 - 24, 2020
        Taipei, Taiwan

        Acceptance Rates

        Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
