research-article

A Self-Representation Method with Local Similarity Preserving for Fast Multi-View Outlier Detection

Authors:

Zibin Zheng

ACM Transactions on Knowledge Discovery from Data, Volume 17, Issue 1

Article No.: 2, Pages 1 - 20

https://doi.org/10.1145/3532191

Published: 15 March 2023

Abstract

With the rapidly growing attention to multi-view data in recent years, multi-view outlier detection has become an active research field. Existing studies have achieved some success, but several issues remain to be solved. First, many multi-view outlier detection methods can only handle datasets that conform to a cluster structure and are ineffective on complex data distributions such as manifold structures. This overly restrictive data assumption limits the applicability of these methods. In addition, the majority of multi-view outlier detection algorithms cannot solve the online detection problem of multi-view outliers. To address these issues, we propose a new detection method based on the local similarity relation and data reconstruction, i.e., the Self-Representation Method with Local Similarity Preserving for fast multi-view outlier detection (SRLSP). By using the local similarity structure, the proposed method fully utilizes the characteristics of outliers and detects outliers with an applicable objective function. Besides, a well-designed optimization algorithm is proposed, which completes each iteration with linear time complexity and can process each instance in parallel. Also, the optimization algorithm can be easily extended to an online version, which is more suitable for practical production environments. Extensive experiments on both synthetic and real-world datasets demonstrate the superiority of the proposed method in both performance and time complexity.
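The abstract does not state the SRLSP objective function or its optimization algorithm, so the following is only a minimal illustrative sketch of the general idea it names: reconstruct each instance from its local neighbors and treat a large reconstruction error as evidence of an outlier. Everything in it is an assumption made for illustration and is not taken from the article: the function names (`knn_weights`, `reconstruction_error`, `outlier_scores`), the Gaussian similarity weights, the brute-force neighbor search (quadratic in the number of instances, unlike the linear-time iteration the abstract reports), and the toy data.

```python
import math

def knn_weights(view, k):
    """For each instance, map its k nearest neighbors (in this view) to
    Gaussian similarity weights normalized to sum to 1. Illustrative only."""
    n = len(view)
    weights = []
    for i in range(n):
        # Brute-force neighbor search: O(n) per instance, O(n^2) overall.
        dists = sorted((math.dist(view[i], view[j]), j)
                       for j in range(n) if j != i)[:k]
        sigma = dists[-1][0] or 1.0  # bandwidth = distance to k-th neighbor
        sims = [(j, math.exp(-(d / sigma) ** 2)) for d, j in dists]
        total = sum(s for _, s in sims)
        weights.append({j: s / total for j, s in sims})
    return weights

def reconstruction_error(view, neighbor_weights, i):
    """Distance between instance i and the weighted average of its neighbors."""
    dim = len(view[i])
    recon = [sum(w * view[j][d] for j, w in neighbor_weights.items())
             for d in range(dim)]
    return math.dist(view[i], recon)

def outlier_scores(views, k=3):
    """Sum reconstruction errors over every (reconstructed view, weight view)
    pair. Each instance is scored independently of the others."""
    n = len(views[0])
    all_weights = [knn_weights(v, k) for v in views]
    scores = []
    for i in range(n):
        score = 0.0
        for view in views:               # view whose features are rebuilt
            for weights in all_weights:  # view supplying local similarities
                score += reconstruction_error(view, weights[i], i)
        scores.append(score)
    return scores

# Hypothetical toy data: two clusters of four points plus one outlier (index 8),
# observed in two views with different coordinates.
view_a = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
          (10.0, -10.0)]
view_b = [(1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1),
          (-4.0, -4.0), (-4.1, -4.0), (-4.0, -4.1), (-4.1, -4.1),
          (20.0, 20.0)]
scores = outlier_scores([view_a, view_b], k=3)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

In this sketch, `ranking` lists instance indices from most to least anomalous. Using the neighbor weights from one view to reconstruct the instance in another view is one simple way to penalize instances whose local neighborhoods disagree across views. Because each instance's score depends only on its own neighbors, the outer loop could be distributed across workers, which loosely mirrors the per-instance parallelism mentioned in the abstract.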

Index Terms

A Self-Representation Method with Local Similarity Preserving for Fast Multi-View Outlier Detection
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Unsupervised learning
        Anomaly detection
        Cluster analysis
    2. Learning settings
      1. Online learning settings

Published In

ACM Transactions on Knowledge Discovery from Data Volume 17, Issue 1

January 2023

375 pages

ISSN:1556-4681

EISSN:1556-472X

DOI:10.1145/3572846

Editor:
Charu Aggarwal
IBM T. J. Watson Research, USA

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 March 2023

Online AM: 27 April 2022

Accepted: 14 April 2022

Revised: 14 March 2022

Received: 02 September 2021

Published in TKDD Volume 17, Issue 1

Funding Sources

Key-Area Research and Development Program of Guangdong Province
National Natural Science Foundation of China
Guangdong Basic and Applied Basic Research Foundation
