research-article

A Self-Representation Method with Local Similarity Preserving for Fast Multi-View Outlier Detection

Published: 15 March 2023

Abstract

Multi-view data have attracted rapidly growing attention in recent years, and multi-view outlier detection has become an active research field. Existing work has achieved some success, but several issues remain open. First, many multi-view outlier detection methods can handle only datasets that conform to a cluster structure and fail on complex data distributions such as manifold structures. This overly restrictive data assumption limits their applicability. Second, most multi-view outlier detection algorithms cannot handle the online detection of multi-view outliers. To address these issues, we propose a new detection method based on local similarity relations and data reconstruction: the Self-Representation Method with Local Similarity Preserving for fast multi-view outlier detection (SRLSP). By using the local similarity structure, the proposed method exploits the characteristics of outliers and detects them with a suitable objective function. We also propose an optimization algorithm that completes each iteration in linear time and can process each instance in parallel. The optimization algorithm extends easily to an online version, which is better suited to practical production environments. Extensive experiments on both synthetic and real-world datasets demonstrate the superiority of the proposed method in both detection performance and time complexity.
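To make the general idea of scoring outliers by local reconstruction concrete, the sketch below reconstructs each instance from its k nearest neighbours in every view and sums the squared residuals. It is an illustration only, not the SRLSP objective or its optimizer, neither of which is given on this page. It swaps in locally-linear-embedding-style weights (sum-to-one least squares), and the function name `local_reconstruction_scores` and the parameters `k` and `reg` are assumptions made for the example. The brute-force neighbour search here is quadratic in the number of instances, so the sketch does not reproduce the linear per-iteration cost claimed in the abstract.

```python
import numpy as np


def local_reconstruction_scores(views, k=5, reg=1e-3):
    """Illustrative outlier score (NOT the SRLSP objective).

    views: list of (n, d_v) arrays, one per view, rows aligned by instance.
    Returns an (n,) array; larger means the instance is harder to
    reconstruct from its local neighbourhood, summed over views.
    """
    n = views[0].shape[0]
    scores = np.zeros(n)
    for X in views:
        # Brute-force pairwise squared distances within this view.
        sq = np.sum(X ** 2, axis=1)
        d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
        np.fill_diagonal(d2, np.inf)  # an instance is not its own neighbour
        nbrs = np.argsort(d2, axis=1)[:, :k]
        # Each instance is handled independently, so this loop could be
        # distributed across workers.
        for i in range(n):
            Z = X[nbrs[i]] - X[i]  # neighbours centred on x_i, shape (k, d_v)
            G = Z @ Z.T            # local Gram matrix, shape (k, k)
            # Trace-scaled ridge term keeps G invertible and the weights small.
            G = G + reg * (np.trace(G) + 1e-12) * np.eye(k)
            w = np.linalg.solve(G, np.ones(k))
            w /= w.sum()           # weights sum to one, as in LLE
            residual = X[i] - w @ X[nbrs[i]]
            scores[i] += float(residual @ residual)
    return scores


def demo():
    # Two hypothetical views of 60 instances lying on smooth curves;
    # instance 0 is pushed off the curve in the first view only.
    t = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
    view1 = np.column_stack([np.cos(t), np.sin(t)])
    view2 = np.column_stack([t, np.sin(2.0 * t)])
    view1[0] = [5.0, 5.0]
    scores = local_reconstruction_scores([view1, view2], k=5)
    print("highest-scoring instance:", int(np.argmax(scores)))


if __name__ == "__main__":
    demo()
```

Because the score depends only on each instance's neighbourhood rather than on cluster membership, it remains meaningful on curved, manifold-like data such as the synthetic curves in `demo`.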

        Information & Contributors

        Information

        Published In

        ACM Transactions on Knowledge Discovery from Data  Volume 17, Issue 1
        January 2023
        375 pages
        ISSN:1556-4681
        EISSN:1556-472X
        DOI:10.1145/3572846

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 15 March 2023
        Online AM: 27 April 2022
        Accepted: 14 April 2022
        Revised: 14 March 2022
        Received: 02 September 2021
        Published in TKDD Volume 17, Issue 1

        Author Tags

        1. Outlier detection
        2. multi-view data
        3. subspace learning
        4. adaptive similarity learning

        Qualifiers

        • Research-article

        Funding Sources

        • Key-Area Research and Development Program of Guangdong Province
        • National Natural Science Foundation of China
        • Guangdong Basic and Applied Basic Research Foundation
