research-article

Multi-View Low-Rank Analysis with Applications to Outlier Detection

Authors:

Yun Fu

ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 12, Issue 3

Article No.: 32, Pages 1 - 22

https://doi.org/10.1145/3168363

Published: 23 March 2018

Abstract

Detecting outliers or anomalies is a fundamental problem in various machine learning and data mining applications. Conventional outlier detection algorithms are mainly designed for single-view data. Nowadays, data can be easily collected from multiple views, and many learning tasks such as clustering and classification have benefited from multi-view data. However, outlier detection from multi-view data is still a very challenging problem, as the data in multiple views usually have more complicated distributions and exhibit inconsistent behaviors. To address this problem, we propose a multi-view low-rank analysis (MLRA) framework for outlier detection in this article. MLRA pursues outliers from a new perspective: robust data representation. It contains two major components. First, cross-view low-rank coding is performed to reveal the intrinsic structures of data. In particular, we formulate a regularized rank-minimization problem, which is solved by an efficient optimization algorithm. Second, the outliers are identified through an outlier score estimation procedure. Unlike existing multi-view outlier detection methods, MLRA is able to detect two different types of outliers from multiple views simultaneously. To this end, we design a criterion to estimate the outlier scores by analyzing the obtained representation coefficients. Moreover, we extend MLRA to tackle the multi-view group outlier detection problem. Extensive evaluations on seven UCI datasets and on the MovieLens, USPS-MNIST, and WebKB datasets demonstrate that our approach outperforms several state-of-the-art outlier detection methods.
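The abstract does not spell out the rank-minimization solver, so the following is only an illustrative sketch and not the MLRA algorithm itself. It assumes the common convex relaxation in which rank is replaced by the nuclear norm, and shows singular value thresholding, the proximal step that many nuclear-norm solvers rely on. The function name `singular_value_threshold`, the threshold `tau`, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def singular_value_threshold(matrix, tau):
    """Proximal operator of tau * (nuclear norm).

    Soft-thresholds the singular values of `matrix` by `tau` and
    rebuilds the matrix from the shrunken spectrum.
    """
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # singular values below tau become 0
    return (u * s_shrunk) @ vt           # scale columns of u, then recombine


# Synthetic data (hypothetical): a rank-2 matrix plus small dense noise.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))
noisy = low_rank + 0.01 * rng.standard_normal((20, 15))

# The noise contributes only small singular values, so thresholding at
# tau = 0.5 removes them while keeping the two dominant directions.
denoised = singular_value_threshold(noisy, tau=0.5)
rank_after = np.linalg.matrix_rank(denoised, tol=1e-8)
print("rank before:", np.linalg.matrix_rank(noisy))
print("rank after: ", rank_after)
```

In an iterative solver, a step like this would typically alternate with updates of the other variables; the choice of `tau` trades off how aggressively small singular values are discarded against how much of the dominant structure is shrunk.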



Index Terms

Multi-View Low-Rank Analysis with Applications to Outlier Detection
1. Computing methodologies
  1. Machine learning
    1. Machine learning algorithms
      1. Regularization
2. Information systems
  1. Information systems applications
    1. Data mining
      1. Data cleaning


Published In

ACM Transactions on Knowledge Discovery from Data Volume 12, Issue 3

June 2018

360 pages

ISSN:1556-4681

EISSN:1556-472X

DOI:10.1145/3178546

Editors: Charu Aggarwal (IBM T. J. Watson Research, USA), Xindong Wu (University of Louisiana at Lafayette, USA)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 23 March 2018

Accepted: 01 November 2017

Revised: 01 April 2017

Received: 01 September 2016

Published in TKDD Volume 12, Issue 3

Qualifiers

Research-article
Research
Refereed

Funding Sources

NSF IIS award
U.S. Army Research Office Award
ONR Young Investigator Award

Bibliometrics

Total Citations: 43
Total Downloads: 679
Downloads (Last 12 months): 35
Downloads (Last 6 weeks): 4

Reflects downloads up to 12 Aug 2024
