FuseAD: Unsupervised Anomaly Detection in Streaming Sensors Data by Fusing Statistical and Deep Learning Models
Abstract
1. Introduction
- A novel fusion method for deep-learning-based and statistical-model-based anomaly detection techniques. In contrast to ensemble-based anomaly detection methods, which pick the single forecast with the lowest error among several candidates, the proposed residual scheme lets the network learn how to produce the best forecast from two different kinds of models. The fusion mechanism also lets the network exploit the complementary strengths of the two underlying models by fusing the information encapsulated in them. As a result, the fused network performs better in cases where a single model alone is unable to produce good results.
- Extensive evaluation of different distance-based, machine-learning-based, and deep-learning-based anomaly detection methods including iForest [9], one-class support vector machine (OCSVM) [16], local outlier factor (LOF) [8], principal component analysis (PCA) [17], TwitterAD [12], DeepAnT [13], Bayes ChangePT [18], Context OSE [19], EXPoSE [20], HTM Java [19], NUMENTA [14], Relative Entropy [21], Skyline [19], Twitter ADVec [12], and Windowed Gaussian [19] on two anomaly detection benchmarks. These benchmarks contain a total of 425 time-series.
- An ablation study to identify the contribution of the different components in FuseAD. In this study, we highlight the significance of the fused model by comparing its results with those of each individual model.
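The difference between picking one forecast and fusing two can be illustrated with a minimal sketch. It is not the FuseAD implementation: a persistence forecast stands in for the statistical model, a window mean stands in for a second forecaster, and a linear least-squares correction stands in for the learned network. All function names and the synthetic series are hypothetical, and the fit is evaluated in-sample only.

```python
import numpy as np

def mse(forecast, target):
    """Mean squared error between a forecast and the observed values."""
    return float(np.mean((forecast - target) ** 2))

def select_lowest_error(forecasts, target):
    """Ensemble-style selection: keep the single forecast with the lowest error."""
    errors = [mse(f, target) for f in forecasts]
    return forecasts[int(np.argmin(errors))]

def fit_residual_fusion(windows, stat_forecast, target):
    """Fit a linear correction on top of the statistical forecast.

    The linear model is an assumed stand-in for a learned network: it sees
    the raw window and the statistical forecast, and predicts the residual
    (target minus statistical forecast).
    """
    features = np.column_stack([windows, stat_forecast, np.ones(len(target))])
    weights, *_ = np.linalg.lstsq(features, target - stat_forecast, rcond=None)
    return weights

def residual_fusion(windows, stat_forecast, weights):
    """Fused forecast = statistical forecast + learned residual correction."""
    features = np.column_stack([windows, stat_forecast, np.ones(len(stat_forecast))])
    return stat_forecast + features @ weights

def make_demo(n=300, width=8, seed=0):
    """Build a synthetic series and two simple stand-in forecasts."""
    rng = np.random.default_rng(seed)
    t = np.arange(n)
    series = np.sin(2 * np.pi * t / 25) + 0.01 * t + 0.1 * rng.standard_normal(n)
    windows = np.stack([series[i:i + width] for i in range(n - width)])
    target = series[width:]
    persistence = windows[:, -1]        # stand-in for the statistical model
    window_mean = windows.mean(axis=1)  # stand-in for a second forecaster
    return windows, target, persistence, window_mean

windows, target, persistence, window_mean = make_demo()
picked = select_lowest_error([persistence, window_mean], target)
weights = fit_residual_fusion(windows, persistence, target)
fused = residual_fusion(windows, persistence, weights)
print("selected-forecast MSE:", mse(picked, target))
print("fused-forecast MSE:   ", mse(fused, target))
```

Because a zero correction is always available to the least-squares fit, the fused forecast cannot do worse than the statistical forecast on the data it was fitted to; whether that carries over to unseen data depends on the model and the series.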
2. Literature Review
- Type of anomaly: point anomaly, contextual anomaly, and collective anomaly.
- Availability of labels: supervised, unsupervised, and semi-supervised.
- Type of employed model: linear models, statistical models, probabilistic models, clustering-based, nearest-neighbors-based, density-based, deep-learning-based, etc.
- Applications: fraud detection, surveillance, industrial damage detection, medical anomaly detection, intrusion detection, etc.
3. Methodology
3.1. Statistical Model (ARIMA)
3.2. Deep-Learning-Based Model (CNN)
3.3. FuseAD: The Proposed Method
3.3.1. Forecasting Pipeline
3.3.2. Anomaly Detector
4. Experimental Setups
4.1. Yahoo Webscope Dataset
4.1.1. Dataset Description
4.1.2. Experimental Setting
4.2. NAB Dataset
4.2.1. Dataset Description
4.2.2. Experimental Setting
4.3. Evaluation Metric
5. Results
6. Ablation Study
7. Conclusions
Author Contributions
Funding
Acknowledgments
Conflicts of Interest
Abbreviations
AUC | Area under curve |
NAB | Numenta Anomaly Benchmark |
ROC | Receiver operating characteristic curve |
LOF | Local outlier factor |
COF | Connectivity-based outlier factor |
INFLO | Influenced outlierness |
TPR | True positive rate |
FPR | False positive rate |
AIC | Akaike information criterion |
ARIMA | Auto-regressive integrated moving average |
ANN | Artificial neural network |
CNN | Convolutional neural network |
ARMA | Auto-regressive moving average |
RMSE | Root mean square error |
MAE | Mean absolute error |
OCSVM | One-class support vector machine |
PCA | Principal component analysis |
LSTM | Long short-term memory |
References
- Vertatique. How Many Billion IoT Devices by 2020? 2018. Available online: http://www.vertatique.com/50-billion-connected-devices-2020 (accessed on 20 February 2019).
- Arshad, R.; Zahoor, S.; Shah, M.A.; Wahid, A.; Yu, H. Green IoT: An Investigation on Energy Saving Practices for 2020 and Beyond. IEEE Access 2017, 5, 15667–15681.
- Beghi, A.; Brignoli, R.; Cecchinato, L.; Menegazzo, G.; Rampazzo, M. A data-driven approach for fault diagnosis in HVAC chiller systems. In Proceedings of the 2015 IEEE Conference on Control Applications (CCA), Sydney, Australia, 21–23 September 2015; pp. 966–971.
- Laptev, N.; Amizadeh, S.; Flint, I. Generic and scalable framework for automated time-series anomaly detection. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, Australia, 10–13 August 2015; pp. 1939–1947.
- Capozzoli, A.; Piscitelli, M.S.; Gorrino, A.; Ballarini, I.; Corrado, V. Data analytics for occupancy pattern learning to reduce the energy consumption of HVAC systems in office buildings. Sustain. Cities Soc. 2017, 35, 191–208.
- Munir, M.; Baumbach, S.; Gu, Y.; Dengel, A.; Ahmed, S. Data Analytics: Industrial Perspective & Solutions for Streaming Data. In Data Mining in Time Series and Streaming Databases; Last, M., Kandel, A., Bunke, H., Eds.; World Scientific: Singapore, 2018; Chapter 7; pp. 144–168.
- Hawkins, D.M. Identification of Outliers; Springer: Berlin/Heidelberg, Germany, 1980; Volume 11.
- Breunig, M.M.; Kriegel, H.P.; Ng, R.T.; Sander, J. LOF: Identifying density-based local outliers. In Proceedings of the 2000 ACM SIGMOD international conference on Management of data, Dallas, TX, USA, 15–18 May 2000; Volume 29, pp. 93–104.
- Liu, F.T.; Ting, K.M.; Zhou, Z.H. Isolation forest. In Proceedings of the 2008 Eighth IEEE International Conference on Data Mining, Pisa, Italy, 15–19 December 2008; pp. 413–422.
- Sabokrou, M.; Fayyaz, M.; Fathy, M.; Moayed, Z.; Klette, R. Deep-anomaly: Fully convolutional neural network for fast anomaly detection in crowded scenes. Comput. Vis. Image Underst. 2018, 172, 88–97.
- Goldstein, M.; Dengel, A. Histogram-based outlier score (hbos): A fast unsupervised anomaly detection algorithm. In Proceedings of the Poster and Demo Track of the 35th German Conference on Artificial Intelligence (KI-2012), Saarbrücken, Germany, 24–27 September 2012; pp. 59–63.
- Kejariwal, A. Introducing Practical and Robust Anomaly Detection in a Time Series. 2015. Available online: https://blog.twitter.com/engineering/en_us/a/2015/introducing-practical-and-robust-anomaly-detection-in-a-time-series.html (accessed on 12 February 2019).
- Munir, M.; Siddiqui, S.A.; Dengel, A.; Ahmed, S. DeepAnT: A Deep Learning Approach for Unsupervised Anomaly Detection in Time Series. IEEE Access 2019, 7, 1991–2005.
- Lavin, A.; Ahmad, S. Evaluating Real-Time Anomaly Detection Algorithms—The Numenta Anomaly Benchmark. In Proceedings of the 2015 IEEE 14th International Conference on Machine Learning and Applications (ICMLA), Miami, FL, USA, 9–11 December 2015; pp. 38–44.
- Chalapathy, R.; Chawla, S. Deep Learning for Anomaly Detection: A Survey. arXiv 2019, arXiv:1901.03407.
- Ma, J.; Perkins, S. Time-series novelty detection using one-class support vector machines. Neural Networks, 2003. In Proceedings of the 2003 Joint Conference of the Fourth International Conference on Information, Singapore, 15–18 December 2003; Volume 3, pp. 1741–1745.
- Shyu, M.L.; Chen, S.C.; Sarinnapakorn, K.; Chang, L. A Novel Anomaly Detection Scheme based on Principal Component Classifier; Technical Report; Miami Univ Coral Gables FL Department of Electrical and Computer Engineering: Coral Gables, FL, USA, 2003.
- Adams, R.P.; MacKay, D.J. Bayesian online changepoint detection. arXiv 2007, arXiv:0710.3742.
- Ahmad, S.; Lavin, A.; Purdy, S.; Agha, Z. Unsupervised real-time anomaly detection for streaming data. Neurocomputing 2017, 262, 134–147.
- Schneider, M.; Ertel, W.; Ramos, F. Expected similarity estimation for large-scale batch and streaming anomaly detection. Mach. Learn. 2016, 105, 305–333.
- Wang, C.; Viswanathan, K.; Lakshminarayan, C.; Talwar, V.; Satterfield, W.; Schwan, K. Statistical techniques for online anomaly detection in data centers. In Proceedings of the 12th IFIP/IEEE International Symposium on Integrated Network Management (IM 2011) and Workshops, Dublin, Ireland, 23–27 May 2011; pp. 385–392.
- Aggarwal, C.C. An Introduction to Outlier Analysis. In Outlier Analysis; Springer: Berlin/Heidelberg, Germany, 2016; pp. 1–34.
- Chandola, V.; Banerjee, A.; Kumar, V. Anomaly detection: A survey. ACM Comput. Surv. 2009, 41, 15.
- Goldstein, M. Anomaly Detection in Large Datasets. Ph.D. Thesis, University of Kaiserslautern, München, Germany, 2014.
- Ramaswamy, S.; Rastogi, R.; Shim, K. Efficient algorithms for mining outliers from large data sets. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, Dallas, TX, USA, 15–18 May 2000; Volume 29, pp. 427–438.
- Tang, J.; Chen, Z.; Fu, A.W.C.; Cheung, D.W. Enhancing effectiveness of outlier detections for low density patterns. In Advances in Knowledge Discovery and Data Mining; Springer: Berlin/Heidelberg, Germany, 2002; pp. 535–548.
- Jin, W.; Tung, A.K.; Han, J.; Wang, W. Ranking outliers using symmetric neighborhood relationship. In Advances in Knowledge Discovery and Data Mining; Springer: Berlin/Heidelberg, Germany, 2006; pp. 577–593.
- Vasiliadis, G.; Polychronakis, M.; Ioannidis, S. MIDeA: A multi-parallel intrusion detection architecture. In Proceedings of the 18th ACM conference on Computer and communications security, Chicago, IL, USA, 17–21 October 2011; pp. 297–308.
- Buda, T.S.; Caglayan, B.; Assem, H. DeepAD: A Generic Framework Based on Deep Learning for Time Series Anomaly Detection. In Pacific-Asia Conference on Knowledge Discovery and Data Mining; Springer: Berlin/Heidelberg, Germany, 2018; pp. 577–588.
- Yu, Q.; Jibin, L.; Jiang, L. An improved ARIMA-based traffic anomaly detection algorithm for wireless sensor networks. Int. J. Distrib. Sens. Netw. 2016, 2016, 9653230.
- Yaacob, A.H.; Tan, I.K.; Chien, S.F.; Tan, H.K. ARIMA based network anomaly detection. In Proceedings of the 2010 Second International Conference on Communication Software and Networks, Hong Kong, China, 29 June–1 July 2010; pp. 205–209.
- Whittle, P. Hypothesis Testing in Time Series Analysis; Almqvist and Wiksell International: Stockholm, Sweden, 1951.
- Kang, M.J.; Kang, J.W. Intrusion detection system using deep neural network for in-vehicle network security. PLoS ONE 2016, 11, e0155781.
- Trippi, R.R.; Turban, E. Neural Networks in Finance And Investing: Using Artificial Intelligence to Improve Real World Performance; McGraw-Hill, Inc.: New York, NY, USA, 1992.
- Crabtree, B.F.; Ray, S.C.; Schmidt, P.M.; O’Connor, P.T.; Schmidt, D.D. The individual over time: Time series applications in health care research. J. Clin. Epidemiol. 1990, 43, 241–260.
- Kushwaha, A.K.; Dhillon, J.K. Deep Learning Trends for Video Based Activity Recognition: A Survey. Int. J. Sens. Wirel. Commun. Control 2018, 8, 165–171.
- Malhotra, P.; Vig, L.; Shroff, G.; Agarwal, P. Long short term memory networks for anomaly detection in time series. In Proceedings; Presses Universitaires de Louvain: Louvain-la-Neuve, Belgium, 2015; p. 89.
- Chauhan, S.; Vig, L. Anomaly detection in ECG time signals via deep long short-term memory networks. In Proceedings of the 2015 IEEE International Conference on Data Science and Advanced Analytics (IEEE DSAA’2015), Paris, France, 19–21 October 2015; pp. 1–7.
- Lipton, Z.C.; Kale, D.C.; Elkan, C.; Wetzell, R. Learning to Diagnose with LSTM Recurrent Neural Networks. arXiv 2016, arXiv:1511.03677.
- Zheng, Y.; Liu, Q.; Chen, E.; Ge, Y.; Zhao, J.L. Time series classification using multi-channels deep convolutional neural networks. In International Conference on Web-Age Information Management; Springer: Berlin/Heidelberg, Germany, 2014; pp. 298–310.
- Lopez-Martin, M.; Carro, B.; Sanchez-Esguevillas, A.; Lloret, J. Network traffic classifier with convolutional and recurrent neural networks for Internet of Things. IEEE Access 2017, 5, 18042–18050.
- Du, X.; El-Khamy, M.; Lee, J.; Davis, L. Fused DNN: A deep neural network fusion approach to fast and robust pedestrian detection. In Proceedings of the WACV 2017, Santa Rosa, CA, USA, 27–29 March 2017; pp. 953–961.
- Hyndman, R.; Khandakar, Y. Automatic Time Series Forecasting: The forecast Package for R. J. Stat. Softw. Art. 2008, 27, 1–22.
- Conejo, A.J.; Plazas, M.A.; Espinola, R.; Molina, A.B. Day-ahead electricity price forecasting using the wavelet transform and ARIMA models. IEEE Trans. Power Syst. 2005, 20, 1035–1042.
- Contreras, J.; Espinola, R.; Nogales, F.J.; Conejo, A.J. ARIMA models to predict next-day electricity prices. IEEE Trans. Power Syst. 2003, 18, 1014–1020.
- Gunathilaka, R.D.; Tularam, G.A. The tea industry and a review of its price modelling in major tea producing countries. J. Manag. Strategy 2016, 7, 21–36.
- Koch, G.; Zemel, R.; Salakhutdinov, R. Siamese neural networks for one-shot image recognition. In Proceedings of the 32nd International Conference on Machine Learning, Lille, France, 6–11 July 2015; Volume 37.
- Singh, N.; Olinsky, C. Demystifying Numenta anomaly benchmark. In Proceedings of the IJCNN 2017: International Joint Conference on Neural Networks, Anchorage, Alaska, 14–19 May 2017; pp. 1570–1577.
Benchmark | iForest [9] | OCSVM [16] | LOF [8] | PCA [17] | TwitterAD [12] | DeepAnT [13] | FuseAD |
---|---|---|---|---|---|---|---|
A1 | 0.8888 | 0.8159 | 0.9037 | 0.8363 | 0.8239 | 0.8976 | 0.9471 |
A2 | 0.6620 | 0.6172 | 0.9011 | 0.9234 | 0.5000 | 0.9614 | 0.9993 |
A3 | 0.6279 | 0.5972 | 0.6405 | 0.6278 | 0.6176 | 0.9283 | 0.9987 |
A4 | 0.6327 | 0.6036 | 0.6403 | 0.6100 | 0.6534 | 0.8597 | 0.9657 |
Dataset | Bayes ChangePT [18] | Context OSE [19] | EXPoSE [20] | HTM Java [19] | NUMENTA [14] | Relative Entropy [21] | Skyline [19] | Twitter ADVec [12] | Windowed Gaussian [19] | DeepAnT [13] | FuseAD |
---|---|---|---|---|---|---|---|---|---|---|---|
Artificial-nA | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Artificial-wA | 0.502 | 0.316 | 0.5144 | 0.653 | 0.531 | 0.505 | 0.558 | 0.503 | 0.406 | 0.555 | 0.544 |
Real-AdE | 0.509 | 0.307 | 0.581 | 0.568 | 0.576 | 0.505 | 0.534 | 0.504 | 0.538 | 0.562 | 0.588 |
Real-AWS | 0.499 | 0.311 | 0.594 | 0.587 | 0.542 | 0.506 | 0.602 | 0.503 | 0.614 | 0.583 | 0.572 |
Real-KC | 0.501 | 0.486 | 0.533 | 0.584 | 0.590 | 0.503 | 0.610 | 0.504 | 0.572 | 0.601 | 0.587 |
Real-Tr | 0.507 | 0.310 | 0.613 | 0.691 | 0.679 | 0.508 | 0.556 | 0.505 | 0.553 | 0.637 | 0.619 |
Real-Tw | 0.498 | 0.304 | 0.594 | 0.549 | 0.586 | 0.500 | 0.559 | 0.505 | 0.560 | 0.554 | 0.546 |
A1 | A2 | A3 | A4 | |||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
ARIMA | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ||||
CNN | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ||||
AUC | 0.920 | 0.936 | 0.947 | 0.999 | 0.999 | 0.999 | 0.992 | 0.986 | 0.998 | 0.949 | 0.928 | 0.965 |
Artificial-nA | Artificial-wA | Real-AdE | Real-AWS | Real-KC | Real-Tr | Real-Tw | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
ARIMA | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||
CNN | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||
AUC | 0 | 0 | 0 | 0.49 | 0.53 | 0.54 | 0.56 | 0.58 | 0.58 | 0.55 | 0.58 | 0.57 | 0.50 | 0.60 | 0.58 | 0.58 | 0.61 | 0.61 | 0.55 | 0.55 | 0.54 |
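The AUC rows above summarize a ROC curve, which plots the true positive rate (TPR) against the false positive rate (FPR) as the detection threshold varies. The following minimal sketch shows how such a value can be computed from per-point anomaly scores and ground-truth labels. It is an illustration under stated assumptions, not the evaluation code behind the tables: the function names are hypothetical, higher scores are assumed to mean "more anomalous", and a series without both normal and anomalous points is rejected rather than assigned a conventional value.

```python
import numpy as np

def roc_points(labels, scores):
    """Return (fpr, tpr) arrays obtained by sweeping a threshold over the scores."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    positives = int(labels.sum())
    negatives = int((~labels).sum())
    if positives == 0 or negatives == 0:
        raise ValueError("ROC needs at least one anomalous and one normal point")
    # Start above every score (nothing flagged), then lower the threshold
    # through each distinct score value until everything is flagged.
    thresholds = np.concatenate(([np.inf], np.unique(scores)[::-1]))
    tpr = np.array([(scores[labels] >= th).sum() / positives for th in thresholds])
    fpr = np.array([(scores[~labels] >= th).sum() / negatives for th in thresholds])
    return fpr, tpr

def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy example: two normal points (label 0) and two anomalies (label 1).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_points(labels, scores)
print(area_under_curve(fpr, tpr))  # 0.75
```

An AUC of 0.5 corresponds to scores that do not separate anomalies from normal points, and 1.0 to perfect separation.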
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Munir, M.; Siddiqui, S.A.; Chattha, M.A.; Dengel, A.; Ahmed, S. FuseAD: Unsupervised Anomaly Detection in Streaming Sensors Data by Fusing Statistical and Deep Learning Models. Sensors 2019, 19, 2451. https://doi.org/10.3390/s19112451