Research article, public access
DOI: 10.1145/3447548.3467125

Multi-Scale One-Class Recurrent Neural Networks for Discrete Event Sequence Anomaly Detection

Published: 14 August 2021

Abstract

Discrete event sequences are ubiquitous; one example is the ordered series of process interactions in Information and Communication Technology systems. Recent years have seen growing efforts to detect anomalies in discrete event sequences. The task remains extremely difficult, however, because of several intrinsic challenges: data imbalance, the discrete nature of the events, and the sequential nature of the data. To address these challenges, we propose OC4Seq, a multi-scale one-class recurrent neural network for detecting anomalies in discrete event sequences. Specifically, OC4Seq integrates the anomaly detection objective with recurrent neural networks (RNNs) to embed discrete event sequences into latent spaces where anomalies can be easily detected. In addition, because an anomalous sequence can be caused by individual events, by subsequences of events, or by the whole sequence, we design a multi-scale RNN framework that captures different levels of sequential patterns simultaneously. We implement OC4Seq and evaluate it on three real-world system log datasets. The results show that OC4Seq consistently outperforms various representative baselines by a large margin. Moreover, quantitative and qualitative analyses verify the importance of capturing multi-scale sequential patterns for event anomaly detection. To encourage reproducibility, we make the code and data publicly available.
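
The abstract does not give the formulation, so the following is only a rough sketch of the general idea (one-class scoring in a latent space at two scales), not the OC4Seq implementation. It assumes a toy, untrained recurrence in place of a trained RNN, centers computed as plain means of normal embeddings, and a score that adds a whole-sequence distance to the worst sliding-window distance. All names and constants (VOCAB, DECAY, WINDOW, ALPHA, encode, fit_centers, anomaly_score) are hypothetical.

```python
import math

# All constants below are hypothetical choices for illustration only.
VOCAB = 4      # number of distinct event types
DECAY = 0.5    # fixed recurrence weight (a trained RNN would learn its weights)
WINDOW = 3     # length of the local subsequences
ALPHA = 1.0    # weight of the local term in the final score


def encode(events):
    """Toy recurrent encoder: h_t = tanh(onehot(x_t) + DECAY * h_{t-1}).

    Stand-in for a trained GRU/LSTM; returns the final hidden state.
    """
    h = [0.0] * VOCAB
    for x in events:
        h = [math.tanh((1.0 if i == x else 0.0) + DECAY * h[i])
             for i in range(VOCAB)]
    return h


def windows(events):
    """All contiguous subsequences of length WINDOW (the local scale)."""
    return [events[i:i + WINDOW] for i in range(len(events) - WINDOW + 1)]


def mean(vectors):
    """Component-wise mean of a non-empty list of VOCAB-dimensional vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(VOCAB)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_centers(normal_sequences):
    """Reduce one-class 'training' to computing centers of normal embeddings."""
    c_global = mean([encode(s) for s in normal_sequences])
    c_local = mean([encode(w) for s in normal_sequences for w in windows(s)])
    return c_global, c_local


def anomaly_score(events, c_global, c_local):
    """Whole-sequence distance plus the worst local-window distance.

    Assumes len(events) >= WINDOW so that at least one window exists.
    """
    global_term = sq_dist(encode(events), c_global)
    local_term = max(sq_dist(encode(w), c_local) for w in windows(events))
    return global_term + ALPHA * local_term


# Usage: fit on sequences assumed normal, then score new sequences.
normal = [[0, 1, 2, 0, 1, 2], [1, 2, 0, 1, 2, 0]]
centers = fit_centers(normal)
print(anomaly_score([0, 1, 2, 0, 1, 2], *centers))
print(anomaly_score([0, 1, 3, 3, 3, 2], *centers))
```

Higher scores mean a sequence lies farther from the normal centers. A real system would learn the encoder jointly with the one-class objective instead of fixing it, and would choose the window length and the weighting of the two terms empirically.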


Published In

KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
August 2021
4259 pages
ISBN: 9781450383325
DOI: 10.1145/3447548
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. anomaly detection
  2. event sequence modeling
  3. multi-scale sequential pattern mining
  4. one-class recurrent neural network

Qualifiers

  • Research-article

Conference

KDD '21
Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Article Metrics

  • Downloads (last 12 months): 391
  • Downloads (last 6 weeks): 43
Reflects downloads up to 30 Aug 2024

Cited By

  • (2024) MISP: A Multimodal-based Intelligent Server Failure Prediction Model for Cloud Computing Systems. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. DOI: 10.1145/3637528.3671568 (5509-5520). Online publication date: 25-Aug-2024
  • (2024) Detecting Illicit Food Factories from Chemical Declaration Data via Graph-aware Self-supervised Contrastive Anomaly Ranking. Proceedings of the ACM Web Conference 2024. DOI: 10.1145/3589334.3648138 (4501-4511). Online publication date: 13-May-2024
  • (2024) Holistic Representation Learning for Multitask Trajectory Anomaly Detection. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV). DOI: 10.1109/WACV57701.2024.00659 (6715-6725). Online publication date: 3-Jan-2024
  • (2024) Anomaly Detection in IoT Networks Based on Intelligent Security Event Correlation. 2024 16th International Conference on COMmunication Systems & NETworkS (COMSNETS). DOI: 10.1109/COMSNETS59351.2024.10426939 (816-824). Online publication date: 3-Jan-2024
  • (2024) Landscape and Taxonomy of Online Parser-Supported Log Anomaly Detection Methods. IEEE Access, Vol. 12. DOI: 10.1109/ACCESS.2024.3387287 (78193-78218). Online publication date: 2024
  • (2024) Contracting skeletal kinematics for human-related video anomaly detection. Pattern Recognition. DOI: 10.1016/j.patcog.2024.110817 (110817). Online publication date: Jul-2024
  • (2024) AFMF. Knowledge-Based Systems, Vol. 296:C. DOI: 10.1016/j.knosys.2024.111912. Online publication date: 19-Jul-2024
  • (2024) TWLog: Task Workflow-Based Log Anomaly Detection. Web and Big Data. DOI: 10.1007/978-981-97-7244-5_1 (3-16). Online publication date: 28-Aug-2024
  • (2024) Backdoor Attack Against One-Class Sequential Anomaly Detection Models. Advances in Knowledge Discovery and Data Mining. DOI: 10.1007/978-981-97-2259-4_20 (262-274). Online publication date: 25-Apr-2024
  • (2024) Achieving Counterfactual Explanation for Sequence Anomaly Detection. Machine Learning and Knowledge Discovery in Databases. Research Track and Demo Track. DOI: 10.1007/978-3-031-70371-3_2 (19-35). Online publication date: 22-Aug-2024
