short-paper

TRACK: Optimizing Artificial Neural Networks for Anomaly Detection in Spark Streaming Systems

Published: 29 May 2020

Abstract

With the growth of Big Data processing technologies and cloud computing services, multiple tenants commonly share the same computing resources, which may cause performance anomalies. There is an urgent need for an effective performance anomaly detection method that can be used in production environments to avoid late detection of unexpected system failures. To address this challenge, we introduce TRACK, a new black-box, neural-network-driven methodology that optimizes the training workload configuration to identify anomalous performance in an in-memory Big Data Spark streaming platform. The proposed methodology uses Bayesian optimization to find the optimal training dataset size and configuration parameters so that the model is trained efficiently. TRACK is validated on a real Apache Spark streaming system, and the results show that TRACK achieves the highest performance (an F-score of 95%) and reduces training time by 80% when training the proposed anomaly detection model on the in-memory streaming platform.
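The abstract reports detection quality as an F-score. As a minimal sketch of how such a score is computed from binary labels, the helper below assumes the balanced F1 variant (beta = 1) and that label 1 marks an anomalous sample; neither choice is stated in the abstract, and the function name `f_score` and the sample labels are hypothetical illustrations, not the authors' code or data.

```python
def f_score(y_true, y_pred, beta=1.0):
    """F-beta score for binary labels (assumption: 1 = anomalous, 0 = normal)."""
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        # No correctly detected anomaly: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical labels for eight monitored intervals (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
score = f_score(y_true, y_pred)
print(f"F-score: {score:.2f}")
```

Here three anomalies are detected, one is missed, and one normal interval is flagged, so precision and recall are both 0.75. Because the F-score is the harmonic mean of the two, it penalizes a detector that gains recall only by raising many false alarms.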

Cited By

  • (2021) AI‐Driven Performance Management in Data‐Intensive Applications. Communication Networks and Service Management in the Era of Artificial Intelligence and Machine Learning. 10.1002/9781119675525.ch9 (199-222). Online publication date: 3-Sep-2021
  • (2020) TRACK-Plus: Optimizing Artificial Neural Networks for Hybrid Anomaly Detection in Data Streaming Systems. IEEE Access. 10.1109/ACCESS.2020.3015346 (1-1). Online publication date: 2020

        Published In

        VALUETOOLS '20: Proceedings of the 13th EAI International Conference on Performance Evaluation Methodologies and Tools
        May 2020
        217 pages
        ISBN:9781450376464
        DOI:10.1145/3388831

        In-Cooperation

        • EAI: The European Alliance for Innovation

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. Apache Spark
        2. Artificial Intelligence
        3. Big Data
        4. Machine Learning
        5. Neural Network
        6. Performance Anomalies

        Qualifiers

        • Short-paper
        • Research
        • Refereed limited

        Conference

        VALUETOOLS '20

        Acceptance Rates

        Overall Acceptance Rate 90 of 196 submissions, 46%
