Anomaly Detection Technique

Anomaly Detection
Softnix Technology
Chakrit Phain

Agenda
• Part 1
• Introduction
• Application for Anomaly Detection
• AIOps
• GraphDB
• Part 2
• Types of Anomaly Detection
• How to Identify Outliers in your Data
• Part 3
• Anomaly Detection Techniques for Timeseries

Introduction
https://en.wikipedia.org/wiki/Anomaly_detection
In data mining, anomaly detection (also outlier detection) is the
identification of rare items, events or observations which raise suspicions by
differing significantly from the majority of the data.

Introduction
https://www.anodot.com/blog/what-is-anomaly-detection/
https://developer.mindsphere.io/apis/analytics-anomalydetection/api-anomalydetection-overview.html

Applications of Anomaly Detection in the Business World
• Intrusion detection
• Attacks on computer systems and computer networks.
• Fraud detection
• The purchasing behavior of someone who steals a credit card is probably
different from that of the original owner.
• Ecosystem disturbance
• Hurricanes, floods, heat waves, etc.
• Medicine
• Unusual symptoms or test results may indicate potential health problems.
https://zindi.africa/blog/introduction-to-anomaly-detection-using-machine-learning-with-a-case-study

Introduction
https://www.datasciencecentral.com/profiles/blogs/driving-ai-revolution-with-pre-built-analytic-modules

AIOps
https://aiops.fatpipes.biz/2019/09/aiops-best-for-anomalies-noise-alert.html?utm_source=dlvr.it&utm_medium=twitter

AIOps
https://infrastructureedge.blogspot.com/2019/09/enterprise-it-managers-say-aiops-works.html?utm_source=dlvr.it&utm_medium

Fraud Detection with Graph Analytics
https://neo4j.com/whitepapers/fraud-detection-graph-databases/?ref=blog

Insurance Fraud
https://neo4j.com/whitepapers/fraud-detection-graph-databases/?ref=blog

Anomalies can be broadly categorized as
1. Point Anomalies
2. Contextual Anomalies
3. Collective Anomalies
If an individual data instance can be considered as
anomalous with respect to the rest of data, then the instance is
termed as a point anomaly.
Chandola, V., Banerjee, A. & Kumar, V., 2009. Anomaly Detection: A Survey. ACM Computing Surveys, July. 41(3).

2. Contextual Anomalies
If a data instance is anomalous in a specific context
(but not otherwise), then it is termed as a
contextual anomaly

3. Collective Anomalies
If a collection of related data instances is
anomalous with respect to the entire data set, it
is termed as a collective anomaly
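The difference between a point anomaly and a contextual anomaly can be sketched in a few lines. This is only an illustration: the season labels, the temperature readings, the helper name zscore_outliers, and the threshold of 2 standard deviations are all assumptions made for the example, not data from the talk.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscore_outliers(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical (season, temperature in Celsius) readings.
readings = [
    ("winter", 2), ("winter", 3), ("winter", 1), ("winter", 4),
    ("winter", 2), ("winter", 3), ("winter", 30),
    ("summer", 29), ("summer", 31), ("summer", 30), ("summer", 28),
    ("summer", 32), ("summer", 30), ("summer", 31),
]

# Point anomalies: score every value against the whole data set.
point_anomalies = zscore_outliers([temp for _, temp in readings])

# Contextual anomalies: score every value only against its own context (season).
by_context = defaultdict(list)
for season, temp in readings:
    by_context[season].append(temp)

contextual_anomalies = [
    (season, temp)
    for season, temps in by_context.items()
    for temp in zscore_outliers(temps)
]

print(point_anomalies)       # no value stands out globally
print(contextual_anomalies)  # 30 degrees stands out only within winter
```

A reading of 30 is ordinary across the whole year, so the global check finds nothing, while the per-season check flags it in winter.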

How to Identify Outliers in your Data
1. Extreme Value Analysis
2. Proximity Methods
3. Projection Methods
4. Methods Robust to Outliers
https://machinelearningmastery.com/how-to-identify-outliers-in-your-data/

1. Focus on univariate methods
2. Visualize the data using scatterplots, histograms and box
and whisker plots and look for extreme values
3. Assume a distribution (Gaussian) and look for values more
than 2 or 3 standard deviations from the mean, or more than 1.5 times
the interquartile range below the first quartile or above the third quartile
4. Filter out outlier candidates from the training dataset and
assess your model's performance
https://en.wikipedia.org/wiki/Multivariate_normal_distribution
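The two univariate rules in step 3 can be sketched as follows. The sample values, the function names, and the default thresholds (3 standard deviations, 1.5 times the interquartile range) are assumptions chosen for illustration; the quartiles use the default "exclusive" method of Python's statistics.quantiles, and other quartile conventions give slightly different fences.

```python
from statistics import mean, pstdev, quantiles


def zscore_outliers(values, threshold=3.0):
    """Values more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


def iqr_outliers(values, k=1.5):
    """Values more than k * IQR below the first or above the third quartile."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical measurements with one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 50]

print(zscore_outliers(data))
print(iqr_outliers(data))
```

On small samples the z-score rule is the weaker of the two, because a single extreme value inflates the standard deviation it is measured against; the quartile rule is less affected.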


Typology of Anomalies
https://tunguska.home.xs4all.nl/Publications/AD_Typology.htm

Anomaly based IDS using Backpropagation Neural Network
https://pdfs.semanticscholar.org/3ac9/37cb50bad238091fba6d1076a78136fe8691.pdf
• Vrushali D. Mane ME (Electronics) JNEC, Aurangabad
• S.N. Pawar Associate Professor JNEC, Aurangabad
• The neural network detects anomalies with 98% accuracy

Anomaly Detection for Timeseries

Recurrent neural network (RNN)

Recurrent neural network (GRU)
n=24

Recurrent neural network (GRU)

Historical with DB-Scan
https://en.wikipedia.org/wiki/DBSCAN
https://www.mathworks.com/matlabcentral/fileexchange/53842-dbscan
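As a rough illustration of how DBSCAN separates historical values into dense clusters and noise, here is a minimal pure-Python sketch of the textbook algorithm. It is not the implementation used in the talk; the sample readings and the eps and min_samples values are assumptions, and a library implementation would normally be preferred in practice.

```python
import math

NOISE = -1


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id, or NOISE (-1) if it joins no cluster."""
    labels = [None] * len(points)

    def neighbors(i):
        # All points within eps of point i, including point i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


# Hypothetical historical readings, one feature per point.
history = [(20.1,), (20.3,), (19.8,), (20.0,), (20.2,), (35.0,), (19.9,)]
labels = dbscan(history, eps=0.5, min_samples=3)
anomalies = [p[0] for p, label in zip(history, labels) if label == NOISE]

print(labels)
print(anomalies)
```

Points labeled as noise are the anomaly candidates. Both eps (neighborhood radius) and min_samples (points needed to form a dense region) have to be chosen for the data at hand.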

Create a new graph shifted by 1 step

Zoom in on the graph and shift it by the first 3 steps

Cosine Similarity
https://www.softnix.co.th/2019/05/29/similarity-%E0%B8%84%E0%B8%A7%E0%B8%B2%E0%B8%A1%E0%B9%80%E0%B8%AB%E0

Rotate the graph by the next 3 steps and compare using cosine similarity
Flag the point when the value is lower than the threshold
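One possible reading of this time shift comparison is sketched below: each window of the series is compared with the window a fixed number of steps earlier, and the window is flagged when their cosine similarity falls below a threshold. The window length, the shift of 3 steps, the threshold of 0.95, the sample series, and the function names are all assumptions for illustration; the slides do not specify them.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def shift_anomalies(series, window=3, shift=3, threshold=0.95):
    """Start indices of windows that differ from the window `shift` steps earlier."""
    flagged = []
    for start in range(shift, len(series) - window + 1):
        current = series[start:start + window]
        previous = series[start - shift:start - shift + window]
        if cosine_similarity(current, previous) < threshold:
            flagged.append(start)
    return flagged


# Hypothetical temperature series with a single spike at index 8.
series = [20, 21, 22, 23, 24, 25, 26, 27, 60, 29, 30, 31, 32, 33, 34]
flagged = shift_anomalies(series)

print(flagged)
```

In this sketch every comparison that touches the spike is flagged, whether the spike sits in the current window or in the earlier one, so a single spike marks a run of windows rather than one point.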

Comparison of anomaly detection (left) and correlation values (right)

Pros & Cons
• Time shift Detection
• Pro: Works well on sloped data such as temperature
• Con: Does not work on data with many spikes
• DB-Scan
• Pro: Similar to the time shift comparison
• Con: eps and min_samples must be defined
