Anomaly Detection Using Graph Neural Networks
Feb 2019
Abstract—Conventional methods for anomaly detection include techniques based on clustering, proximity or classification. With the rapid growth of social networks, outliers or anomalies find ingenious ways to obscure themselves in the network, making the conventional techniques inefficient. In this paper, we apply deep learning to the topological characteristics of a social network to detect anomalies in an email network and a Twitter network. We present a model, the Graph Neural Network, which is applied to social connection graphs to detect anomalies. Combinations of various social network statistical measures are taken into account to study the graph structure and the functioning of the anomalous nodes by employing deep neural networks. The hidden layer of the neural network plays an important role in finding the impact of statistical measure combinations in anomaly detection.

Keywords—Graph Neural Network, Anomaly Detection, Social Network, Enron dataset, Twitter dataset

I. INTRODUCTION

Anomaly or outlier detection is a procedure to find data points which behave spuriously. It is applied in fraud detection [6], network surveillance [7], public safety and security [8], intrusion detection [9], medical problems [10], false advertisement and many more areas. The term anomaly is used synonymously with outlier, noise, and deviation. Anomalies may occur as point anomalies or group anomalies [11]. Point anomalies are single data points whose behaviour deviates from the rest of the network, whereas group anomalies are collections of anomalous data points, mostly observed in fraudulent activities. Our work focuses on point anomalies.

Graph-based anomaly detection can be helpful in finding spammers [14], the spread of information [16], fake reviews [17] or malicious activities [18]. Thus, detecting anomalies is a vital task to ensure safety and security for the users of a network. Analysing large graphs to find anomalies can also yield important and interesting information about the graph structure.

Detecting spam profiles is considered one of the most challenging issues in online social networks. Recent work in this direction by Faris et al. in 2018 introduces a hybrid SVM-WOA model [14]. This model is applied and tested in different lingual contexts, on data collected from Twitter in four languages (Arabic, English, Spanish and Korean), to identify the most influential features. It can help in designing more accurate and insightful spam detection models for online social networks.

Social networks have become a hot topic and much work has been done on them, including visualization [24] [25], recommendation, and link prediction on spatial as well as temporal networks [26]. However, with the increasing use of these platforms, users come under the threat of several anomalies, among which horizontal anomalies are difficult to detect and hazardous for any network. These are anomalies caused by a user's anomalous and changing behaviour towards different sources. A self-healing neuro-fuzzy approach is used for the detection, recovery, and removal of horizontal anomalies efficiently and accurately [12]. This approach is evaluated on three datasets: the DARPA'98 benchmark dataset, a synthetic dataset, and real-time traffic. The evaluation over the DARPA'98 dataset demonstrates that the proposed approach is better than the existing solutions.

In a social network application, anomalous node detection is a challenging task. On one side, a number of utilities exist in social networking sites; on the other, free-handed content delivery has led to extensive misuse. Many cyber-attacks through social media content show that it has become a prime source of malicious activities. These attempts are made to earn illegal profits through rumour spreading, to enhance the prestige of an unknown product, etc. Therefore, in order to catch and reduce security risks in a social network, techniques are required to detect anomalous behaviour.

According to observation, malicious users follow some ambiguous patterns while sharing information. In this research work, a few hypotheses are considered to capture the behaviour of anomalous nodes. Initially, the adjacency matrix of the dataset is taken as input for the graph neural network, and then topological characteristics of the targeted datasets are computed. This work is inspired by graph-based anomaly detection. A deep learning technique is used to detect nodes exhibiting anomalous behaviour [1]. The outliers are labelled as distrustful nodes with the help of social network statistical measures: Betweenness Centrality, Degree, and Closeness. These parameters are taken into account to understand the structure of the graph. These topological characteristics are used to exploit the anomalous node behaviour. The following contributions are made in this research work:

RC1: Uses a graph neural network to detect anomalies in a social network.

RC2: The impact of statistical properties of a social network is tested, and the results are empirically validated on the Enron and Twitter datasets.

The paper is divided into five sections. The first section is this introduction, which discusses the problem domain. Section 2 covers related work in the domain of anomaly detection. Following this, the third section presents the considered hypotheses, the datasets and the graph neural network used to detect anomalies in a social network. The fourth section shows the experimental setup and
graph adjacency matrix to classify nodes in two categories: anomaly and general (not anomalous).
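
A forward pass of such a classifier can be sketched in plain Python. This is a minimal illustration, not the authors' Keras implementation: the toy graph, the layer sizes, the weight range and the choice to multiply by the adjacency matrix at each layer are all assumptions. Only the ReLU hidden layer, the sigmoid output and the uniform weight initialisation are taken from the model description in this paper.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(m):
    """Element-wise ReLU."""
    return [[max(0.0, x) for x in row] for row in m]

def sigmoid(m):
    """Element-wise logistic sigmoid."""
    return [[1.0 / (1.0 + math.exp(-x)) for x in row] for row in m]

def uniform_weights(rows, cols, rng, scale=0.05):
    """Uniform random initialisation in [-scale, scale] (scale is an assumption)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def forward(adjacency, w_hidden, w_out):
    """Two-layer pass: ReLU hidden layer, sigmoid output, one score per node."""
    # Each node's row of the adjacency matrix serves as its input features.
    hidden = relu(matmul(adjacency, w_hidden))
    # Assumed propagation step: aggregate neighbours' hidden vectors, then score.
    return sigmoid(matmul(matmul(adjacency, hidden), w_out))

# Toy undirected graph with 4 nodes (illustrative only): node 0 is a hub.
adjacency = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
rng = random.Random(0)
w_hidden = uniform_weights(4, 3, rng)  # 4 input columns -> 3 hidden units (assumed)
w_out = uniform_weights(3, 1, rng)     # 3 hidden units -> 1 output score
scores = forward(adjacency, w_hidden, w_out)
labels = ["anomaly" if s[0] > 0.5 else "general" for s in scores]
```

With untrained weights the scores sit close to 0.5, so the printed labels carry no meaning until the weights are fitted to labelled nodes.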
connections in an attempt to influence the network as much as possible. It will try to act as a central node in the network so that it can influence its neighbouring nodes. Also, having a large number of connections, it can reach the whole network through short paths, thus being close to every node in the network. These hypotheses can also be used to detect spammers in a network.

The three hypotheses take into account the graph structure of the anomalous nodes and yield features incorporating it. This will also be helpful in applying the neural network model. Once the three parameters are calculated and the data is augmented, we find a representation vector for each node in the network with the graph neural network. For the network input and computation, we consider the following:

• Given a graph G = (V, E) having N nodes and |E| edges.

• Let A denote the N×N adjacency matrix of the graph.

• Let W denote the weight matrix, initialized uniformly.

• Let H(l) denote the l-th hidden layer of the neural network.

Our goal is to accurately classify the nodes of the graph as anomalous or normal using the graph neural network. The input to the neural network is the adjacency matrix of the graph; thereafter, we use the layer-wise propagation rule.

Here, σ is the activation function: ReLU for the first layer and sigmoid for the output layer. We build our neural network using the Keras library. Weights are initialized using uniform random initialization.

IV. EXPERIMENTS

We divide the dataset in an 80-20 ratio and run the graph neural network for 100 epochs. For compiling the Keras model, the Adam optimizer and binary cross-entropy are used for optimization and loss computation. Table III shows the classification results for the Enron dataset for the best suitable thresholds of betweenness, closeness and degree centrality.

TABLE III: Results of the parameters on Enron Dataset

Parameter                 Threshold    Accuracy
Betweenness Centrality    1*10^-7      0.8615
Degree                    70           0.8632
Closeness                 0.25         0.8611

The best accuracy achieved for each individual parameter at a specific threshold is presented in Table III for the Enron dataset. The same thresholds were used for the Twitter dataset to mark nodes as anomalous or general (see Table IV). Using thresholds of 1*10^-7 for betweenness centrality, 70 for degree and 0.25 for closeness centrality, nodes with a value greater than the threshold were labelled as outliers. Thresholds were chosen by trial and error to yield around 40%-50% of the dataset as outliers.

Comparing the accuracy of the parameters on both datasets, we can conclude that degree is a better parameter for capturing the nature of anomalous nodes and hence hypothesis 1 holds true. To carry out fraudulent activities, anomalies will try to connect with and affect as many people as possible. This justifies the advantage of the degree parameter over the others. Our hypotheses give good accuracy and generalize well over both datasets. Thus, we can conclude that our hypotheses hold true for detecting anomalies in a social network.

We conduct another experiment by combining the parameters in pairs to further study the nature of anomalies. This ranks the parameters to find out which of them is a better measure for observing the behaviour of an anomaly. Table V and Table VI present the combined-parameter anomaly detection results for the Enron and Twitter datasets respectively. The results show that combining parameters helps enhance the anomaly detection accuracy. For the combinations of parameters, we observe that degree is the best parameter for capturing the behaviour of anomalous nodes.

TABLE V: Results on Enron Dataset

Parameter                            Threshold        Accuracy
Betweenness Centrality, Degree       1*10^-7, 70      0.9845
Betweenness Centrality, Closeness    1*10^-7, 0.25    0.9006
Closeness, Degree                    0.25, 70         0.9749

TABLE VI: Results on Twitter Dataset

Parameter                            Threshold        Accuracy
Betweenness Centrality, Degree       1*10^-7, 70      0.9823
Closeness, Degree                    0.25, 70         0.9756

This also validates our hypothesis 1 and the assumption that anomalous nodes tend to connect to as many nodes as possible, so as to be a central node and be as close as possible to every node. Our work outperforms [6] and [12] in that it takes the graph structure into account.
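
The threshold-based labelling used in these experiments can be sketched in plain Python. The sketch is illustrative only: the toy graph and its thresholds are assumptions (the thresholds reported above apply to the Enron and Twitter graphs), betweenness centrality is left out for brevity, and the rule for combining two measures (a node is an outlier if either measure flags it) is an assumption, since the text does not state how the pairs were combined.

```python
from collections import deque

def degree(adj):
    """Number of neighbours of each node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def closeness(adj):
    """Closeness centrality: reachable nodes divided by the sum of their distances."""
    scores = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores

def label(scores, threshold):
    """1 = outlier (score strictly above the threshold), 0 = general."""
    return {v: int(s > threshold) for v, s in scores.items()}

def combine(labels_a, labels_b):
    """Assumed combination rule: outlier if either measure flags the node."""
    return {v: labels_a[v] | labels_b[v] for v in labels_a}

# Toy star graph: node 0 is connected to every other node.
adj = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
by_degree = label(degree(adj), threshold=2)          # toy threshold, not 70
by_closeness = label(closeness(adj), threshold=0.8)  # toy threshold, not 0.25
combined = combine(by_degree, by_closeness)
```

On this toy graph only the hub (node 0) exceeds either threshold, so it is the single node marked as an outlier. Labels produced this way would then serve as training targets for the classifier.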
The graph structure is captured by considering the degree, closeness and betweenness of each node. By the definition of an anomaly, such a node will behave peculiarly by having either very many or very few connections, thus verifying our approach.

V. CONCLUSIONS

In this work, we presented a deep learning model, the Graph Neural Network, to detect anomalies and outliers in a social network. We also presented three hypotheses stating the behaviour of anomalous nodes and tried to prove them using our model. The efficiency of our model was validated on two datasets: Enron (an email communication network) and Twitter (a social networking site). The number of outliers in the datasets was augmented using the node properties degree, betweenness centrality and closeness centrality. These parameters were chosen since they take into account the structure of the graph. We show the results obtained by taking these parameters individually and in combination, which achieve good accuracy over the datasets and hence prove our hypotheses true.

VI. FUTURE WORK

Detecting anomalies can help reduce fraudulent activities and the spread of spam in a network. For this, efficient methods need to be developed which take into account the core behaviour of anomalies. The Graph Neural Network can capture the features and establish relationships between them well, owing to the hidden layer. Experimenting with more features and testing them on neural networks can broaden the study of the nature of anomalies.

REFERENCES

[1] Thomas N. Kipf, Max Welling, "Semi-Supervised Classification With Graph Convolutional Networks", ICLR 2017.
[2] Lili Zhang, Huibin Wang, Chenming Li, Qing Ye, Yehong Shao, "Unsupervised Anomaly Detection Algorithm of Graph Data Based on Graph Kernel", 2017 IEEE 4th International Conference on Cyber Security and Cloud Computing.
[3] Debajit Sensarma and Samar Sen Sarma, "A Survey on Different Graph Based Anomaly Detection Techniques", Indian Journal of Science and Technology, Vol 8(31), November 2015.
[4] Leman Akoglu, Hanghang Tong and Danai Koutra, "Graph-based Anomaly Detection and Description: A Survey", CoRR, abs/1404.4679, 2014.
[5] Rose Yu, Xinran He, and Yan Liu, "GLAD: Group Anomaly Detection in Social Media Analysis".
[6] Behdad, Mohammad, Luigi Barone, Mohammed Bennamoun, and Tim French. "Nature-inspired techniques in the context of fraud detection." IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 42, no. 6 (2012): 1273-1290.
[7] Alpaydn, G. An Adaptive Deep Neural Network for Detection, Recognition of Objects with Long Range Auto Surveillance.
[8] Yang, J., Zhou, C., Yang, S., Xu, H., & Hu, B. (2018). Anomaly detection based on zone partition for security protection of industrial cyber-physical systems. IEEE Transactions on Industrial Electronics, 65(5), 4257-4267.
[9] Karami, A. (2018). An anomaly-based intrusion detection system in presence of benign outliers with visualization capabilities. Expert Systems with Applications, 108, 36-60.
[10] Kodama, T., Kamata, K., Fujiwara, K., Kano, M., Yamakawa, T., Yuki, I., & Murayama, Y. (2018). Ischemic stroke detection by analyzing heart rate variability in rat middle cerebral artery occlusion model. IEEE Transactions on Neural Systems and Rehabilitation Engineering.
[11] Ahmed, Mohiuddin, Abdun Naser Mahmood, and Jiankun Hu. "A survey of network anomaly detection techniques." Journal of Network and Computer Applications 60 (2016): 19-31.
[12] Kumar, Ravinder, et al. "NHAD: Neuro-Fuzzy Based Horizontal Anomaly Detection In Online Social Networks." IEEE Transactions on Knowledge and Data Engineering (2018).
[13] Kim, Tae-Young, and Sung-Bae Cho. "Web traffic anomaly detection using C-LSTM neural networks." Expert Systems with Applications 106 (2018): 66-76.
[14] AlaM, A. Z., Faris, H., & Hassonah, M. A. (2018). Evolving Support Vector Machines using Whale Optimization Algorithm for spam profiles detection on online social networks in different lingual contexts. Knowledge-Based Systems, 153, 91-104.
[15] Liu, Siyuan, Qiang Qu, and Shuhui Wang. "Heterogeneous anomaly detection in social diffusion with discriminative feature discovery." Information Sciences 439 (2018): 1-18.
[16] Prado-Romero, M. A., Oliva, A. F., & Hernández, L. G. (2018, September). Identifying Twitter Users Influence and Open Mindedness Using Anomaly Detection. In International Workshop on Artificial Intelligence and Pattern Recognition (pp. 166-173). Springer, Cham.
[17] Ramalingam, D., & Chinnaiah, V. (2018). Fake profile detection techniques in large-scale online social networks: A comprehensive review. Computers & Electrical Engineering, 65, 165-177.
[18] Al-Qurishi, M., Hossain, M. S., Alrubaian, M., Rahman, S. M. M., & Alamri, A. (2018). Leveraging Analysis of User Behavior to Identify Malicious Activities in Large-Scale Social Networks. IEEE Transactions on Industrial Informatics, 14(2), 799-813.
[19] Scarselli, F., Tsoi, A. C., Gori, M., & Hagenbuchner, M. (2005). A new neural network model for graph processing. Department of Information Engineering, University of Siena, Tech. Rep, 502, 01-05.
[20] Li, Y., Tarlow, D., Brockschmidt, M., & Zemel, R. (2015). Gated graph sequence neural networks. arXiv preprint arXiv:1511.05493.
[21] Monti, F., Boscaini, D., Masci, J., Rodola, E., Svoboda, J., & Bronstein, M. M. (2017, July). Geometric deep learning on graphs and manifolds using mixture model CNNs. In Proc. CVPR (Vol. 1, No. 2, p. 3).
[22] Radford, B. J., Apolonio, L. M., Trias, A. J., & Simpson, J. A. (2018). Network Traffic Anomaly Detection Using Recurrent Neural Networks. arXiv preprint arXiv:1803.10769.
[23] Aggrawal, N., & Arora, A. (2016, October). Visualization, analysis and structural pattern infusion of DBLP co-authorship network using Gephi. In Next Generation Computing Technologies (NGCT), 2016 2nd International Conference on (pp. 494-500). IEEE.
[24] Aggrawal, N., & Arora, A. (2016). Vulnerabilities issues and melioration plans for online social network over Web 2.0. Commun. Dependability Qual. Manag. Int. J, 19(1), 66-73.
[25] Miller, Zachary, et al. "Twitter spammer detection using data stream clustering." Information Sciences 260 (2014): 64-73.
[26] Ranshous, Stephen, et al. "Anomaly detection in dynamic networks: a survey." Wiley Interdisciplinary Reviews: Computational Statistics 7.3 (2015): 223-247.