An Analysis of Outlier Detection Through Clustering Method

International Journal of Advanced Engineering, Management and Science (IJAEMS) [Vol-6, Issue-12, Dec-2020]

https://dx.doi.org/10.22161/ijaems.612.13 ISSN: 2454-1311

An Analysis of Outlier Detection through Clustering Method
T. Chandrakala¹, S. Nirmala Sugirtha Rajini²

¹Assistant Professor, Department of Computer Applications, Jawahar Science College, Neyveli, Tamil Nadu, India
²Professor, Department of Computer Applications, Dr. M.G.R. Educational & Research Institute, Chennai, Tamil Nadu, India

Received: 08 Nov 2020; Received in revised form: 03 Dec 2020; Accepted: 12 Dec 2020; Available online: 30 Dec 2020
©2020 The Author(s). Published by Infogain Publication. This is an open access article under the CC BY license
(https://creativecommons.org/licenses/by/4.0/).

Abstract— This research paper deals with outliers, that is, observations whose behavior is unusual compared with the other members of the data set to which they belong. Outlier detection can be employed both for anomaly detection and for identifying abnormal observations. The deviation of an outlier can be assessed by measuring properties such as range, size, and activity. Detecting outliers makes it possible to reject the undesirable observations present in a field. For instance, in healthcare, a person's health condition can be determined from his or her latest health report or regular activity; if the person is found to be inactive, he or she may be sick. Two approaches to detecting outliers are considered in this paper: 1) a centroid-based approach built on the K-Means and hierarchical clustering algorithms, and 2) a clustering-based approach. These approaches may help in detecting outliers by placing similar elements in the same group, and the clustering method provides the means of forming these groups.
Keywords— detection of an outlier, data set, clustering approach, abnormality.
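
As a rough illustration of the centroid-based approach named in the abstract, the Python sketch below groups points with a plain K-Means loop and flags the points that lie unusually far from their nearest centroid. The abstract does not state a scoring rule, so the initialisation (the first k points), the threshold (a multiple of the median distance), and the names `kmeans` and `flag_outliers` are assumptions made for illustration only, not the authors' method.

```python
import math
import statistics


def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means on a list of equal-length numeric tuples.

    Assumption: the first k points serve as initial centroids, which keeps
    the sketch deterministic; practical code would use a better seeding.
    """
    centroids = list(points[:k])
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for c, members in enumerate(clusters):
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break  # assignments no longer change
        centroids = updated
    return centroids


def flag_outliers(points, k, factor=3.0):
    """Return the points whose distance to the nearest centroid exceeds
    factor * median distance (a hypothetical threshold rule)."""
    centroids = kmeans(points, k)
    distances = [min(math.dist(p, c) for c in centroids) for p in points]
    threshold = factor * statistics.median(distances)
    return [p for p, d in zip(points, distances) if d > threshold]


# Two tight groups plus one isolated point (made-up sample data).
sample = [
    (1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.9), (0.8, 1.1),
    (4.8, 5.1), (1.1, 1.2), (5.1, 5.2), (0.9, 0.9), (4.9, 4.8),
    (9.0, 1.0),
]
print(flag_outliers(sample, k=2))
```

On this sample only the isolated point (9.0, 1.0) is flagged. A known weakness of the rule is that an outlier chosen as an initial centroid can form its own cluster and escape detection, which is one reason to also treat very small clusters as suspicious.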

I. INTRODUCTION

Mining, in general, is termed the methodology of discovering interesting, previously unknown patterns in data. Outlier detection has important applications in the field of data mining, such as fraud detection, customer behavior analysis, and intrusion detection. A number of approaches are used in the process of detecting outliers (Bezerra et al., 2016). Clustering can be described as a grouping task in which similar objects are placed together. Clustering, a primitive anthropological method, is a vital method in exploratory data mining for statistical data analysis, machine learning, image analysis, and many other prominent branches of supervised and unsupervised learning.

Outlier detection is related to unwanted noise in the data. As far as analysts are concerned, noise in data is not of interest in itself but acts as a hindrance to data analysis. Noise removal is the process of removing unwanted objects before any data analysis is performed on the data (Bhattacharya et al., 2015).

A large number of domains apply outlier detection directly. This has resulted in the development of innumerable outlier detection techniques. Many of these techniques have been developed to solve focused problems pertaining to a particular application domain, while others have been developed in a more generic fashion (Pimentel et al., 2014).

II. DATA IN DATA MINING

Generally, we are drowning in information but starving for knowledge. Data can be collected from multiple sources. The purposes can be categorized as Business, Science, and Society. Business purposes can use the data for Web, E-Com, Transaction, and Stock Marketing. Scientific data can be used for Remote sensing, Bio-informatics, and Scientific