
The document proposes a normalization-based K-means clustering algorithm called N-K means. It summarizes previous work that applied techniques like normalization and improved initialization of centroids to enhance K-means clustering. The proposed N-K means algorithm pre-processes and normalizes the data before applying K-means clustering. It calculates initial centroids based on weighted averages of the dataset attributes. Experimental results show N-K means performs better than traditional K-means in terms of complexity and performance.


Normalization based K means Clustering Algorithm

Deepali Virmani (1), Shweta Taneja (2), Geetika Malhotra (3)

(1) Department of Computer Science, Bhagwan Parshuram Institute of Technology, New Delhi
Email: deepalivirmani@gmail.com
(2) Department of Computer Science, Bhagwan Parshuram Institute of Technology, New Delhi
Email: shweta_taneja08@yahoo.co.in
(3) Department of Computer Science, Bhagwan Parshuram Institute of Technology, New Delhi
Email: geets002@gmail.com

Abstract- K-means is an effective clustering technique used to separate similar data into groups based on initial centroids of clusters. In this paper, a Normalization based K-means clustering algorithm (N-K means) is proposed. The proposed N-K means clustering algorithm applies normalization to the available data prior to clustering and calculates initial centroids based on weights. Experimental results show that the proposed N-K means clustering algorithm improves on the existing K-means clustering algorithm in terms of complexity and overall performance.

Keywords- Clustering, Data mining, K means, Normalization, Weighted Average

I. INTRODUCTION

Data mining [7][11], or knowledge discovery, is the process of analysing large amounts of data and extracting useful information. It is an important technology which is used by industries as a novel approach to mine data. Data mining tools and techniques are used to generate effective results which were earlier difficult and time consuming to obtain. Data mining is widely used in various areas like financial data analysis, the retail and telecommunication industries, biological data analysis, fraud detection, spatial data analysis and other scientific applications.

Clustering is a technique of data mining in which similar objects are grouped into clusters. Clustering techniques are widely used in various domains like information retrieval, image processing, etc. [1][2]. There are two types of approaches in clustering: hierarchical and partitioning. In hierarchical clustering, the clusters are combined based on their proximity, or how close they are. This combination stops when further processing would lead to undesirable clusters. In the partition clustering approach, one dataset is separated into a definite number of small sets in a single iteration [10]. The accuracy and quality of clustering results depend on how the algorithms are implemented and on their ability to find hidden knowledge.

There are various clustering algorithms based on the nature of the generated clusters and the techniques used. A few of them are BIRCH (Balanced Iterative Reducing and Clustering using Hierarchies), CURE (Clustering Using Representatives), K-means, genetic K-means, Clara, Dbscan, Clarans, etc. [6]. The most widely used clustering algorithm is the K-means algorithm. This algorithm is used in many practical applications. It works by selecting the initial number of clusters and the initial centroids [7][13]. We have chosen the K-means algorithm over other clustering algorithms as it is very efficient in processing large data sets. It often terminates at a local optimum and generates tighter clusters than hierarchical clustering, especially if the clusters are globular. It is a popular algorithm because of its observable speed and simplicity. But K-means has a major disadvantage: it does not work well with clusters of different size and different density. Moreover, initial centroids are chosen randomly, due to which the clusters produced vary from one run to another. Also, there exist sets of data points on which K-means takes superpolynomial time [8][5].

Different researchers have put forward various methods to improve the efficiency and running time of the K-means algorithm. K-means uses the concept of Euclidean distance to calculate the centroids of the clusters. This method is less effective when newly added data sets have no effect on the measured distance between the various data objects. The computational complexity of the K-means algorithm is also very high [1][9]. Also, K-means is unable to handle noisy data and missing values. Data preprocessing techniques are often applied to datasets to make them cleaner, more consistent and noise free. Normalization is used to eliminate redundant data and ensures that good quality clusters are generated, which can improve the efficiency of clustering algorithms. It therefore becomes an essential step before clustering, as Euclidean distance is very sensitive to changes in the differences [3].

This paper is organized as follows. Section II presents a literature survey covering the work done by various authors to improve the K-means clustering algorithm. Section III describes our proposed N-K means algorithm stepwise. Section IV gives experimental results, where a comparison is shown between the traditional K-means clustering algorithm and the N-K means clustering algorithm. Lastly, the conclusions are addressed in Section V.

II. LITERATURE SURVEY

Many methods and techniques have been proposed over the past few years to improve the accuracy of the K-means algorithm, and there is still a need to optimize it to obtain good results. This section discusses the various approaches proposed by researchers to find better initial centroids in the K-means algorithm.

The authors of [3] have proposed data preprocessing techniques like cleaning and normalization to produce optimum quality clusters. In normalization, the data to be analyzed is scaled to a specific range. A modified K-means algorithm is proposed which provides a solution for automatic initialization of centroids, and performance is enhanced using normalization. This technique overcomes many drawbacks of the naive K-means algorithm.

Some authors have proposed their own methods to identify initial centroids. In [1], the authors proposed a novel method to find better initial centroids as well as more accurate clusters with less computational time. This method finds a weighted average score of the dataset by averaging the attribute values of each data point in order to generate initial centroids.

In another work, the authors of [8] proposed a new K-means clustering method with improved initial centres. In this method, initial cluster centres are selected and then used as input to K-means. The user is not required to give the number of clusters as input.

In another work [4], a data clustering approach is proposed which works by partitioning the space into different segments and calculating the frequency of data points in each segment. The segment which shows the maximum frequency of data points has the maximum chance of containing the centroid of a cluster. The authors introduced the concept of a threshold distance for each cluster's centroid for comparing the distance between a data point and the cluster's centroid; using this method, the effort needed to calculate the distance between data point and cluster centroid is minimized. This algorithm effectively decreases the complexity and makes calculations easier.

III. PROPOSED N-K MEANS ALGORITHM

The K-means algorithm can generate better results after modification of the databases. We apply the modified algorithm with calculation of initial centroids based on a weighted average score of the dataset, and we preprocess and normalize the dataset before we apply the N-K means algorithm. The proposed method works in three stages. During the first stage, a data preprocessing technique is adopted that transforms raw data into an understandable format. During the second stage, normalization is performed to standardize the data objects into a specific range. During the third stage, we apply the N-K means algorithm to generate clusters.

3.1 DATA PRE-PROCESSING

Preprocessing is a very important step and should be adopted in clustering, as it uses concepts like constant, average, minimum, maximum and standard deviation to calculate missing values in the tuples [7]. Missing values need to be avoided for accurate results. Preprocessing involves steps like data cleaning, data integration, data transformation, data reduction and data discretization.

3.2 NORMALIZATION

Data mining can generate effective results if normalization is applied to the dataset. Normalization is a process used to standardize all the attributes of the dataset and give them equal weight, so that redundant or noisy objects can be eliminated and valid, reliable data remains, which enhances the accuracy of the result. The K-means algorithm uses Euclidean distance, which is highly prone to irregularities in the size of the various features [3]. There are various data normalization methods like Min-Max, Z-Score and Decimal Scaling. The best normalization method depends on the data to be normalized. Here, we have used the Min-Max normalization technique in our algorithm because our dataset is limited and does not have much variability between minimum and maximum. The Min-Max normalization technique performs a linear transformation on the data. In this method, we fit the data within a predefined boundary or interval.

3.3 INITIAL CENTROIDS CALCULATION

We use a uniform method to find a score by taking the average of the attributes of each data point, which generates initial centroids that follow the data distribution of the given set. A sorting algorithm is applied to the scores of the data points, and the sorted data is then divided into k subsets, where k is the number of clusters. Finally, the data point nearest to the mean of each subset is taken as an initial centroid. In this method we have introduced a weight for each attribute, which makes the method advantageous because any feature of the dataset can be emphasized by increasing the weight related to that attribute.

The algorithm is given in Figure 1:

ALGORITHM 1: Steps of N-K means Algorithm

INPUT: A dataset with d dimensions

OUTPUT: Clusters

1. Load the initial data set.

2. Find the maximum and minimum values of each feature from the dataset.

3. Normalize the real scalar values of the dataset with the maximum and minimum values using equation (1):

   $$ v' = \frac{v - \min(E)}{\max(E) - \min(E)} \qquad (1) $$

   where min(E) and max(E) are the minimum and the maximum values for attribute E.

4. Pass the number of clusters and generate initial centroids using Algorithm 2.

5. Generate clusters.

Figure 1: Steps of the N-K means algorithm
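Steps 2 and 3 of Algorithm 1 can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the authors' implementation: the function name `min_max_normalize`, the sample values, and the handling of a constant attribute (where max equals min and equation (1) is undefined) are assumptions made here.

```python
def min_max_normalize(rows):
    """Rescale every attribute (column) of `rows` to [0, 1] using equation (1)."""
    columns = list(zip(*rows))
    mins = [min(col) for col in columns]   # step 2: minimum of each feature
    maxs = [max(col) for col in columns]   # step 2: maximum of each feature
    normalized = []
    for row in rows:
        new_row = []
        for v, lo, hi in zip(row, mins, maxs):
            # Assumption: a constant attribute (max == min) is mapped to 0.0
            # instead of dividing by zero; the paper does not cover this case.
            new_row.append((v - lo) / (hi - lo) if hi != lo else 0.0)
        normalized.append(new_row)
    return normalized


# Three illustrative rows with four numeric attributes (made-up values).
sample = [
    [5.1, 3.5, 1.4, 0.2],
    [7.0, 3.2, 4.7, 1.4],
    [6.3, 3.3, 6.0, 2.5],
]
normalized = min_max_normalize(sample)
```

After this step every attribute lies in the interval [0, 1], so no single attribute dominates the Euclidean distance merely because it is measured on a larger scale.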

ALGORITHM 2: Initialization of centroids

1. Calculate the average score of each data point.

   1) $d_i = (x_1, x_2, x_3, \ldots, x_m)$
   2) $d_i^{(avg)} = \dfrac{w_1 x_1 + w_2 x_2 + w_3 x_3 + \cdots + w_m x_m}{m}$
      where x = attribute's value, m = number of attributes, w = weight by which to multiply to ensure a fair distribution of clusters.

2. Sort the data based on the average score.

3. Divide the data into k subsets.

4. Calculate the mean value of each subset.

5. Take the data point nearest to the mean as the initial centroid for each data subset.

Figure 2: Algorithm to calculate the initial centroids
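The five steps of Algorithm 2 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' code: the function name `initial_centroids`, the equal default weights, the split into contiguous subsets of near-equal size, and the reading of step 4 as the attribute-wise mean of each subset (with "nearest" measured by squared Euclidean distance) are choices made here, since the paper does not pin these details down.

```python
def initial_centroids(rows, k, weights=None):
    """Pick k initial centroids following the five steps of Algorithm 2."""
    if not 1 <= k <= len(rows):
        raise ValueError("k must be between 1 and the number of data points")
    m = len(rows[0])
    if weights is None:
        weights = [1.0] * m  # assumption: equal weights when none are given

    # Step 1: weighted average score of each data point.
    def score(row):
        return sum(w * x for w, x in zip(weights, row)) / m

    # Step 2: sort the data by average score.
    ordered = sorted(rows, key=score)

    # Step 3: divide into k contiguous subsets of near-equal size
    # (assumption: the paper does not say how to split unevenly).
    n = len(ordered)
    bounds = [i * n // k for i in range(k + 1)]
    subsets = [ordered[bounds[i]:bounds[i + 1]] for i in range(k)]

    centroids = []
    for subset in subsets:
        # Step 4: mean of the subset, attribute by attribute.
        mean = [sum(col) / len(subset) for col in zip(*subset)]

        # Step 5: the data point of the subset nearest to that mean.
        def sq_dist(row):
            return sum((x - mu) ** 2 for x, mu in zip(row, mean))

        centroids.append(min(subset, key=sq_dist))
    return centroids


# Two visibly separated groups of made-up, already normalized-looking points.
points = [[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
          [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]]
centroids = initial_centroids(points, k=2)
```

Because every step is deterministic, the same dataset always yields the same starting centroids, unlike random initialization. Raising one entry of `weights` makes the corresponding attribute count more in the sort order of step 2.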

IV. EXPERIMENTAL RESULTS

Our experiment was conducted on the Iris data set [12] from the UCI Machine Learning Repository to evaluate the performance of the N-K means clustering algorithm. In this section, we present a comparative analysis of the traditional K-means clustering algorithm and the N-K means algorithm. Both algorithms are run for different values of k. From the comparisons we can see that the N-K means algorithm outperforms the traditional K-means algorithm in terms of two parameters, namely execution time and speed. Hence the algorithm runs faster computationally, as it executes in fewer iterations and the complexity is reduced. The results are depicted in Table 1.
Table 1: Performance Comparison of N-K means Algorithm With Existing K-means

Value of k   Algorithm    Time taken (ms)   Speed
1            K-means      0.078             5.1
1            N-K means    0.065             3.5
3            K-means      0.094             6.2
3            N-K means    0.081             4.7
5            K-means      0.125             6.6
5            N-K means    0.103             5.0
7            K-means      0.134             7.2
7            N-K means    0.117             5.7
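To put the "Time taken" column of Table 1 in relative terms, the short sketch below computes how much lower the N-K means value is than the K-means value for each k. The numbers are copied from Table 1; the helper name `relative_reduction` is an assumption introduced only for this illustration.

```python
# (k, K-means time, N-K means time) as reported in Table 1.
table_1_times = [
    (1, 0.078, 0.065),
    (3, 0.094, 0.081),
    (5, 0.125, 0.103),
    (7, 0.134, 0.117),
]


def relative_reduction(baseline, improved):
    """Fraction by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline


reductions = {k: relative_reduction(t_km, t_nkm) for k, t_km, t_nkm in table_1_times}
for k, r in reductions.items():
    print(f"k={k}: N-K means time is {r:.1%} lower than K-means")
```

For the reported values, the reduction in time taken is roughly 13% to 18% across the four values of k.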

Figure 3: Comparison between K-means and N-K means on the basis of time (x-axis: No. of clusters; y-axis: Time)

Figure 4: Comparison between K-means and N-K means on the basis of speed (x-axis: No. of clusters; y-axis: Speed)

V. CONCLUSION

The K-means clustering algorithm is widely used for clustering huge data sets. But the traditional K-means algorithm does not always generate good quality results, as automatic initialization of centroids affects the final clusters. This paper presents an efficient algorithm in which we first preprocess our dataset using a normalization technique and then generate effective clusters. This is done by assigning weights to each attribute value to achieve standardization. Our algorithm has proved to be better than the traditional K-means algorithm in terms of execution time and speed.

REFERENCES

[1] Md Sohrab Mahmud, Md. Mostafizer Rahman, Md. Nasim Akhtar, "Improvement of K-means Clustering algorithm with better initial centroids based on weighted average", 7th International Conference on Electrical and Computer Engineering, 2012, pp. 647-650.
[2] Madhu Yedla, Srinivasa Rao Pathakota, TM Srinivasa, "Enhancing K-means Clustering Algorithm with Improved Initial Center", International Journal of Computer Science and Information Technologies (IJCSIT), Vol. 1(2), 2010, pp. 121-125.
[3] Vaishali Rajeev Patel, Rupa G. Mehta, "Performance Analysis of MK-means Clustering Algorithm with Normalization Approach", World Congress on Information and Communication Technologies, 2011, pp. 974-979.
[4] Ran Vijay Singh, M.P.S Bhatia, "Data Clustering With Modified K means Algorithm", IEEE-International Conference on Recent Trends in Information Technology, ICRTIT 2011, pp. 717-721.
[5] David Arthur & Sergei Vassilvitskii, "How Slow is the k means Method?", Proceedings of the 22nd Symposium on Computational Geometry (SoCG), 2006, pp. 144-153.
[6] J. Han and M. Kamber, "Data Mining Concepts and Techniques", Morgan Kaufmann Publishers, San Diego, 2001.
[7] Vaishali R. Patel, Rupa G. Mehta, "Impact of Outlier Removal and Normalization Approach in Modified k-Means Clustering Algorithm", IJCSI International Journal of Computer Science Issues, Vol. 8, Issue 5, No 2, 2011, pp. 331-336.
[8] Zhang Chen, Xia Shixiong, "K-means Clustering Algorithm with improved Initial Center", Second International Workshop on Knowledge Discovery and Data Mining, 2009, pp. 790-792.
[9] K. A. Abdul Nazeer, M. P. Sebastian, "Improving the Accuracy and Efficiency of the k-means Clustering Algorithm", Proceedings of the World Congress On Engineering 2009 Vol I, WCE 2009, pp. 308-312.
[10] Margaret H. Dunham, "Data Mining- Introductory and Advanced Concepts", Pearson Education, 2006.
[11] R. Agrawal, T. Imielinski and A. Swami, "Mining association rules between sets of items in large database", The ACM SIGMOD Conference, Washington DC, USA, 1993, pp. 207-216.
[12] UCI Repository of Machine Learning Databases, Available: archive.ics.uci.edu/ml/
[13] MacQueen J, "Some methods for classification and analysis of multivariate observations," Proc. 5th Berkeley Symp. Math. Statist. Prob., Vol. 1, 1967, pp. 281-297.
