Program 7-EM Algorithm-K Means Algorithm

This document summarizes and compares the results of clustering the Iris dataset using the k-means and Gaussian mixture model (GMM) algorithms. It loads the Iris data, preprocesses it, applies each algorithm, and visualizes the cluster assignments. It observes that the GMM, fitted with expectation-maximization, produced clusterings that more closely matched the true labels than k-means.

Uploaded by

Vijay Sathvika B

8) Apply EM algorithm to cluster a set of data stored in a CSV file. Use the same data set for clustering using the k-Means algorithm. Compare the results of these two algorithms and comment on the quality of clustering. You can add Java/Python ML library classes/API in the program.
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.cluster import KMeans
import pandas as pd
import numpy as np

# import some data to play with


iris = datasets.load_iris()
X = pd.DataFrame(iris.data)
X.columns = ['Sepal_Length', 'Sepal_Width', 'Petal_Length', 'Petal_Width']
y = pd.DataFrame(iris.target)
y.columns = ['Targets']

#Build the KMeans Model


model = KMeans(n_clusters=3)
model.fit(X)  # train the model
# model.labels_ gives the cluster number each sample belongs to

# Visualise the clustering results
plt.figure(figsize=(14, 14))
colormap = np.array(['red', 'lime', 'black'])
#Plot the Original Classifications using Petal features
plt.subplot(2, 2, 1)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y.Targets], s=40)
plt.title('Real Clusters')
plt.xlabel('Petal Length')
plt.ylabel('Petal Width')

[Figure: 'Real Clusters' scatter plot, Petal Length vs Petal Width]

# Plot the Model's Classifications
plt.subplot(2, 2, 2)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[model.labels_], s=40)
plt.title('K-Means Clustering')
plt.xlabel('Petal Length')
plt.ylabel('Petal Width')

[Figure: 'K-Means Clustering' scatter plot, Petal Length vs Petal Width]
# General EM for GMM
from sklearn import preprocessing
# transform your data such that its distribution will have a
# mean value of 0 and a standard deviation of 1
scaler = preprocessing.StandardScaler()
scaler.fit(X)
xsa = scaler.transform(X)
xs = pd.DataFrame(xsa, columns=X.columns)

from sklearn.mixture import GaussianMixture
gmm = GaussianMixture(n_components=3)
gmm.fit(xs)  # estimate model parameters with the EM algorithm
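After fitting, the GaussianMixture object exposes the parameters that EM has estimated: one mean vector, one covariance matrix, and one mixing weight per component. A minimal standalone sketch (it re-loads and re-scales the Iris data so it runs on its own; variable names follow the program above):

```python
# Inspect the Gaussian parameters estimated by the EM algorithm.
from sklearn import datasets, preprocessing
from sklearn.mixture import GaussianMixture

iris = datasets.load_iris()
xs = preprocessing.StandardScaler().fit_transform(iris.data)

gmm = GaussianMixture(n_components=3, random_state=0).fit(xs)
print(gmm.means_.shape)        # one mean vector per component: (3, 4)
print(gmm.covariances_.shape)  # one covariance matrix per component: (3, 4, 4)
print(gmm.weights_)            # mixing proportions, which sum to 1
```

`random_state=0` is only there to make repeated runs reproducible; EM starts from a random initialization and can converge to different local optima otherwise.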


# predict() returns the component label for each data sample
gmm_y = gmm.predict(xs)
plt.subplot(2, 2, 3)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[gmm_y], s=40)
plt.title('GMM Clustering')
plt.xlabel('Petal Length')
plt.ylabel('Petal Width')
plt.show()
print('Observation: The GMM using EM algorithm based clustering matched the true labels more closely than the KMeans.')
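The observation above can be made quantitative rather than visual. One option (not part of the original program) is scikit-learn's adjusted Rand index, which scores agreement with the true labels and is invariant to the arbitrary numbering of clusters. A self-contained sketch:

```python
# Compare k-Means and GMM clusterings against the true Iris labels
# using the adjusted Rand index (1.0 = perfect agreement).
from sklearn import datasets, preprocessing
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
from sklearn.metrics import adjusted_rand_score

iris = datasets.load_iris()
X, y = iris.data, iris.target
xs = preprocessing.StandardScaler().fit_transform(X)

km_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
gmm_labels = GaussianMixture(n_components=3, random_state=0).fit(xs).predict(xs)

print('k-Means ARI:', adjusted_rand_score(y, km_labels))
print('GMM ARI:    ', adjusted_rand_score(y, gmm_labels))
```

Exact scores depend on the random initialization, so fixed seeds are used here; on this dataset the GMM score is typically the higher of the two, matching the program's observation.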

[Figure: 'GMM Clustering' scatter plot, Petal Length vs Petal Width]

Observation: The GMM using EM algorithm based clustering matched the true labels more closely than the KMeans.
