Assignment 5

The Ames Housing Project analyzes housing sales by categorizing houses into four distinct clusters based on key variables such as Sale Price, Gr_Liv_Area, Lot Area, and Overall Quality. Utilizing descriptive and cluster analysis, the project aims to provide insights for real estate stakeholders to understand market trends and buyer preferences. The findings highlight significant correlations and enable targeted strategies for future housing sales.

Uploaded by

anshulpuri50

BUSINESS INTELLIGENCE

ASSIGNMENT – 5
Submitted By – Pankhuri Mishra

23/UMBA/72

MBA Section – B

AMES HOUSING PROJECT

Executive Summary
This project focuses on uncovering patterns in housing sales by categorizing houses into
distinct clusters based on their characteristics. Through detailed descriptive and cluster
analyses, key variables such as Sale Price, Gr_Liv_Area (above ground living area), Lot Area,
and Overall Quality have been identified as critical factors influencing these groupings.
Leveraging these variables, the data has been systematically divided into four distinct clusters,
providing a clearer understanding of the relationships between housing attributes and their
impact on market segmentation.

Introduction
The real estate market is inherently complex, influenced by a multitude of factors that
determine property values and buyer preferences. To make informed decisions, it is essential
to analyze and group houses based on shared characteristics. This project addresses this need
by employing statistical techniques to explore the relationships between various housing
attributes and categorizing the dataset into clusters.

Using descriptive analysis, critical variables such as Sale Price, Gr_Liv_Area, Lot Area, and
Overall Quality were identified as significant determinants in shaping housing clusters.
Subsequently, a cluster analysis was conducted to group the houses into four categories, each
representing a unique combination of features. This segmentation provides actionable
insights, aiding stakeholders such as developers, real estate professionals, and policymakers
in understanding market trends, targeting specific buyer segments, and making data-driven
decisions.

By simplifying complex relationships into manageable categories, this study contributes to a
more structured and insightful understanding of housing market dynamics.

Methodology
To understand the data, the following steps were carried out:

A) FOR DESCRIPTIVE ANALYSIS


• Microsoft Excel was used to load the data and compute descriptive statistics with the
Data Analysis add-in.
• Descriptive statistics such as mean, median, mode, standard deviation, variance, standard
error, and range were computed for the numeric variables.
• Correlation analysis was then carried out to measure the strength of the relationships
between variables, and the most important variables were selected for use in the
cluster analysis.
• Scatter plots were also created to better analyze the relationships between
variables.
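
The descriptive measures and the correlation step listed above can be sketched in Python using only the standard library. This is a minimal illustration, not the Excel workflow used in the project: the two short lists are hypothetical toy values (not rows from the Ames dataset), the helper names describe and pearson are made up for this sketch, and the sample (n - 1) formulas for standard deviation and variance are an assumption.

```python
import math
import statistics

# Hypothetical toy values for illustration only; NOT taken from the Ames dataset.
gr_liv_area = [900.0, 1000.0, 1100.0, 1200.0, 1300.0]
sale_price = [95000.0, 120000.0, 125000.0, 150000.0, 160000.0]


def describe(values):
    """Summary measures for one numeric variable (sample formulas assumed)."""
    n = len(values)
    std_dev = statistics.stdev(values)  # sample standard deviation (n - 1)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std_dev": std_dev,
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "std_error": std_dev / math.sqrt(n),
        "range": max(values) - min(values),
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


summary = describe(sale_price)
r = pearson(gr_liv_area, sale_price)
print(summary)
print(f"Pearson r between living area and sale price: {r:.3f}")
```

A coefficient close to +1 or -1 indicates a strong linear relationship and a value near 0 a weak one, which is the sense in which the findings below describe correlations as strong or weak.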

B) FOR CLUSTER ANALYSIS


• Weka was used to perform cluster analysis on the dataset.
• After loading the dataset, the K-means technique was used to identify clusters in
the data.
• Parameters such as the number of clusters and the distance function (Euclidean and
Manhattan) were varied to experiment with different clusterings.
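
The clustering step can be illustrated with a small K-means sketch in plain Python. It is not Weka's implementation, which may differ in initialization, attribute normalization, and how centroids are updated. The points are hypothetical values already scaled to the 0-1 range, the function name k_means is made up for this sketch, and taking the first k points as starting centroids is a simplifying assumption chosen to keep the run deterministic.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_means(points, k, distance=euclidean, max_iter=100):
    """Basic K-means: assign each point to its nearest centroid, then update."""
    # Assumption: the first k points serve as the initial centroids.
    centroids = list(points[:k])
    assignments = None
    for _ in range(max_iter):
        new_assignments = [
            min(range(k), key=lambda c: distance(p, centroids[c])) for p in points
        ]
        if new_assignments == assignments:
            break  # assignments stopped changing, so the algorithm has converged
        assignments = new_assignments
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the previous centroid if a cluster becomes empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments


# Hypothetical, pre-scaled points forming two visibly separate groups.
points = [
    (0.10, 0.20), (0.15, 0.25), (0.05, 0.10),
    (0.80, 0.90), (0.85, 0.80), (0.90, 0.95),
]

centroids_e, labels_e = k_means(points, k=2, distance=euclidean)
centroids_m, labels_m = k_means(points, k=2, distance=manhattan)
print(labels_e, labels_m)
```

Changing k or swapping the distance function is the kind of experiment described above. On real housing data the variables sit on very different scales (Sale Price is far larger than Overall Quality), so some form of normalization is usually applied before distances are computed.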

Body and Analysis


Figure 1 – Descriptive Analysis done in Excel using the Data Analysis add-in

Figure 2 – Correlation Analysis Summary

Findings from Descriptive Analysis

• The distribution of Sale Price is less skewed than that of Lot Area.
• Ratings for Overall Quality range from 1 to 9, and ratings for Overall Condition range
from 3 to 9.
• A strong positive correlation exists between Sale Price and Gr_Liv_Area, and between
Sale Price and Overall Quality.
• A weak positive correlation exists between Lot Area and Overall Quality, and between
Lot Area and Overall Condition.

Figures 3-4 – Scatter plot representations: SalePrice vs Gr_Liv_Area and SalePrice vs Lot_Area (Sale Price on the vertical axis)


Figure 5 – Cluster Analysis done using Euclidean Distance

Figure 6 – Cluster Analysis done using Manhattan Distance

Findings from Cluster Analysis

Parameters        Cluster 0   Cluster 1   Cluster 2   Cluster 3
Lot Area          7140        8398        8420        8524
Overall Quality   4.11        5.28        5.08        6.29
Gr_Liv_Area       956         1208        1070        1240
Sale Price        96034       145929      124433      167730
House Style       1 story     2 story     1 story     1 story
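
One way to compare the four clusters is to derive a simple ratio from the centroid values in the table, for example Sale Price per unit of Gr_Liv_Area. The sketch below uses only the figures reported in the table. The ratio is an illustrative measure added here, not something computed in the original analysis, and the dictionary and variable names are hypothetical.

```python
# Centroid values copied from the cluster table above.
centroids = {
    "Cluster 0": {"lot_area": 7140, "overall_quality": 4.11, "gr_liv_area": 956, "sale_price": 96034},
    "Cluster 1": {"lot_area": 8398, "overall_quality": 5.28, "gr_liv_area": 1208, "sale_price": 145929},
    "Cluster 2": {"lot_area": 8420, "overall_quality": 5.08, "gr_liv_area": 1070, "sale_price": 124433},
    "Cluster 3": {"lot_area": 8524, "overall_quality": 6.29, "gr_liv_area": 1240, "sale_price": 167730},
}

# Illustrative derived measure: sale price per unit of above-ground living area.
price_per_area = {
    name: values["sale_price"] / values["gr_liv_area"]
    for name, values in centroids.items()
}

# Order the clusters from lowest to highest price per unit of living area.
ranked = sorted(price_per_area, key=price_per_area.get)

for name in ranked:
    print(f"{name}: {price_per_area[name]:.2f}")
```

Ordered this way, the clusters run from Cluster 0 through Cluster 2 and Cluster 1 to Cluster 3, the same order obtained by ranking them on Sale Price or on Overall Quality.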

These findings will help the client group the houses sold into distinct categories for analysis
and implement strategies for selling similar houses in the future.

Conclusion
Cluster analysis is a statistical technique used to group similar data points or objects into
clusters based on their characteristics. It helps identify patterns or groupings within a dataset,
which can be useful for understanding relationships, segmentation, and making predictions.

Using this analysis, dealers and other stakeholders can group customers into clusters
according to their preferences, and if a customer wants a unique combination of features, they
can make a customized offer for that customer.
