Assignment Data Analysis Example
TABLE OF CONTENTS
INTRODUCTION
AIM, OBJECTIVE AND POSSIBLE OUTCOME
BACKGROUND
RESEARCH PROJECT
DATA MINING PROBLEM (BUSINESS PROBLEM)
DATA UNDERSTANDING
DATA PREPARATION
DATA MINING METHODOLOGY
RESULT EVALUATION
DEPLOYMENT
PLAN AND TIME TABLE
BUDGET
PERSONNEL
INTRODUCTION
Customer complaint management has been approached in various ways, most of them
based on analysing the overall trend across customers. This proposal instead tracks
the trend of each individual customer and predicts the likelihood that the customer
will file a complaint. Why wait for customers to come to us with a complaint? We can
reach out to them first and invite them to report the complaints or issues they are
facing. By doing this we can retain customers, reduce customer churn, improve
brand recognition, increase subscribers and, ultimately, increase revenue.
To make this possible we will build a customer complaint analysis model that helps
the company identify which customers are likely to file a complaint, so that the
company can turn a potentially unhappy customer into a satisfied one.
The aim and objectives of this proposal are stated below:
AIM: The main aim of this project is to develop an approach to predict the likelihood
of a customer complaint, in support of customer complaint management.
Objectives:
1) Collect a dataset of customer complaint details from various sources,
such as the company's departments, websites and call centres and, in
Vodafone's case, Vodafail.com
2) Divide the dataset into a training set and a test set
3) Develop a customer complaint analysis model
4) Validate and deploy the model developed
This project follows the CRISP-DM approach. It is also a pilot project, concentrating
on a low-coverage pilot phase.
BACKGROUND
This phase starts with the initial data collection and its report, followed by the data
description report, the data exploration report and the data quality report, which
verifies whether the data needs to be audited. We will use the existing data that
Vodafone should already hold: customer name, customer phone number, phone status,
network status, location, plan used, age, gender, history of complaints, social media
information and whether the customer has complained. The only data that needs to be
verified is the social media information.
DATA PREPARATION
In this process, we will select, clean, construct, integrate and format the data.
Selecting the data means deciding which data is to be included or excluded.
Cleaning ensures the quality of the data. Construction involves deriving new
attributes from existing attributes. Integration means merging data from different
sources. Formatting covers changes such as reordering the data when required.
Except for the social media data, all data is collected from within the organisation
itself. Customer name and phone number are needed to identify the customer. Date of
birth and gender describe the demographics, which will be very useful for the
analysis. The location, network state and phone state of a customer should show
whether the customers in that location, or any individual customer, are having
trouble with the network or phone service. With the complaint history and social
media data we can see how customers respond to the phone services Vodafone provides,
and whether that response is positive or negative. New instances, attributes and
tables might need to be constructed and integrated to record new data.
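The preparation steps described in this section can be illustrated with a small
sketch. It is not the project's actual pipeline: the record layout, the field names
(such as date_of_birth and past_complaints) and the sample values are all
hypothetical, and integration of several sources is left out for brevity.

```python
from datetime import date

# Hypothetical raw records; field names and values are illustrative only.
raw_customers = [
    {"name": "A. Citizen", "phone": "0400 000 001", "date_of_birth": "1990-05-17",
     "gender": "F", "location": "Sydney", "past_complaints": 2},
    {"name": "B. Citizen", "phone": "0400000002", "date_of_birth": "",
     "gender": "M", "location": "Melbourne", "past_complaints": 0},
]

# Selection: the attributes assumed to be kept for modelling.
SELECTED_FIELDS = ["phone", "date_of_birth", "gender", "location", "past_complaints"]


def age_on(date_of_birth: str, today: date) -> int:
    """Construct an age attribute from an ISO-format date of birth."""
    born = date.fromisoformat(date_of_birth)
    had_birthday = (today.month, today.day) >= (born.month, born.day)
    return today.year - born.year - (0 if had_birthday else 1)


def prepare(records, today):
    """Apply selection, cleaning, formatting and construction to each record."""
    prepared = []
    for record in records:
        # Selection: keep only the chosen attributes.
        row = {field: record.get(field) for field in SELECTED_FIELDS}
        # Cleaning: skip records with a missing date of birth.
        if not row["date_of_birth"]:
            continue
        # Formatting: normalise phone numbers to digits only.
        row["phone"] = "".join(ch for ch in row["phone"] if ch.isdigit())
        # Construction: derive age from date of birth.
        row["age"] = age_on(row["date_of_birth"], today)
        prepared.append(row)
    return prepared


prepared_rows = prepare(raw_customers, today=date(2016, 2, 1))
```

Dropping incomplete records is only one possible cleaning rule; the project may
instead choose to fill in or flag missing values.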
DATA MINING METHODOLOGY
The pre-processed dataset will be divided into a training set (65%) and a test set
(35%) to develop the classifier. The tool used to build the classifiers will be either
Weka or Rattle, depending on the size of the dataset.
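The 65/35 division can be sketched in a few lines of Python. This is only an
illustration of the split, assuming a simple random shuffle; in the project itself
the split would be done inside Weka or Rattle. The function name, the seed and the
sample records are hypothetical.

```python
import random


def split_train_test(records, train_fraction=0.65, seed=42):
    """Shuffle a copy of the records and split it into training and test sets."""
    shuffled = list(records)
    # A fixed seed makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical stand-in for the pre-processed customer records.
customers = [{"customer_id": i, "has_complained": i % 4 == 0} for i in range(100)]
train_set, test_set = split_train_test(customers)
```

A plain random split does not guarantee the same share of complaining customers in
both sets; a stratified split may be preferable if complaints are rare.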
In this phase we generate the model, building profiles of the customers from the
stored data. The design will first be tested in a small area to assess the quality and
validity of the data. After that we will prepare the data to build the final model, and
finally the model will be assessed to determine whether it was a success.
RESULT EVALUATION
In this phase the results of the model are evaluated. The process is reviewed to
ensure that nothing needed to solve the business problem has been missed. The last
step in this phase is to determine the next steps: whether the model needs additional
requirements or we can move to deployment.
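The proposal does not name specific evaluation measures. As one possible
illustration, the sketch below computes accuracy, precision and recall for a
complaint classifier on the test set. The choice of measures, the function name and
the sample labels are assumptions made for this example.

```python
def evaluate(actual, predicted):
    """Return accuracy, precision and recall for binary complaint labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)          # complaint predicted and filed
    fp = sum(1 for a, p in pairs if not a and p)      # complaint predicted, none filed
    fn = sum(1 for a, p in pairs if a and not p)      # complaint filed but not predicted
    tn = sum(1 for a, p in pairs if not a and not p)  # no complaint, none predicted
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical test-set labels: True means the customer filed a complaint.
actual = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]
metrics = evaluate(actual, predicted)
```

Recall is of particular interest here, since a missed complaint means an unhappy
customer who was never contacted.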
DEPLOYMENT
After completing the pilot phase in the small area, we can move to a larger area and
deploy the model. The same process is then repeated, and the model is monitored to
see whether it needs to be upgraded as the size of the area increases. This is how
deployment is planned.
PLAN AND TIME TABLE
The plan below is a rough schedule of the different phases of the data mining project.
These phases can be repeated; hence the duration is presented in elapsed time. Once
all phases have been completed for one targeted location, the plan will be repeated
for each new targeted location.
The plan has a duration of 12 months.
Start Date        Phase                      Duration
Feb, 2016         Business Understanding     1 month
March, 2016       Data Understanding         2.5 months
June, 2016        Data Preparation           1.5 months
July, 2016        Data Mining Methodology    4 months
November, 2016    Result Evaluation          1 month
December, 2016    Deployment                 2 months
The above image is a Gantt chart of the project. The phases and the details of the
project are listed in the table above the Gantt chart.
The outcome of this project will be the identification of customers who are likely to
have complaints, so that the organization can reach them with a suitable strategy
before a complaint is filed. This will act as a warning trigger that makes the
organization aware of unsatisfied customers. If unsatisfied customers are handled
well, they will become satisfied customers who generate revenue for the company.
BUDGET
The project budget shows the estimate of all the costs required to complete the
project. A budget of A$13 million has been allocated according to the table below.
Given the fixed budget for the project, a bottom-up approach has been used to
estimate the cost of each budget component: the cost of each component is decided
individually and the costs are then added up.
The cost estimates will be based on past projects to make them more reasonable,
realistic and achievable.
The budget components and their estimates are shown below:
Budget Component                                     A$ (million)
Personnel Expenses (such as Salary of BA, DA)        3.5
Equipment                                            3
Data collection from other social media              1.5
Other Resources Required (Material and Supplies)     1.5
Deployment cost                                      3.5
Total                                                13
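The bottom-up total can be checked by adding up the component estimates. The short
sketch below does this with the figures from the budget table; the variable names
are chosen for this example only.

```python
# Component estimates in A$ million, taken from the budget table.
budget_components = {
    "Personnel Expenses": 3.5,
    "Equipment": 3.0,
    "Data collection from other social media": 1.5,
    "Other Resources Required": 1.5,
    "Deployment cost": 3.5,
}

# Bottom-up estimation: the total is the sum of the individual components.
total_budget = sum(budget_components.values())
allocated_budget = 13.0
within_budget = total_budget <= allocated_budget
```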
PERSONNEL
The personnel required for this project are as follows:
Top Management / Executives: Includes the CEO, COO and CIO, who will be involved in
sponsoring the project and approving the requirements.
Data Analyst/Data Miner: The person who will analyse the data and build the
model. They will be involved in every step of this project.
Business Analyst: They will evaluate the project and document the requirements to
be handed over to the project manager. They are involved at the beginning of the
project and at the end, when the model is ready for evaluation and deployment.
Employees: Middle- and base-level employees of the organization. They will be
involved in the deployment phase to start operating the model.