ANN Report
PROJECT REPORT
Associate Professor
SCHOOL OF COMPUTING
KATTANKULATHUR
NOVEMBER 2023
BONAFIDE
Certified that this minor project report for the course 18CSE388T ARTIFICIAL
undertook the task of completing the project within the allotted time.
SIGNATURE
Dr. A. Alice Nithya
ANN – Course Faculty
Associate Professor
Department of Computational Intelligence
SRM Institute of Science and Technology
Kattankulathur

SIGNATURE
Dr Annie Uthra R
Head of the Department
Professor
Department of Computational Intelligence
SRM Institute of Science and Technology
Kattankulathur
ABSTRACT
Customer churn, the phenomenon of customers switching from one service provider to another, is
a critical challenge in the highly competitive telecommunications industry. Predicting customer
churn is of paramount importance as it allows companies to take proactive measures to retain
valuable customers.
In this study, we employ Artificial Neural Networks (ANNs) as a predictive model to forecast
customer churn for a telecom company.
The analysis begins with data collection and preprocessing, involving the gathering and cleaning
of historical customer data. Features such as call duration, customer feedback, contract terms, and
billing information are carefully selected to create a comprehensive dataset for training the ANN.
The ANN model is then developed using a deep learning framework, enabling it to identify
complex patterns and relationships within the data. Through the iterative training process, the
model optimizes its internal parameters and weights, becoming increasingly adept at predicting
customer churn.
The results indicate that the ANN model exhibits a high predictive accuracy, enabling the telecom
company to identify potential churners with greater precision. This, in turn, allows the company
to implement targeted retention strategies, such as offering incentives or tailored service packages
to at-risk customers.
In conclusion, the application of Artificial Neural Networks for customer churn prediction in the
telecom industry proves to be a valuable tool for reducing customer attrition rates. By leveraging
this predictive model, telecom companies can enhance customer satisfaction, reduce revenue loss,
and maintain a competitive edge in a rapidly evolving market.
ACKNOWLEDGEMENT
We are highly thankful to our course project faculty, Dr. A. Alice Nithya,
Associate Professor, Department of Computational Intelligence (CINTEL), for
assistance, timely suggestions and guidance throughout the duration of this
course project.
Finally, we thank our parents, friends and near and dear ones who directly and
indirectly contributed to the successful completion of our project. Above all, we
thank the Almighty for showering his blessings on us to complete our course
project.
TABLE OF CONTENTS
1 INTRODUCTION
1.1 Motivation
1.2 Objective
1.3 Problem Statement
1.4 Challenges
2 LITERATURE SURVEY
3 REQUIREMENTS
4 ARCHITECTURE AND DESIGN
5 DATASET DESCRIPTION
6 IMPLEMENTATION
7 RESULTS AND DISCUSSION
8 CONCLUSION
REFERENCES
1. INTRODUCTION
To address this challenge, the use of advanced data analytics and predictive modeling
techniques has become increasingly crucial. This project focuses on the application of
Artificial Neural Networks (ANNs) to predict customer churn for a telecom company.
The primary objective of this project is to develop an accurate and reliable churn prediction
model that will enable the telecom company to identify potential churners proactively. By
leveraging historical customer data and employing deep learning techniques, the company
can take targeted actions to retain valuable subscribers.
The project encompasses several key stages, including data collection, data preprocessing,
model development, and performance evaluation. It starts by gathering a comprehensive
dataset that contains various customer attributes, such as call behavior, contract terms, and
billing information. This dataset is carefully prepared and cleaned to ensure the accuracy and
quality of the data.
The heart of this project is the ANN model, a machine learning technique inspired by the
structure and functioning of the human brain. ANNs are particularly well-suited for
identifying intricate patterns and relationships within the data. As the model is trained
iteratively, it adjusts its internal parameters and weights to make increasingly precise churn
predictions.
Once the ANN model is trained and fine-tuned, it is rigorously evaluated to determine its
effectiveness. Key performance metrics, such as accuracy, precision, recall, and the area
under the ROC curve, are used to assess the model's predictive power. These metrics provide
insights into the model's ability to correctly classify customers as potential churners or
non-churners.
1.1 Motivation:
The motivation behind customer churn prediction in the telecom sector using Artificial Neural
Networks is multifaceted. It encompasses business sustainability, revenue protection, resource
optimization, customer satisfaction, competitiveness, data-driven decision-making,
technological advancements, and the pursuit of predictive insights. Addressing this challenge
is pivotal in ensuring telecom companies remain agile and competitive in a rapidly evolving
market.
The following are some motivations:
● Business Survival and Growth
● Revenue Protection
● Competitive Advantage
● Predictive Insights
1.2 Objective:
1. Develop an accurate ANN-based model for predicting customer churn in the telecom
company's subscriber base.
2. Identify the key features and variables that significantly contribute to churn prediction.
3. Implement real-time churn predictions for individual customers.
4. Segment customers based on their likelihood to churn, allowing tailored retention
strategies.
5. Assess the cost savings and revenue retention achieved through model deployment.
6. Utilize insights from churn predictions to enhance overall customer experiences.
7. Gain a competitive advantage in the telecom industry through effective customer
retention.
1.3 Problem Statement:
The problem at hand is to develop a robust and accurate Customer Churn Prediction model
using Artificial Neural Networks (ANNs) for a telecom company. The primary goal is to
identify customers at risk of churning based on historical data and patterns. This model will
enable the telecom company to take timely and personalized retention actions, thereby
reducing churn rates and enhancing overall customer satisfaction.
1.4 Challenges:
1. Data Quality and Quantity: Telecom companies deal with vast amounts of customer
data. Ensuring the quality, completeness, and accuracy of this data is a challenge.
Incomplete or noisy data can lead to model inaccuracies.
2. Data Privacy and Ethical Concerns: With sensitive customer information at stake,
ensuring data privacy and complying with regulations like GDPR is a significant
challenge. Balancing data utility with privacy is a complex task.
3. Feature Selection: Identifying the most relevant features from a plethora of data variables
is challenging. Overlooking important features or including irrelevant ones can affect
model performance.
4. Imbalanced Datasets: Customer churn datasets often exhibit class imbalance, with a
small proportion of customers actually churning. This imbalance can affect the model's
ability to predict churn accurately.
5. Model Complexity: Building and training deep learning models like ANNs requires
expertise. Determining the optimal architecture, hyperparameters, and avoiding overfitting
are ongoing challenges.
6. Real-time Processing: Implementing real-time churn prediction systems can be
challenging due to latency concerns. The model should be capable of making predictions
within the required time frame.
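The class imbalance mentioned in item 4 is often handled by weighting the rare class more heavily
during training. The sketch below is one common heuristic, not the approach used in this project:
each class is weighted by n_samples / (n_classes * class_count). The function name
`balanced_class_weights` and the toy labels are assumptions made for illustration.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency (illustrative helper)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    # Rare classes receive weights above 1, frequent classes below 1
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Toy labels: 8 retained customers (0) and 2 churners (1)
labels = [0] * 8 + [1] * 2
weights = balanced_class_weights(labels)
print(weights)
```

A dictionary of this shape can be passed as the `class_weight` argument of Keras `Model.fit`, so
that errors on churners contribute more to the loss than errors on retained customers.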
2. LITERATURE SURVEY
... | that indicates whether the customer left the bank or retained customer.
5. | 2022 | Customer Churn Prediction Using Machine Learning: Commercial Bank of Ethiopia | Machine learning | Data set used are Churn and Non churn data sets | Furthermore, there are working towards developing a machine learning model by incorporating additional attributes for the customer churn prediction using different machine learning techniques.
6. | 2022 | Implementation of Machine Learning Techniques for predicting Credit Card Customer action for Imbalanced Dataset. | the Synthetic Minority Oversampling Technique was applied. | Random Forest is a supervised machine learning approach commonly used to address regression and classification problems | So, in the future, we may expand our research to incorporate neural networks for forecasting client attrition. Experiments may be expanded to include data from other fields such as insurance.
10. | 2020 | Customer churn prediction in banking industry using K-means and support vector machine algorithms | K-means and support vector machine algorithms. | The data is clustered into 3 labels, on the basis of the transaction in and outflow. | This study enables the banking administrators to mine the conduct of their customers and may prompt proper strategies as per engaging quality and improve proper conducts of administrator capacities in customer relationship.
11. | 2019 | Enhanced deep feed forward neural network model for the customer attrition analysis in banking sector | Enhanced Deep Feed Forward Neural Network Model | Enhanced Deep Feed Forward Neural Network Model Employed in telecoms datasets. | The outcome demonstrates that the proposed Enhanced Deep Feed Forward Neural Network Model performs best in accuracy compared with the existing machine learning model in predicting the customer attrition rate with the Banking Sector.
... Discriminant Boosting algorithm | ... algorithm | ... binary classification problems, such as churn prediction with an extremely imbalanced dataset. | ... generalization ability of LD-Boosting.
14. | 2023 | Minimization of Churn Rate Through Analysis of Machine Learning | To forecast customer attrition, K means and a Naive Bayes classifier are combined in a special way | The DMEL method was demonstrated to be inapplicable in a situation with a sizable dataset. | Future research will take into account enhancing the system performance using more sophisticated machine learning algorithms such a neural network, SVM, along with sophisticated assembly techniques like boosting, bagging.
Limitations:
These studies collectively underline the growing trend of utilizing Artificial Neural Networks
and other deep learning techniques for customer churn prediction in the telecom industry.
Researchers have been experimenting with various ANN architectures, feature engineering
methods, and data preprocessing techniques to enhance the accuracy and reliability of churn
prediction models. Additionally, the studies emphasize the importance of real-time
predictions, highlighting the applicability of ANN in handling dynamic and evolving
customer data.
3. REQUIREMENTS
3.2 Hardware Requirement
3.3 Software Requirement
1. Operating System:
2. Programming Language:
Python or another appropriate language for implementing machine learning models and
data processing.
Relevant libraries and frameworks for machine learning and data processing (e.g.,
scikit-learn, TensorFlow, PyTorch).
If needed for data storage, select an appropriate database system (e.g., PostgreSQL,
MySQL).
Web development languages (HTML, CSS, JavaScript) or relevant software for creating a
user-friendly interface.
4. ARCHITECTURE AND DESIGN
Fig. 4.1
Input layer: The input layer is responsible for receiving the input data and passing it on to
the next layer. The input layer neurons are not connected to each other, and they simply pass
on the input data to the next layer without any processing.
Hidden layers: The hidden layers are responsible for learning complex relationships between
the input data and the output data. The hidden layers are made up of interconnected neurons,
and each neuron in a hidden layer learns a different set of weights. The weights of the hidden
layer neurons are learned during the training process.
Output layer: The output layer is responsible for making the final prediction. The output
layer neuron takes the outputs of the hidden layer neurons as input and produces a single
prediction as output. With a sigmoid activation, this output is the predicted probability that
the customer churns.
The model has 1 input layer, 2 hidden layers and 1 output layer.
The input layer has 19 neurons, as we have 19 independent variables in the dataset.
There are 2 hidden layers with 12 neurons, a number determined by repeated
experimentation to achieve the maximum accuracy.
The output layer has 1 neuron, as this is a binary classification problem that
has only 2 possible outputs.
The hidden layers use the ReLU activation function. The Rectified Linear Unit (ReLU) is
a widely used activation function in neural networks. It introduces non-linearity by outputting
the input directly if it is positive and zero otherwise.
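The layer descriptions above can be sketched as a forward pass in NumPy. This is a minimal
illustration, not the trained model: the weights are random placeholders, the 19 inputs follow the
text, and the 6 units per hidden layer follow the implementation listing later in this report. The
names `relu`, `sigmoid` and `forward` are chosen here for illustration.

```python
import numpy as np

def relu(z):
    # Pass positive values through unchanged and clamp negatives to zero
    return np.maximum(0.0, z)

def sigmoid(z):
    # Squash any real value into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, params):
    """Run a batch x of shape (n_samples, n_features) through the network."""
    (w1, b1), (w2, b2), (w3, b3) = params
    h1 = relu(x @ w1 + b1)         # first hidden layer
    h2 = relu(h1 @ w2 + b2)        # second hidden layer
    return sigmoid(h2 @ w3 + b3)   # single output neuron: churn probability

# Placeholder parameters: small random weights stand in for trained values
rng = np.random.default_rng(0)
sizes = [19, 6, 6, 1]
params = [(rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

batch = rng.normal(size=(4, 19))   # four synthetic, already-scaled customers
probs = forward(batch, params)
print(probs.shape)                 # one probability per customer
```

Each row of `probs` is a single value between 0 and 1, which is thresholded (commonly at 0.5) to
obtain the churn / no-churn decision.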
We used the Adam optimizer for the model and binary cross entropy as the loss function.
Adam optimizer is a stochastic gradient descent (SGD) optimizer that is based on adaptive
estimation of first-order and second-order moments. It is one of the most popular optimizers
in machine learning due to its efficiency and effectiveness.
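To make the "first-order and second-order moments" concrete, the following sketch applies the
standard Adam update rule to a single parameter. It is an illustration under stated assumptions:
the objective f(x) = (x - 3)^2, the learning rate and the step count are chosen here and are not
taken from the project; in the actual model the same rule is applied to every weight by the
framework.

```python
import math

def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimize a one-dimensional function given its gradient (illustrative)."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # first moment: running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # second moment: running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero initialization
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)   # per-parameter adaptive step
    return x

# Gradient of f(x) = (x - 3)^2 is 2 * (x - 3); the minimum is at x = 3
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 3))
```

Dividing by the square root of the second moment keeps the step size roughly independent of the
gradient's scale, which is one reason Adam needs little manual tuning.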
Binary cross entropy (BCE) is a loss function used to train models for binary classification
tasks. It measures the difference between the predicted probabilities and the actual labels.
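For N samples with labels y_i in {0, 1} and predicted probabilities p_i, the loss is
BCE = -(1/N) * sum over i of [ y_i * log(p_i) + (1 - y_i) * log(1 - p_i) ]. The sketch below
computes it directly; the function name, the clipping constant `eps` and the sample values are
assumptions for illustration.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)   # clip so that log(0) never occurs
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

# Same labels, once with confident correct predictions and once with confident wrong ones
confident_right = binary_cross_entropy([1, 0], [0.9, 0.1])
confident_wrong = binary_cross_entropy([1, 0], [0.1, 0.9])
print(confident_right < confident_wrong)
```

The loss is small when the predicted probability is close to the true label and grows sharply for
confident mistakes, which is what drives the weight updates during training.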
Accuracy metrics measure the performance of machine learning models by evaluating how
well they can predict the correct output for a given input.
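Accuracy, together with the precision and recall mentioned in the introduction, can be computed
from thresholded predictions as sketched below. The helper name `classification_metrics`, the 0.5
threshold and the sample values are assumptions for illustration, not results from this project.

```python
def classification_metrics(y_true, y_prob, threshold=0.5):
    """Return accuracy, precision and recall for binary labels (1 = churn)."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # churners caught
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)   # retained, correctly left alone
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)   # churners missed
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Synthetic example: 3 churners and 5 retained customers
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.3, 0.6, 0.2, 0.1, 0.4, 0.05]
metrics = classification_metrics(y_true, y_prob)
print(metrics)
```

On imbalanced churn data, recall on the churn class is worth reporting next to accuracy, since a
model can score high accuracy while still missing many churners.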
5. DATASET DESCRIPTION
Key Attributes:
4. Geography: Location or country where the customer resides, providing insights into regional
patterns of churn.
5. Gender: Customer's gender, a demographic factor that may contribute to varying churn
behaviors.
6. Age: Age of the customer, a significant factor influencing financial decisions and potentially
affecting churn.
7. Tenure: Duration of the customer's relationship with the company, indicating loyalty or
potential dissatisfaction.
8. Balance: Amount of money in the customer's account, reflecting financial stability and
engagement.
9. NumOfProducts: Number of products or services the customer uses, indicating the extent of
their involvement with the company.
10. HasCrCard: Binary indicator of whether the customer possesses a credit card, influencing
their financial behavior.
11. IsActiveMember: Binary indicator of the customer's activity, signaling engagement and
potential loyalty.
12. EstimatedSalary: Approximate annual salary of the customer, a factor influencing their
financial decisions and propensity to churn.
13. Exited: Binary indicator of whether the customer has churned (1 for yes, 0 for no), the target
variable for prediction.
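Geography and Gender are categorical text attributes, so they have to be converted to numbers
before a neural network can use them. The sketch below shows one common approach: one-hot encoding
for Geography and a 0/1 code for Gender. The helper `one_hot` and the sample category values are
illustrative assumptions.

```python
def one_hot(values):
    """One-hot encode a list of category labels (illustrative helper)."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1          # exactly one column is set per customer
        rows.append(row)
    return categories, rows

# Illustrative values only
geography = ["France", "Spain", "Germany", "France"]
gender = ["Female", "Male", "Male", "Female"]

categories, geography_encoded = one_hot(geography)
gender_codes = {"Female": 0, "Male": 1}
gender_encoded = [gender_codes[g] for g in gender]

print(categories)
print(geography_encoded)
print(gender_encoded)
```

One-hot encoding avoids implying an order between countries, whereas a two-valued attribute such
as Gender can be stored in a single 0/1 column.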
Objective:
The primary objective of using this dataset is to develop a machine learning model that
can accurately predict which customers will churn based on the provided features.
The model is trained to distinguish between churning and non-churning customers.
Use Case:
The dataset is commonly used for binary classification tasks and serves as a valuable
resource for evaluating the performance of machine learning algorithms, particularly in
the context of banking and finance.
Researchers and data scientists use this dataset to build predictive models for customer
churn, which have significant applications in banking.
Data Source:
The dataset is often available through various machine learning libraries, repositories, or
research institutions.
This dataset is widely utilized for educational, research, and practical applications, with
the goal of improving the accuracy and efficiency of Customer Churn Prediction and
contributing to the development of early business strategies.
6. IMPLEMENTATION
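import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, OneHotEncoder, StandardScaler

# Load the data; skip the first three columns and use the last column as the target
dataset = pd.read_csv("Churn_Modelling.csv")
X = dataset.iloc[:, 3:-1].values
Y = dataset.iloc[:, -1].values
print(X)
print(Y)

# Assumed step (missing from the listing): encode the text columns so X is numeric.
# Column 2 is taken to be Gender and column 1 Geography.
X[:, 2] = LabelEncoder().fit_transform(X[:, 2])
print(X)
ct = ColumnTransformer([("geo", OneHotEncoder(), [1])], remainder="passthrough")
X = np.array(ct.fit_transform(X))
print(X)

# Assumed step (missing from the listing): hold out a test set; ratio and seed are illustrative
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=0)

# Standardize features using statistics computed on the training set only
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Two ReLU hidden layers followed by a single sigmoid output neuron
ann = tf.keras.models.Sequential()
ann.add(tf.keras.layers.Dense(units=6, activation='relu'))
ann.add(tf.keras.layers.Dense(units=6, activation='relu'))
ann.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))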
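The StandardScaler calls in the listing fit on the training data and only transform the test data.
The NumPy sketch below shows what that amounts to, so the reason for the asymmetry is visible. The
function names and the tiny two-feature arrays are assumptions for illustration.

```python
import numpy as np

def fit_scaler(x_train):
    """Compute per-feature mean and standard deviation on the training set."""
    mean = x_train.mean(axis=0)
    std = x_train.std(axis=0)
    std[std == 0] = 1.0            # leave constant features unscaled
    return mean, std

def apply_scaler(x, mean, std):
    # Reuse the training statistics so no test-set information leaks in
    return (x - mean) / std

# Illustrative values for two features on very different scales
x_train = np.array([[600.0, 40.0], [700.0, 30.0], [800.0, 50.0]])
x_test = np.array([[650.0, 35.0]])

mean, std = fit_scaler(x_train)
x_train_s = apply_scaler(x_train, mean, std)
x_test_s = apply_scaler(x_test, mean, std)
print(x_train_s.mean(axis=0), x_train_s.std(axis=0))
print(x_test_s)
```

After scaling, every training feature has mean 0 and standard deviation 1, which keeps
large-valued columns such as Balance from dominating the weighted sums in the first layer.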
7. RESULTS AND DISCUSSION
Results:
Testing Analysis:
Accuracy Analysis:
The ANN model is evaluated using the accuracy metric, on which it achieved 82%.
The code provides valuable insights into the process of customer churn prediction using
machine learning techniques.
Exploratory analysis and correlation checks are essential for understanding the data and
feature relevance.
8. CONCLUSION
In conclusion, the development of a customer churn prediction system for a telecom company
using an Artificial Neural Network (ANN) model represents a pivotal step in the modern
telecommunications industry. The objective of this project is to tackle the persistent challenge of
customer churn, a phenomenon that can significantly impact a telecom company's bottom line. By
harnessing the power of ANN, this predictive system empowers the company to analyze and
interpret vast amounts of customer data and behavioral patterns.
The importance of this project cannot be overstated, as it has the potential to revolutionize
customer relationship management within the telecom sector. By proactively identifying potential
churn risks, the company can tailor retention strategies and marketing efforts, ultimately ensuring
higher customer satisfaction and loyalty. This, in turn, can lead to enhanced revenue and
long-term profitability.
The implementation of this ANN-based churn prediction system serves as a testament to the
evolving landscape of telecommunications. As technology continues to advance, telecom
companies that invest in predictive analytics and artificial intelligence models are better equipped
to meet the ever-growing demands and expectations of their customers. In the competitive
telecom industry, the ability to foresee and mitigate customer churn is a vital step toward
sustainable success.
REFERENCES