Applications of Data Mining in The Banking Sector
Abstract
In today's era of globalization and cut-throat competition, banks are struggling to gain a
competitive edge over each other. Apart from the execution of business processes, the creation of a
knowledge base and its utilization for the benefit of the bank is becoming a strategic tool to
compete. In recent years the ability to generate, capture and store data has increased
enormously, and the information contained in this data can be very important. The wide
availability of huge amounts of data and the need to transform such data into knowledge
encourage the IT industry to use data mining. The banking industry around the world has
undergone a tremendous change in the way business is conducted, and it has
started to realize the need for techniques like data mining that can help banks compete
in the market. Leading banks are using data mining (DM) tools for customer segmentation
and profitability, credit scoring and approval, predicting payment default, marketing,
detecting fraudulent transactions, etc. This paper provides an overview of the concept of DM
and highlights the applications of data mining to enhance the performance of some of the
core business processes in the banking industry.
1. Introduction
Technological innovations have helped the banking industry to open up efficient delivery
channels.
IT has helped the banking industry to deal with the challenges the new economy poses.
Nowadays, banks have realized that customer relationships are a very important factor in
their success. Customer relationship management (CRM) is a strategy that can help them to
build long-term associations with their customers and increase their revenues and profits.
CRM is of great significance in the banking sector. The focus of CRM is shifting from
customer acquisition to customer retention and to ensuring that appropriate amounts of time,
money and managerial resources are directed at both of these key tasks. The challenge
banks face is how to retain the most profitable customers and how to do so at the lowest cost.
At the same time, they need to find and implement a solution quickly, and the solution needs to be
flexible. Traditional methods of data analysis have long been used to detect fraud. They
require difficult and time-consuming investigations that deal with different realms of
knowledge such as finance, economics, business practices and law. Fraud instances can be
similar in content and appearance but are usually not identical. In developing countries like
India, bankers face more problems with fraudsters.
2. Literature Review
Data mining is the process of extracting hidden, previously unknown, valid and actionable information
from large databases and then using this information to make crucial business decisions.
Previously unknown means that the results are not hypothesized in advance; valid means that
patterns must be verified, because when a large collection of data is scrutinized, patterns that are not really there may be found; and actionable means
that the information must be translated into some business advantage (Han et al., 2011). Data mining
is the application of statistical and machine-learning techniques for extracting interesting
patterns from raw data (Hsu et al., 2012). Data mining is also referred to as knowledge mining from
data, knowledge extraction, data/pattern analysis, data archaeology or data dredging. It
turns a large collection of data into knowledge (Liao et al., 2012). With the mounting growth
of data in every application, data mining meets the need for
effective, scalable and flexible data analysis. Data mining is the process of identifying and
discovering interesting patterns in massive amounts of data (Mabroukeh and Ezeife,
2010). Data mining can be conducted on any kind of data as long as the data are meaningful
for a target application. Data mining can be considered a natural evolution of information
technology and a confluence of several related disciplines and application domains (Blake
and Mangiameli, 2011).
Vivek Bhambri, in his paper "Application of Data Mining in Banking Sector", said that data
mining techniques can be of immense help to banks and financial institutions in this
arena: for better targeting and acquisition of new customers, fraud detection in real time,
provision of segment-based products for better targeting of customers, analysis of
customers' purchase patterns over time for better retention and relationships, and detection of
emerging trends so as to take a proactive approach in a highly competitive market, adding a lot more
value to existing products and services and launching new product and service bundles.
Data mining has a wide application domain in almost every industry where data is
generated; that is why data mining is considered one of the most important frontiers in
database and information systems and one of the most promising interdisciplinary
developments in information technology.
Dr. Madan Lal Bhasin, in his paper "Data Mining: A Competitive Tool in the Banking and
Retail Industries", concluded that data mining is a tool used to extract important information
from existing data and to enable better decision-making throughout the banking and retail
industries. These industries use data warehousing to combine various data from databases into an
acceptable format so that the data can be mined. The data is then analysed and the
information that is captured is used throughout the organisation to support decision-making.
3. Data Mining
Data mining is a knowledge discovery process. It helps us understand the substance of the
data in unsuspected ways. It unearths patterns and trends in the raw data that we never
knew existed. Data mining centres on the automated discovery of new facts and
relationships in data. With traditional query tools, we search for known information. Data
mining tools enable us to uncover hidden information. The assumption is that more useful
knowledge lies hidden beneath the surface.
Data might be one of the most valuable resources of any bank, but only if the bank knows how to
expose the valuable knowledge hidden in raw data. Data mining allows a bank to extract knowledge
from historical data and to predict the outcomes of future situations. It helps optimize
business decisions, increase the value of each customer and communication, and improve
customer satisfaction.
the bank from the use of just these two clusters may be enormous enough so that they may
simply ignore the other eighteen clusters.
If there are only two or three variables or dimensions, it is fairly easy to spot the clusters,
even when dealing with many records. But if we are dealing with 500 variables from 100,000
records, we need a special tool. The most common clustering techniques are the K-nearest
neighbour, the Naive Bayes technique and self-organizing maps.
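To make the idea of grouping records into clusters concrete, the following is a minimal sketch in Python. It uses k-means, which is not one of the techniques named above; it is swapped in here only because it is compact and widely used. The customer records, the choice of two variables and the function name kmeans are all assumptions made for illustration.

```python
import math
import random

def kmeans(records, k, iterations=20, seed=0):
    """Group numeric records into k clusters (plain k-means sketch)."""
    rng = random.Random(seed)
    centroids = rng.sample(records, k)  # start from k distinct records
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each record joins the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for rec in records:
            nearest = min(range(k), key=lambda i: math.dist(rec, centroids[i]))
            clusters[nearest].append(rec)
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical customers: (average balance in thousands, transactions per month).
customers = [(1.0, 2.0), (1.2, 1.8), (0.9, 2.2),
             (9.0, 30.0), (9.5, 28.0), (8.8, 31.0)]
centroids, clusters = kmeans(customers, k=2)
for members in clusters:
    print(members)
```

With two variables the two groups are visible by eye; the same loop works unchanged for records with many more variables, which is where a tool becomes necessary.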
4.2 Decision Trees
This technique applies to classification and prediction. The major attraction of decision trees
is their simplicity. By following the tree, we can decipher the rules and understand why a
record is classified in a certain way. Decision trees represent rules. We can use these rules to
retrieve records falling into a certain category.
A decision tree represents a series of questions. Each question determines which follow-up
question is best asked next. Good questions produce a short series. Trees are drawn with
the root at the top and the leaves at the bottom, an unnatural convention. The question at the
root must be the one that best differentiates among the target classes. A database record enters
the tree at the root node. The record works its way down until it reaches a leaf. The leaf node
determines the classification of the record.
The decision tree algorithms build the trees in the following manner. First, the algorithm
attempts to find the test that will split the records in the best possible manner among the
wanted classifications. At each lower level node from the root, whatever rule works best to
split the subsets is applied. This process of finding each additional level of the tree continues.
The tree is allowed to grow until we cannot find better ways to split the input records.
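The tree-building procedure described above can be sketched as follows. This is a minimal illustration, not a specific published algorithm: the text does not say how the "best" split is measured, so Gini impurity is assumed here, and the loan records, labels and function names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0.0 when every label in the group is the same."""
    total = len(labels)
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature index, threshold) of the test that lowers impurity most,
    or None when no test improves on the unsplit node."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        for threshold in sorted({row[f] for row in rows}):
            left = [lab for row, lab in zip(rows, labels) if row[f] <= threshold]
            right = [lab for row, lab in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, threshold), score
    return best

def build_tree(rows, labels):
    """Grow the tree until no better split can be found."""
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority class
    f, threshold = split
    left = [i for i, row in enumerate(rows) if row[f] <= threshold]
    right = [i for i, row in enumerate(rows) if row[f] > threshold]
    return {
        "feature": f,
        "threshold": threshold,
        "left": build_tree([rows[i] for i in left], [labels[i] for i in left]),
        "right": build_tree([rows[i] for i in right], [labels[i] for i in right]),
    }

def classify(tree, row):
    """Enter at the root and follow the questions down to a leaf."""
    while isinstance(tree, dict):
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree

# Hypothetical loan records: (income in thousands, years with the bank).
rows = [(20, 1), (25, 6), (60, 2), (80, 7), (30, 1), (75, 5)]
labels = ["default", "repaid", "repaid", "repaid", "default", "repaid"]
tree = build_tree(rows, labels)
print(tree)
print(classify(tree, (50, 1)))
```

Because each node stores the feature and threshold it tests, the path from root to leaf can be read back as a rule, which is the simplicity the section refers to.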
4.3 Memory-Based Reasoning
We are all good at making decisions on the basis of our experiences. We depend on the
similarities of the current situation to what we know from past experience. We use that
experience to solve the current problem by identifying similar instances in the past and then
applying the information about those instances to the present. The
same principles apply to the memory-based reasoning (MBR) algorithm.
MBR uses known instances of a model to predict unknown instances. This data mining
technique maintains a dataset of known records. The algorithm knows the characteristics of
the records in this training dataset. When a new record arrives for evaluation, the algorithm
finds neighbours similar to the new record, then uses the characteristics of the neighbours for
prediction and classification. When a new record arrives at the data mining tool, first the tool
calculates the distance between this record and the records in the training dataset. The
distance function of the data mining tool does the calculation. The results determine which
data records in the training dataset qualify to be considered as neighbours to the incoming
data record. Next, the algorithm uses a combination function to combine the results of the
various distance functions to obtain the final answer. The distance function and the
combination function are key components of the memory-based reasoning technique.
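The two components named above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: a single Euclidean distance function is used, the combination function is a majority vote over the nearest neighbours, and the training records and labels are hypothetical.

```python
import math
from collections import Counter

def distance(a, b):
    """Distance function: Euclidean distance between two numeric records."""
    return math.dist(a, b)

def combine(neighbour_labels):
    """Combination function: majority vote among the neighbours' labels."""
    return Counter(neighbour_labels).most_common(1)[0][0]

def mbr_classify(training, new_record, k=3):
    """Classify new_record from the k most similar known records."""
    ranked = sorted(training, key=lambda item: distance(item[0], new_record))
    return combine([label for _, label in ranked[:k]])

# Hypothetical training dataset: ((debt ratio, missed payments), known outcome).
training = [
    ((0.1, 0), "low risk"), ((0.2, 0), "low risk"), ((0.15, 1), "low risk"),
    ((0.8, 4), "high risk"), ((0.9, 5), "high risk"), ((0.7, 3), "high risk"),
]
print(mbr_classify(training, (0.2, 1)))
print(mbr_classify(training, (0.85, 4)))
```

Swapping in a different distance or combination function changes the behaviour of the technique without touching the rest of the procedure, which is why the two are treated as its key components.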
4.4 Link Analysis
This algorithm is extremely useful for finding patterns in relationships. If we look at the
business world closely, we clearly notice all types of relationships. Airlines link cities
together. Telephone calls connect people and establish relationships. We notice relationships
everywhere. The link analysis technique mines relationships and discovers knowledge.
Depending upon the types of knowledge discovery, link analysis techniques have three types
of applications: associations discovery, sequential pattern discovery, and similar time
sequence discovery. Let us briefly discuss each of these applications.
critical issue. To do this, banks need to invest their resources to better understand their
existing and prospective customers. By using suitable data mining tools, banks can
subsequently offer tailor-made products and services to those customers. There are
numerous areas in which data mining can be used in the banking industry, including
customer segmentation and profitability, credit scoring and approval, predicting payment
default, marketing, detecting fraudulent transactions, cash management and forecasting
operations, optimizing stock portfolios, and ranking investments. In addition, banks may use
data mining to identify their most profitable credit card customers or high-risk loan
applicants. Data mining is also used to help banks retain credit card customers. By analysing
past data, data mining can help banks predict which customers are likely to change their credit
card affiliation, so that they can plan and launch special offers to retain those customers.
Credit card spending by customer groups can also be identified by using data mining. The following
are some examples of how the banking industry has been effectively utilizing data mining in
these areas.
5.1 Marketing
One of the most widely used areas of data mining for the banking industry is marketing. The
bank's marketing department can use data mining to analyse customer databases. Data
mining carries out various analyses on collected data to determine consumer behaviour with
reference to product, price and distribution channel. The reaction of customers to
existing and new products can also be determined, on the basis of which banks will try to promote the
product, improve the quality of products and services, and gain competitive advantage. Bank
analysts can also analyse past trends, determine present demand and forecast
customer behaviour for various products and services in order to grab more business
opportunities and anticipate behaviour patterns. Data mining techniques also help to distinguish
profitable customers from non-profitable ones. They can be used to
determine how customers will react to adjustments in interest rates and to assess the risk profile of a
customer segment for defaulting on loans.
5.2 Risk Management
Data mining is widely used for risk management in the banking industry. Bank executives
need to know whether the customers they are dealing with are reliable or not. Offering new
customers credit cards, extending existing customers' lines of credit, and approving loans can
be risky decisions for banks if they do not know anything about their customers. Banks
provide loans to their customers after verifying various details relating to the loan, such as
the amount of the loan, lending rate, repayment period, type of property mortgaged, demography,
income and credit history of the borrower. Customers who have been with the bank for longer periods and who belong to high
income groups are likely to get loans very easily. Even though banks are cautious while
providing loans, there is always a chance of default by customers. Data mining techniques help
to distinguish borrowers who repay loans promptly from those who do not. By
using data mining techniques, bank executives can also analyse the behaviour and reliability of customers
when selling credit cards. These techniques also help to analyse whether a customer will make prompt
or delayed payments if a credit card is sold to them. Credit scoring, in fact, was one of the
earliest financial risk management tools developed. Credit scoring can be valuable to lenders
in the banking industry when making lending decisions. Data mining can also derive the
credit behaviour of individual borrowers with instalment, mortgage and credit card loans,
using characteristics such as credit history, length of employment and length of residency. A
score is thus produced that allows a lender to evaluate the customer and decide whether the
person is a good candidate for a loan, or whether there is a high risk of default. By knowing what
the chances of default are for a customer, the bank is in a better position to reduce the risks.
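As a rough illustration of how such a score might be produced, the sketch below combines the three characteristics mentioned above in a logistic function. The weights, the bias, the cutoff and the field names are all hypothetical and chosen only for illustration; in practice they would be fitted to a bank's historical loan data.

```python
import math

# Hypothetical weights, for illustration only (not fitted to any real data).
# Positive weights raise the estimated chance of default, negative ones lower it.
WEIGHTS = {"missed_payments": 0.8, "years_employed": -0.15, "years_at_address": -0.10}
BIAS = -1.0

def default_probability(applicant):
    """Logistic score: estimated probability of default, between 0 and 1."""
    z = BIAS + sum(WEIGHTS[name] * applicant[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def decide(applicant, cutoff=0.5):
    """Turn the score into a lending recommendation using an assumed cutoff."""
    return "high risk" if default_probability(applicant) >= cutoff else "good candidate"

steady = {"missed_payments": 0, "years_employed": 10, "years_at_address": 8}
risky = {"missed_payments": 4, "years_employed": 1, "years_at_address": 1}
for applicant in (steady, risky):
    print(round(default_probability(applicant), 3), decide(applicant))
```

The cutoff is a business decision: lowering it rejects more applicants and reduces defaults, while raising it accepts more applicants at a higher risk.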
time for better retention and relationship. Those banks that have realized the usefulness of
data mining and are in the process of building a data mining environment for their
decision-making process will obtain huge benefits and derive considerable competitive advantage in
the future.
7. References
1) Vivek Bhambri, "Application of Data Mining in Banking Sector", International
Journal of Computer Science and Technology, Vol. 2, Issue 2, June 2011.
2) Dr. Madan Lal Bhasin, "Data Mining: A Competitive Tool in the Banking and Retail
Industries", The Chartered Accountant, October 2006.
3) Han, J., M. Kamber and J. Pei, "Data Mining: Concepts and Techniques", 3rd Ed.,
Elsevier, Burlington, ISBN: 9780123814807, pp: 744, 2011.
4) Hsu, F.M., L.P. Lu and C.M. Lin, "Segmenting customers by transaction data with
concept hierarchy", 2012.
5) Liao, S.H., P.H. Chu and P.Y. Hsiao, "Data mining techniques and applications: A
decade review from 2000 to 2011", 2012.
6) Mabroukeh, N.R. and C.I. Ezeife, "A taxonomy of sequential pattern mining
algorithms", 2010.
7) Blake, R. and P. Mangiameli, "The effects and interactions of data quality and
problem complexity on classification", 2011.