Proceedings of the 8th WSEAS Int. Conf. on ARTIFICIAL INTELLIGENCE, KNOWLEDGE ENGINEERING & DATA BASES (AIKED '09)
DATA DRIVEN DECISION SUPPORT TO SUPERMARKET LAYOUT
IBRAHIM CIL (1), DERYA AY (1), YUSUF S. TURKAN (2)
(1) Sakarya University, Department of Industrial Engineering, Adapazari, TURKEY
(2) Beykent University, Department of Industrial Engineering, Istanbul, TURKEY
icil@sakarya.edu.tr, http://www.icil.sakarya.edu.tr, deryaayie@gmail.com, ysturkan@gmail.com
Abstract: - Knowledge is the most valuable asset in today's dynamic business environment. In many organizations,
decisions are based on a combination of judgment and knowledge extracted from databases. Successful business
organizations must be able to react rapidly to changing market demands, both locally and globally, by using the latest
data mining techniques to extract previously unknown and potentially useful knowledge from vast resources of raw
data. In this paper, we propose a methodological framework that uses the knowledge discovery process to provide
data-driven decision support for store layout, and we present an empirical study. The paper
develops a relational database and uses the Apriori algorithm and multidimensional scaling as
methodologies for the store layout problem. For the empirical study, a supermarket analysis was carried out for Migros Turk A.Ş,
a leading Turkish retailing company.
Key-Words: - Data Mining, Decision Support Systems, Association Rules, Market Basket Analysis, Apriori Algorithm,
Multidimensional Scaling Technique, Store Layout
1 Introduction
Data mining is an exciting and challenging field with the ability to solve many complex scientific and business problems. In recent years, the field has seen an explosion of interest from both academia and industry [1]. The increasing volume of data, growing awareness of the human brain's limited capacity to process data, and the increasing affordability of machine learning are the reasons for the growing popularity of data mining [2]. Data mining is a top ten emerging technology.
Data mining is a set of automated techniques used to extract buried or previously unknown pieces of information from large databases, using different criteria, which makes it possible to discover patterns and relationships. The derived information can be used in areas such as decision support, prediction, forecasting and estimation to make important business decisions, which can help give a business a competitive edge.
It is becoming critically important to base decisions on evidence rather than on the convictions of authorities. A decision support system (DSS) is a computer-based information system designed to facilitate the decision-making process for semi-structured tasks. The central issue in DSS is the improvement of decision making [3]. Different technologies have been invented to meet different decision-making goals.
1.1 Data Mining and Decision Support
Data mining is the process of extracting previously unknown, valid and actionable patterns or knowledge from large databases for decision support. In the context of this study, the relationship between data mining and decision support is shown in Fig.1.
[Figure: Database, Data Mining, Patterns or Knowledge, Decision Support]
Fig.1 Data Driven Decision Support
ISSN: 1790-5109
The extraction of hidden predictive information from
large databases is a powerful tool with great potential to
help organizations to define the information market
needs of tomorrow. Data mining tools predict future
trends and behaviors, allowing businesses to make
knowledge-driven decisions that will affect the
company, both short term and long term. The automated
prospective analysis offered by data mining tools of
today is much more effective than the analysis provided
by tools in the past. Data mining answers business
questions that traditionally were too time-consuming to
resolve. Data mining tools search databases for hidden
patterns, finding predictive information that experts may
miss because it was outside their expectations.
Data mining is not new: although the term is relatively
recent, the technology is not. In the 1960s,
Management Information Systems and later, in the
ISBN: 978-960-474-051-2
1970s, Decision Support Systems (DSS) were praised
for their great potential to supply executives with
the mountains of data needed to carry out their jobs. After
the mid-1990s, corporate intranets were developed to support
information exchange and knowledge management. The
primary decision support tools in use included ad hoc
query and reporting tools, optimization and simulation
models, online analytical processing and data
visualization. The data warehouse, in turn, is the
newest form of decision support system [4]. Data mining
has been described as one of the hottest technologies in
decision support applications to date.
Advances in data collection, the widespread use of
bar codes for most commercial products, and the
computerization of many business transactions have
flooded us with information, and generated an urgent
need for new techniques and tools that can intelligently
and automatically assist us in transforming this data into
useful knowledge. Today, there is a huge amount of
information locked up in the mountains of data in
companies' databases, information that is potentially
important but has not yet been discovered.
Since almost all mid to large size retailers today
possess electronic sales transaction systems, retailers
realize that competitive advantage will no longer be
achieved by the mere use of these systems for purposes
of inventory management or facilitating customer checkout. In contrast, competitive advantage will be gained by
those retailers who are able to extract the knowledge
hidden in the data, generated by those systems, and use it
to optimize their marketing decision making. In this
context, knowledge about how customers are using the
retail store is of critical importance and distinctive
competencies will be built by those retailers who best
succeed in extracting actionable knowledge from these
data.
Data mining provides many different techniques to
extract knowledge from data. It is an exciting
multidisciplinary field of research with many
extremely useful applications. These techniques
are becoming more commonly used but have not yet been
adequately applied to store layout. The store layout
problem is motivated by applications known as market
basket analysis, which find relationships between items
purchased by customers.
1.2 Store Layout Based on Knowledge
Store layout is an important retailing decision that can help or hurt sales and store profitability. Store layouts are extremely important because they strongly influence in-store traffic patterns, shopping atmosphere, shopping behavior, and operational efficiency. Most retailers know that their store layout has a great impact on their business, and for this reason they contemplate changes to improve traffic flow, increase shelf space, and increase sales. The effects of retail store layouts are too big to overlook.
Store layout has been well studied in both the academic and the practitioner literature. Merrilees and Miller report that store layout design is one of the more important determinants of store loyalty [5]. Store layout design can play a key role not only in satisfying buyers' requirements but also in influencing their wants and preferences. Store layout affects consumers' price acceptability, which is positively related to purchase intentions. They also report that superstores are currently revolutionizing the nature of retail service, mainly by creating more effective self-service arrangements as a result of improvements in store layout design.
The usual supermarket layout follows an industrial logic, which means placing products that share functional characteristics or origins in the same area, so that the store is organized by product category. Despite improvements, the store layout remains organized in product categories as defined by the manufacturers or category buyers. This approach is company oriented and fails to respond to the needs of the time-pressured consumer. Some retailers are trying to move from this organization to a consumer-oriented one [6].
This paper proposes a new store layout approach based on association rule mining. We assume that an attractive store layout, navigational aids, salespeople contact and in-store events induce transitions from recreational shopping to purchase-oriented shopping, whereas retail crowding and time pressure engender a shift from purchase-oriented shopping to recreational shopping. In addition, it is predicted that environmental design characteristics have a greater impact on shopping path than on purchase decision; marketing interventions exert more influence on purchase decision than on movement; and contextual factors have a comparable effect on shopping path and purchase decision. Retailers can use the proposed model to dynamically improve their in-store conversion rate.
1.3 Association Rules Mining
The applications of data mining in retail enterprises are mainly concentrated in association rule mining. Association rule mining is an initial data exploration approach that is often applied to extremely large data sets, such as grocery store market basket data. It provides valuable information for assessing significant correlations. By mining association rules, marketing analysts try to find sets of products that are frequently bought together. Association rules have also been applied in a variety of other fields.
Market basket analysis is used to determine which products sell together. The input to a market basket analysis is normally a list of sales transactions, where each has two dimensions, one representing a product and the other representing a customer, depending on whether the goal of the analysis is to find which items sell together to the same person. The Apriori algorithm is one of the most widely used and best-known techniques for finding association rules [7]. This algorithm was chosen primarily for its speed of application.
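To make the idea of Apriori-style rule mining concrete, the following is a minimal sketch in Python. It is not the implementation used in this study (the analysis was run in SPSS Clementine); the toy baskets, the function names `apriori` and `rules`, and the thresholds are assumptions chosen only for illustration. Support is the fraction of baskets containing an itemset, and the confidence of a rule A -> B is support(A and B) / support(A).

```python
from itertools import combinations

def apriori(baskets, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    baskets = [frozenset(b) for b in baskets]
    n = len(baskets)
    # Level 1 candidates: every single item seen in the data.
    current = {frozenset([item]) for b in baskets for item in b}
    frequent = {}
    k = 1
    while current:
        # Count how many baskets contain each candidate itemset.
        counts = {c: sum(1 for b in baskets if c <= b) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    found = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                antecedent = frozenset(antecedent)
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    found.append((antecedent, itemset - antecedent, support, confidence))
    return found

# Hypothetical toy baskets, not data from the study.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "butter"},
]
frequent = apriori(baskets, min_support=0.4)
found = rules(frequent, min_confidence=0.7)
```

The prune step is what gives the algorithm its speed: an itemset can only be frequent if all of its subsets are, so most candidates are discarded before any counting takes place.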
1.4 Multi Dimensional Scaling
Multi Dimensional Scaling (MDS) is a whole class of methods for the visualization of high-dimensional data [8]. Visualization involves transforming the high-dimensional geometry into a low-dimensional picture of two or at most three dimensions. The objective of MDS is to perform this transformation in such a way that distances between objects are preserved as much as possible. MDS addresses the problem of how proximity data can be faithfully visualized as points in a low-dimensional Euclidean space. The quality of a data embedding is measured by a stress function, which compares proximity values with the Euclidean distances between the respective points. In marketing research practice, meaningful cross-correlational structures are often determined merely by visual inspection. Thus, the marketing analyst usually aims for a parsimonious representation of the cross-category associations in a compressed and meaningful fashion. MDS techniques are typically employed to accomplish this task.
2 Methodology
This study initially followed the Cross Industry Standard Process for Data Mining (CRISP-DM) methodology [9]. CRISP-DM is vendor-independent, so it can be used with any DM tool and applied to any DM problem. It also defines the tasks for each phase and the deliverables for each task. Its authors describe the methodology in terms of a hierarchical process model consisting of sets of tasks described at four levels of abstraction (from general to specific): phase, generic task, specialized task and process instance. At the top level, the data mining process is organized in six phases: 1) business understanding, 2) data understanding, 3) data preparation, 4) modeling, 5) evaluation and 6) deployment.
2.1 System Framework
This paper proposes an integrated framework for the store layout problem. The proposed framework proceeds in a stepwise manner as depicted in Fig.2. The proposed procedure begins with association rule mining from transaction data.
[Figure: Migros Turk AS Database (Transactions Data, Barcode Data); Data Integration (SQL Server 2005); Data Preparation & Data Transforming (SPSS Clementine); Data Mining, Association Rules, Apriori Algorithm (SPSS Clementine); Multidimensional Scaling Technique (SPSS); Knowledge & Store Layout; Customers]
Fig.2 Flowchart of the Proposed Approach
In the first stage, following the CRISP-DM methodology, market basket analysis is carried out to extract knowledge that will be used in store layout design. In the next stage, MDS is used for visualization, i.e., the procedure executes MDS to display the set of products in the store space. By assembling categories with the strong buying association rules obtained in the first stage, we have tried to propose a new store layout in which consumers find everything they want in the same store area, maximizing the consumer's use of the time spent in the store. The proposed framework is applied in the next section in an empirical study.
3 Empirical Study and Results
This paper develops a relational database and uses the Apriori algorithm and Multi Dimensional Scaling techniques as methodologies for store layout. The knowledge extracted by association rule mining is presented as knowledge patterns/rules and clusters in order to propose suggestions and solutions to the case firm for its store layout. This paper investigates the store layout issue based on market basket analysis at Migros Turk AS.
Migros Turk AS is one of Turkey's largest supermarket chains. Founded in 1954, Migros Turk, the leader of the Turkish retail sector, has 7,000 employees and serves 160 million customers a year. Listed on the Istanbul Stock Exchange under the symbol MGRS, Migros Turk serves Turkish consumers with seven different retailing operations: Migros Stores, Sok Stores, Shopping Centers, Ramstores, Online Shopping, Bakkalim and Wholesale Stores. Migros Turk operates 150 Migros stores, 292 Sok stores, 3 Ramstores in Baku, 3 Ramstore Shopping Centers in Moscow, and Migros Shopping Centers in Beylikduzu, Ankara and Antalya. For more information on Migros Turk, visit http://www.migros.com.tr.
First of all, we need to measure the relationships among products. To do so, we obtained a six-month database from a Turkish supermarket. The database has 1 million transactions during the period (from 1.1.2005 to 31.1.2005). The data were provided as two text files. The first file contains the transactions during the period. Each transaction has the date, cache no, receipt no, and barcode. Fig.3 shows the raw data.
Fig.3 A sale transaction table
The second database file contains product data. The first step toward data mining is to load the text files into SQL Server 2005. This study established relational database tables and transferred them to MS SQL Server within an ODBC environment in order to make the data tables available to SPSS Clementine. The text-formatted transaction records were loaded into a relational database for querying. Because cache no values repeat, we created a new field, customer no (Mid), which combines date, cache no and receipt no, using an SQL query; in total, 186886 customers were identified.
The store carries 12077 different products grouped in 558 product categories. The product data file contains barcode, product name, and group item code (Fig.4). Migros Turk uses a grid layout. In the current layout, Migros Turk has a category hierarchy with item categories and sub-item categories. In this context we have determined 35 main categories.
Fig.4 A sale item description relation
At the end of this stage, we constructed a pivot table in which the columns are item categories and the rows are customer numbers (Fig.5). We chose all 35 categories to construct a correlation matrix.
Fig.5 Pivot Table
Data modeling is where the data mining software is used to generate results for various situations. In this study, SPSS Clementine is employed as the data mining tool for analysis. Data processing in Clementine is done through nodes, which are connected together to form a stream. In addition, data visualization can be presented to users after the mining process has been done. Fig.6 shows our final model.
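The data preparation steps described above (deriving a customer number from date, cache no and receipt no, mapping barcodes to categories, and pivoting to one row per customer) might be sketched as follows. This is only an illustration in Python; the study itself used SQL Server 2005 and SPSS Clementine. The sample rows, the category names, the `|` separator and the helper `customer_no` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical raw rows in the layout of the first text file:
# (date, cache no, receipt no, barcode)
transactions = [
    ("2005-01-03", 1, 101, "869001"),
    ("2005-01-03", 1, 101, "869002"),
    ("2005-01-03", 2, 101, "869003"),  # same receipt no on a different cache no
    ("2005-01-04", 1, 101, "869001"),  # same cache no and receipt no, next day
]

# Hypothetical product lookup from the second text file: barcode -> main category
category_of = {"869001": "dairy", "869002": "bakery", "869003": "dairy"}

def customer_no(date, cache_no, receipt_no):
    """Combine the three fields so that repeated receipt numbers stay distinct."""
    return f"{date}|{cache_no}|{receipt_no}"

# Group purchased categories by derived customer number.
baskets = defaultdict(set)
for date, cache_no, receipt_no, barcode in transactions:
    baskets[customer_no(date, cache_no, receipt_no)].add(category_of[barcode])

# Pivot table: one row per customer, one 0/1 column per category.
categories = sorted(set(category_of.values()))
pivot = {cid: [1 if c in cats else 0 for c in categories]
         for cid, cats in baskets.items()}
```

Keying on all three fields matters because a receipt number alone can recur across cash points and days, which would otherwise merge unrelated purchases into one basket.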
Fig.6 Created Model in SPSS Clementine
Association rules are discovered by the Apriori algorithm, with minimum support and minimum confidence set to 0.05% and 20.0%, respectively. In total, 518 rules are identified and presented in Fig.7.
Fig.7 Association rules of basic product (min sup=0.05%, min conf=20%)
A diagram was produced to determine product-purchasing relationships. In the process of discovering affinities in the basket contents, Fig.8 illustrates associations between the products, with the darker lines indicating the strongest associations. The Web diagram suggests a strong association between I5, I10 and I21 purchases. Further analysis using an association rule detection technique was performed to find out more about this group.
Fig.8 Association diagram (Line Val. >18%)
We use the buying association measure to create a category correlation matrix, and we apply the multidimensional scaling technique to display the set of products in the store space.
Here, the MDS technique provides us with a tool for visualizing the database at different levels of detail. We have chosen 35 different categories to construct a correlation matrix. Once the correlation matrix is established, we are able to calculate the spatial representation of these relationships through MDS. We treated the data as distances and used an asymmetric matrix to produce the results. In order to use the correlation matrix as distances among categories, we inverted the values by subtracting each value from 1. So, if two categories have a strong correlation, their proximity value will be small, which means that those categories are similar and should be placed close together on the map.
Fig.9 MDS on Association Matrix
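The conversion from association strengths to distances described above can be illustrated with a short Python sketch. The 3 x 3 matrix is invented for illustration and the function names are hypothetical. The symmetrization step is an optional extra, not part of the reported procedure, which used an asymmetric matrix; it is shown only because many MDS routines expect symmetric input.

```python
def to_dissimilarity(similarity):
    """Map association values in [0, 1] to distances via d = 1 - s."""
    n = len(similarity)
    return [[0.0 if i == j else 1.0 - similarity[i][j] for j in range(n)]
            for i in range(n)]

def symmetrize(matrix):
    """Average each pair (i, j) and (j, i); an assumption, not the paper's step."""
    n = len(matrix)
    return [[(matrix[i][j] + matrix[j][i]) / 2 for j in range(n)]
            for i in range(n)]

# Hypothetical asymmetric association matrix for three categories.
similarity = [
    [1.0, 0.8, 0.1],
    [0.6, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
distance = to_dissimilarity(similarity)
symmetric = symmetrize(distance)
```

With this mapping, the strongly associated pair (categories 0 and 1) ends up with the smallest distance, so an MDS solution would place those two categories closest together on the map.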
The model stress is 0.093 and the squared
correlation is 0.96, which means that these results are
acceptable and that the model can be accepted.
All categories are represented in the multidimensional
space as shown in Fig.9.
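For reference, one widely used stress measure is Kruskal's stress-1. The paper does not state which stress formula the software reported, so reading the value above as stress-1 is an assumption. With $d_{ij}$ the Euclidean distance between categories $i$ and $j$ on the map and $\hat{d}_{ij}$ the disparity (the input dissimilarity after the scaling transformation), it can be written as:

```latex
\text{Stress-1} \;=\; \sqrt{\frac{\sum_{i \neq j} \bigl(d_{ij} - \hat{d}_{ij}\bigr)^{2}}{\sum_{i \neq j} d_{ij}^{2}}}
```

A value of 0 means the map reproduces the dissimilarities perfectly. Kruskal's often-quoted rule of thumb rates values near 0.10 as fair and near 0.05 as good, so under that reading a stress of 0.093 would sit close to the "fair" level.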
As shown in Fig.9, four clusters can be formed.
The first cluster comprises dried beans,
spaghetti, canned food, hot drinks, and breakfast
products. Cluster 2 contains oil, sugar, egg, salt,
spices, flour, and dessert products, which are bought together
in many shopping trips. Cluster 3 comprises paper products,
detergent, cleaning products, personal care
products, and floury and diabetic products. Cluster 4
represents tobacco, alcoholic drinks, books, magazines,
writing materials, toys, baby products, and textiles. These
clusters also define four consumption universes for the
new store layout.
Migros Turk's current store layout is shown in Fig.10. According to the results of this study, products with strong associations are already located in the same area, and customers generally visit this area for their shopping. The visited area is outlined by the red rectangle in Fig.10. For space reasons, we do not show the other layout maps.
Fig.10 Current store layout of the Migros branch
By assembling categories with strong buying associations, we have tried to propose a new store layout, where consumers find everything they want in the same store area, maximizing the consumer's use of time spent in the store.
Conclusions
This paper discusses association rule mining for extracting knowledge from a database and proposes a new supermarket store layout based on the associations among categories. This approach allows supermarkets to cluster products around meaningful purchase opportunities related to use association.
Acknowledgments
We would like to thank the Commission for the Scientific Research Projects of Sakarya University. We also would like to thank Mrs. Tiryal Demirkılınç, Manager of Data Warehouse in Migros Turk T.A.Ş.
References
[1] Olafsson, S., Li, X., Wu, S., Operations research and data mining, European Journal of Operational Research 187 (2008), pp. 1429–1448.
[2] Marakas, G.M., Decision support systems in the 21st Century, Prentice-Hall of India, Second Edition, 2004.
[3] Cil, I., Alptürk, O., Yazgan, H.R., A New Collaborative System Framework Based on Multiple Perspectives Approach: InteliTeam, Decision Support Systems 39, 4, 2005, 545-685.
[4] Brohman, M.K., The Business Intelligence Value Chain: Data-Driven Decision Support in a Data Warehouse Environment: An Exploratory Study, Proceedings of the 33rd Hawaii International Conference on System Sciences, 2000.
[5] Merrilees, B., Miller, D., Superstore interactivity: a new self-service paradigm of retail service?, International Journal of Retail & Distribution Management, Volume 29, Number 8, 2001.
[6] Borges, A., Toward a new supermarket layout: from industrial categories to one stop shopping organization through a data mining approach, Proceedings of the 2003 Society for Marketing Advances Annual Symposium on Retail Patronage and Strategy, Montreal, November 4-5, 2003.
[7] Agrawal, R., Srikant, R., Fast algorithms for mining association rules, Proceedings of the 20th VLDB Conference, Santiago, Chile, 1994.
[8] Borg, I., Groenen, P., Modern Multidimensional Scaling, Springer Series in Statistics, Springer, Berlin, 1997.
[9] Shearer, C., The CRISP-DM model: The new blueprint for data mining, Journal of Data Warehousing, 5, 2000.