Ch.5 - Association Rule Mining
Chapter 5
Association mining
• It is also known as “market
basket” analysis, which is a very
effective technique to find the
association of sale of item X with
item Y.
• In simple words, market basket
analysis consists of examining the
items in the basket of shoppers
checking out at a market to see
what types of items “go together”.
Example
• For example, “IF a customer buys bread and milk, THEN he/she also buys
eggs with high probability”. This information helps the store manager
plan the store better and so improve its sales and efficiency.
Applications of Association Mining
Interesting Case
• Sometimes the analysis of sales records leads to very interesting and
unexpected results. In one very popular case study, Walmart USA
found that people buying diapers often also bought beer.
When the beer was placed next to the diapers in their stores, it was found
that the sales of both skyrocketed.
Defining Association Mining
• Association rule mining can be defined as identification of frequent
patterns, correlations, associations, or causal structures among sets of
objects or items in transactional databases, relational databases, and
other information repositories.
• Association rules are generally if/then statements that help in
discovering relationships between seemingly unrelated data in a
relational database or other information repository. For example, ‘If a
customer buys a dozen eggs, he is 80% likely to also purchase milk.’
Association rules
• An association rule consists of two parts, i.e., an antecedent (if) and a
consequent (then). An antecedent is an object or item found in the data,
while a consequent is an object or item found in combination with the
antecedent.
• Association rules are often written as X → Y meaning that whenever X
appears Y also tends to appear. X and Y may be single items or sets of items.
Here, X is referred to as the rule’s antecedent and Y as its consequent.
• For example, a rule found in the sales data of a supermarket could specify
that if a customer buys onions and potatoes together, he or she is also likely
to buy burgers. This rule is represented as onions, potatoes → burgers.
Representation of items for Association Mining
• Let us assume that the number of items
the shop stocks is n.
• For example, if there are 6 items in stock,
namely, bread, milk, diapers, beer, eggs,
and cola, then n = 6 for this shop.
• The item list is represented by I and its
items are represented by {i1, i2, …, in}.
• The number of transactions is
represented by N, i.e., N = 5
for the shop data.
Metrics for evaluating Association rules
• Support
• Confidence
• Lift
Support
• Let N be the total number of transactions. The support of X is the
number of times X appears in the database divided by N, while the support for
X and Y together is the number of times they appear together
divided by N, as given below.
• Support(X) = (Number of times X appears) / N = P(X)
• Support(XY) = (Number of times X and Y appear together) / N = P(X ∩ Y)
• Thus, Support of X is the probability of X while the support of XY is the
probability of X ∩ Y.
Support
• Support(Bread) = Number of times Bread appears / total number of
transactions = 4/5 = P(Bread)
• Support(Milk) = Number of times Milk appears / total number of
transactions = 4/5 = P(Milk)
• Support(Diapers) = Number of times Diapers appears / total number
of transactions = 4/5 = P(Diapers)
• Support(Beer) = Number of times Beer appears / total number of
transactions = 3/5 = P(Beer)
• Support(Eggs) = Number of times Eggs appears / total number of
transactions = 1/5 = P(Eggs)
• Support(Cola) = Number of times Cola appears / total number of
transactions = 2/5 = P(Cola)
• Support(Bread, Milk) = Number of times Bread and Milk appear together
/ total number of transactions = 3/5 = P(Bread ∩ Milk)
• Support(Diapers, Beer) = Number of times Diapers and Beer appear
together / total number of transactions = 3/5 = P(Diapers ∩ Beer)
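The support values above can be checked with a short Python sketch. The transaction table itself is not reproduced in this text, so the five baskets below are an assumed dataset, chosen only because it matches every count listed above. The helper name `support` is illustrative.

```python
# ASSUMPTION: these baskets are not the original table; they were chosen
# so that every item and pair count matches the worked example.
transactions = [
    {"Bread", "Milk"},
    {"Bread", "Diapers", "Beer", "Eggs"},
    {"Milk", "Diapers", "Beer", "Cola"},
    {"Bread", "Milk", "Diapers", "Beer"},
    {"Bread", "Milk", "Diapers", "Cola"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)


for items in (["Bread"], ["Beer"], ["Eggs"], ["Bread", "Milk"], ["Diapers", "Beer"]):
    print(", ".join(items), support(items, transactions))
```

An itemset counts as present in a basket only when all of its items are there, which is why the subset test `itemset <= basket` is used.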
Confidence
• To understand the concept of confidence, let us suppose that the support for X→Y is 80%.
This means that X→Y is very frequent: there is an 80% chance that X and Y will
appear together in a transaction. This would be of interest to the sales manager.
• Let us suppose we have another pair of items (A and B) and the support for A→B is 50%.
• Of course it is not as frequent as X→Y, but if B appears in, say, 90% of the
transactions in which A appears, then the rule A→B would still be of great
interest.
Confidence
• Confidence for X→Y is defined as the ratio of the support for X and Y together
to the support for X (which is the same as the conditional probability of Y given
that X has already occurred).
• Therefore if X appears much more frequently than X and Y appearing together,
the confidence will be low.
• Confidence of (X→Y) = Support(XY) / Support(X) = P(X ∩ Y) / P(X) = P(Y | X)
• P(Y | X) is the probability of Y once X has taken place, also called the
conditional probability of Y.
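As a small illustration of the formula, the sketch below computes the confidence of Diapers→Beer and of Beer→Diapers from the support values worked out earlier (N = 5). The dictionary layout and the function name `confidence` are illustrative choices, not part of the original material.

```python
from fractions import Fraction

# Supports taken from the worked support example (N = 5 transactions).
support = {
    frozenset({"Diapers"}): Fraction(4, 5),
    frozenset({"Beer"}): Fraction(3, 5),
    frozenset({"Diapers", "Beer"}): Fraction(3, 5),
}


def confidence(antecedent, consequent, support):
    """Confidence(X -> Y) = Support(X and Y together) / Support(X)."""
    x = frozenset(antecedent)
    xy = x | frozenset(consequent)
    return support[xy] / support[x]


print(confidence({"Diapers"}, {"Beer"}, support))  # 3/4
print(confidence({"Beer"}, {"Diapers"}, support))  # 1
```

The two directions of the same pair give different confidences because the denominator is the support of the antecedent only.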
Support and Confidence
Another representation of Association rules
Need for “Lift”
• Let us consider the rule below:
• X→Y
• Lift is the ratio of conditional probability of Y when X is given to the unconditional
probability of Y in the dataset.
• In simple words, it is Confidence of X→Y divided by the probability of Y.
Lift = P(Y | X) / P(Y)
Or
Lift = Confidence of (X→Y) / P(Y)
Or
Lift = (P(X ∩ Y) / P(X))/P(Y)
• Thus, lift can be computed by dividing the confidence by the unconditional
probability of consequent Y.
Need for “Lift”
Let us suppose that Coke is a very common sales item in a store and that it appears in most of the
transactions.
Let us suppose that we have a rule Candle→Coke which has a support of 20% and a confidence of
90%.
If Coke is very popular and appears in 95% of transactions, then it will naturally appear quite often
with candles as well. So, the rule associating candles and Coke is not all that useful. But if we find
that Candle→Matchbox also has a support of 20% and a confidence of 90%, then it is reasonable to
suppose that matchbox sales are far less frequent than Coke sales.
This rule suggests that when we sell candles, there is a 90% chance that a matchbox will
also be sold in the same transaction. It is more useful to conclude that a candle sale goes with a
matchbox sale than with a Coke sale, because Coke is popular and appears with almost every item, not just with candles.
As support and confidence are unable to distinguish these two cases, they are handled by the lift of the rule.
In this case, the probability of Y is low for Candle→Matchbox (here, Y is matchbox) and
very high for Candle→Coke (here, Y is Coke), so Candle→Matchbox has the higher lift.
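The comparison can be made concrete with a few lines of Python. The Coke figures (90% confidence, 95% of transactions) come from the text. The text gives no figure for how often a matchbox is sold, so the 25% used below is a made-up value for illustration only.

```python
def lift(confidence, consequent_support):
    """Lift(X -> Y) = Confidence(X -> Y) / P(Y)."""
    return confidence / consequent_support


# From the text: Candle -> Coke has 90% confidence, Coke is in 95% of transactions.
coke_lift = lift(0.90, 0.95)

# ASSUMPTION: 25% is a hypothetical matchbox frequency, not a value from the text.
matchbox_lift = lift(0.90, 0.25)

print(f"Candle -> Coke lift:     {coke_lift:.2f}")
print(f"Candle -> Matchbox lift: {matchbox_lift:.2f}")
```

A lift close to 1 means the consequent is bought about as often with the antecedent as without it, while a lift well above 1 points to a genuine association.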
Naïve algorithm for finding Association rules
• This is a brute force approach to find desired Association mining rules
that satisfy threshold value of support and confidence.
• Find Association rules with minimum support 50% and confidence
75%.
Transaction ID    Items
100               Bread, Cornflakes
101               Bread, Cornflakes, Jam
102               Bread, Milk
103               Cornflakes, Jam, Milk
Naïve algorithm: Step 1
Find the frequency of each item and of every possible combination of items
Itemset                          Frequency
Bread                            3
Cornflakes                       3
Jam                              2
Milk                             2
Bread, Cornflakes                2
Bread, Jam                       1
Bread, Milk                      1
Cornflakes, Jam                  2
Cornflakes, Milk                 1
Jam, Milk                        1
Bread, Cornflakes, Jam           1
Bread, Cornflakes, Milk          0
Bread, Jam, Milk                 0
Cornflakes, Jam, Milk            1
Bread, Cornflakes, Jam, Milk     0
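The counting in Step 1 can be reproduced with a brute-force sketch that enumerates every non-empty combination of items. The transactions are the four given in the table for this example, and the variable names are illustrative.

```python
from itertools import combinations

transactions = {
    100: {"Bread", "Cornflakes"},
    101: {"Bread", "Cornflakes", "Jam"},
    102: {"Bread", "Milk"},
    103: {"Cornflakes", "Jam", "Milk"},
}

items = sorted(set().union(*transactions.values()))

# Brute force: count every non-empty combination of items.
frequency = {}
for size in range(1, len(items) + 1):
    for itemset in combinations(items, size):
        frequency[itemset] = sum(
            1 for basket in transactions.values() if set(itemset) <= basket
        )

for itemset, count in frequency.items():
    print(", ".join(itemset), count)
```

With n items there are 2^n - 1 non-empty itemsets (15 here), which is why this approach becomes impractical as the number of items grows.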
Naïve algorithm: Step 2
Identify the item pairs that meet the support threshold
Naïve algorithm: Step 3
Generate and identify the rules that meet the confidence threshold
Since an association rule needs at least two items, the individual frequent items Bread, Cornflakes, Jam
and Milk are ignored, and the item pairs (Bread, Cornflakes) and (Cornflakes, Jam) are considered for association rule mining.
Now, the association rules for the two 2-itemsets (Bread, Cornflakes) and (Cornflakes, Jam) are determined with a required
confidence of 75%.
It is important to note that every 2-itemset (A, B) can lead to two rules A→B and B→A if both satisfy the required confidence. As
stated earlier, confidence of A→B is given by dividing the support for A and B together, by the support for A.
Therefore, we have four possible rules which are given as follows along with their confidence:
Bread→Cornflakes
Confidence = Support of (Bread, Cornflakes) / Support of (Bread) = 2/3 = 67%
Cornflakes→Bread
Confidence = Support of (Cornflakes, Bread) / Support of (Cornflakes) = 2/3 = 67%
Cornflakes→Jam
Confidence = Support of (Cornflakes, Jam) / Support of (Cornflakes) = 2/3 = 67%
Jam→Cornflakes
Confidence = Support of (Jam, Cornflakes) / Support of (Jam) = 2/2 = 100%
Therefore, only the last rule Jam→Cornflakes has more than the minimum required confidence, i.e., 75% and it qualifies. The rules
having more than the user-specified minimum confidence are known as confident.
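Steps 2 and 3 can be sketched in Python as follows. As in the text, only item pairs are considered; the thresholds are the 50% support and 75% confidence stated for this example, and the helper name `count` is illustrative.

```python
from itertools import combinations

transactions = [
    {"Bread", "Cornflakes"},
    {"Bread", "Cornflakes", "Jam"},
    {"Bread", "Milk"},
    {"Cornflakes", "Jam", "Milk"},
]
MIN_SUPPORT = 0.50
MIN_CONFIDENCE = 0.75


def count(itemset):
    """Number of transactions containing every item in `itemset`."""
    return sum(1 for basket in transactions if set(itemset) <= basket)


items = sorted(set().union(*transactions))
n = len(transactions)

# Step 2: keep the item pairs that meet the support threshold.
frequent_pairs = [
    pair for pair in combinations(items, 2) if count(pair) / n >= MIN_SUPPORT
]

# Step 3: each frequent pair (A, B) yields two candidate rules, A -> B and B -> A.
rules = []
for a, b in frequent_pairs:
    for lhs, rhs in ((a, b), (b, a)):
        conf = count((a, b)) / count((lhs,))
        if conf >= MIN_CONFIDENCE:
            rules.append((lhs, rhs, conf))

print(frequent_pairs)
print(rules)
```

Only Jam→Cornflakes survives the confidence filter, in agreement with the hand calculation.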
Apriori algorithm for finding Association rules
• Phase 1: Identification of frequent itemsets
• Phase 2: Generation of Association Mining rules
Apriori algorithm
Step 1: Identify Candidate – 1 Itemset C1
Step 2: Identify Frequent – 1 Itemset F1
Step 3: Identify Candidate – 2 Itemset C2
• C2 = L1 JOIN L1, where L1 denotes the frequent 1-itemsets (called F1 in Step 2)
• The result of joining L1 with itself is shown below
Item pairs           Frequency
Bread, Cornflakes    2
Bread, Milk          2
Bread, Jam           3
Cornflakes, Milk     1
Cornflakes, Jam      3
Milk, Jam            2
Step 4: Identify Frequent – 2 Itemset F2
Step 5: Identify Candidate – 3 itemset (C3)
Generation of C3 from L2
C3 is generated from L2 by carrying out a JOIN operation over L2 as shown below.
C3 = L2 JOIN L2
It involves the same steps as performed for C2, but it has one important prerequisite for the join,
i.e., two itemsets are joinable only if their first item is common. In the general case:
Ck = Lk-1 JOIN Lk-1
Two itemsets are joinable if their first k-2 items are the same. So, in the case of C3, the first item
of the two itemsets in L2 must be the same, while in the case of C2 there is no such requirement because k-2
is 0.
From the two frequent 2-itemsets {Bread, Jam} and {Cornflakes, Jam}, we do not obtain a candidate
3-itemset, since the two 2-itemsets do not have the same first item. This completes the
first phase of the Apriori algorithm.
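The join condition can be sketched as a small Python function. This covers only the join step described above, under the assumption that itemsets are stored as tuples in a fixed (here alphabetical) item order. The function name `apriori_join` is illustrative.

```python
def apriori_join(frequent, k):
    """Build candidate k-itemsets from frequent (k-1)-itemsets.

    Itemsets are tuples kept in a fixed item order; two of them are
    joined only when their first k-2 items are identical.
    """
    frequent = sorted(frequent)
    candidates = []
    for i, a in enumerate(frequent):
        for b in frequent[i + 1:]:
            if a[:k - 2] == b[:k - 2]:
                candidates.append(a + (b[-1],))
    return candidates


# C2: k - 2 = 0, so every pair of frequent items is joinable.
l1 = [("Bread",), ("Cornflakes",), ("Jam",), ("Milk",)]
print(apriori_join(l1, 2))

# C3: the two frequent 2-itemsets have different first items, so nothing joins.
l2 = [("Bread", "Jam"), ("Cornflakes", "Jam")]
print(apriori_join(l2, 3))
```

The second call prints an empty list, which is the situation described above: no candidate 3-itemset exists, so phase 1 ends.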
Phase 2: Generation of association rules
The two frequent 2-itemsets given above lead to the following possible rules.
Bread→Jam
Jam→Bread
Cornflakes→Jam
Jam→Cornflakes
The confidence of each rule is obtained by dividing the support for both
items in the rule by the support for the item on the left-hand side of the rule.
The confidences of the four rules are therefore 3/4 = 75%, 3/4 = 75%, 3/3 =
100%, and 3/4 = 75% respectively. Since all of them have at least 75%
confidence, they all qualify.
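The four confidences can be recomputed from raw counts. The pair counts below come from the C2 table. The transaction table for this example is not reproduced in this text, so the single-item counts are read off the denominators of the fractions quoted above. The 75% threshold is the one used in the text.

```python
from fractions import Fraction

# Pair counts are from the C2 table; single-item counts are inferred from
# the denominators of the confidences quoted in the text.
count = {
    ("Bread",): 4,
    ("Cornflakes",): 3,
    ("Jam",): 4,
    ("Bread", "Jam"): 3,
    ("Cornflakes", "Jam"): 3,
}
MIN_CONFIDENCE = Fraction(3, 4)

frequent_pairs = [("Bread", "Jam"), ("Cornflakes", "Jam")]

rules = {}
for a, b in frequent_pairs:
    for lhs, rhs in ((a, b), (b, a)):
        rules[(lhs, rhs)] = Fraction(count[(a, b)], count[(lhs,)])

for (lhs, rhs), conf in rules.items():
    status = "qualifies" if conf >= MIN_CONFIDENCE else "rejected"
    print(f"{lhs} -> {rhs}: {float(conf):.0%} ({status})")
```

Using counts rather than supports gives the same ratios, because the total number of transactions cancels out in the division.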