FP Growth Algorithm

FP-growth is an algorithm for mining frequent itemsets without candidate generation. It compresses the transaction database into a frequent-pattern tree (FP-tree) and then divides the FP-tree into conditional databases associated with frequent items, mining each database separately to find the frequent itemsets. FP-growth adopts a divide-and-conquer strategy to avoid the costly generation of candidate itemsets used in Apriori.

Uploaded by

Indrani Majumdar
Copyright
© Attribution Non-Commercial (BY-NC)

Mining Frequent Itemsets without
Candidate Generation

Apriori with candidate generation is costly for two reasons:

1. It may need to generate a huge number of candidate sets.
For example, if there are 10^4 frequent 1-itemsets, the Apriori algorithm will need to generate more than 10^7 candidate 2-itemsets.

2. It is costly to scan every transaction in the database to determine the support of the candidate itemsets.
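The size of the candidate set can be checked with a short calculation. This is only a sketch: it assumes that Apriori forms candidate 2-itemsets by pairing every two distinct frequent 1-itemsets, and the variable names are illustrative.

```python
from math import comb

# Number of frequent 1-itemsets from the example above.
n_frequent_1_itemsets = 10**4

# Assumption: each unordered pair of frequent items is one candidate 2-itemset.
candidate_2_itemsets = comb(n_frequent_1_itemsets, 2)

print(candidate_2_itemsets)           # 49995000
print(candidate_2_itemsets > 10**7)   # True
```

Under that assumption the count is n(n-1)/2, which for n = 10^4 is roughly 5 x 10^7 and therefore well above 10^7.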

December 7, 2021 Data Mining: Concepts and Techniques 1



“Can we design a method that mines the complete set of frequent itemsets without candidate generation?”

FP-growth (frequent-pattern growth) adopts a divide-and-conquer strategy as follows:
1. First, it compresses the database representing frequent items into a frequent-pattern tree, or FP-tree.
2. It then divides the compressed database into a set of conditional databases, each associated with one frequent item, and mines each such database separately.


FP-growth: Example
We re-examine the mining of transaction database D.

The first scan of the database is the same as in Apriori: it finds the frequent items and their support counts. Let min_sup = 2.

The set of frequent items is sorted in order of descending support count, which gives L = {{I2: 7}, {I1: 6}, {I3: 6}, {I4: 2}, {I5: 2}}.

When the items of a transaction are sorted this way, we say that the items are in L-order.
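The first scan and the L-order sort can be sketched in a few lines of Python. The table for D is not reproduced in this text, so the nine transactions below are an assumption: they are the textbook's nine-transaction example, which yields exactly the support counts listed in L. The function names `l_order` and `sort_transaction` are illustrative, and ties in support are broken by item name.

```python
from collections import Counter

# Assumed transaction database D (the table is missing from this text).
D = [
    ["I1", "I2", "I5"],
    ["I2", "I4"],
    ["I2", "I3"],
    ["I1", "I2", "I4"],
    ["I1", "I3"],
    ["I2", "I3"],
    ["I1", "I3"],
    ["I1", "I2", "I3", "I5"],
    ["I1", "I2", "I3"],
]
MIN_SUP = 2

def l_order(transactions, min_sup):
    """First scan: count each item, keep the frequent ones, sort by descending support."""
    counts = Counter(item for t in transactions for item in t)
    frequent = {item: c for item, c in counts.items() if c >= min_sup}
    # Ties are broken by item name (an assumption; any fixed tie-break works).
    return sorted(frequent.items(), key=lambda kv: (-kv[1], kv[0]))

def sort_transaction(transaction, order):
    """Drop infrequent items and arrange the rest in L-order."""
    rank = {item: r for r, (item, _) in enumerate(order)}
    return sorted((i for i in transaction if i in rank), key=rank.__getitem__)

L = l_order(D, MIN_SUP)
print(L)  # [('I2', 7), ('I1', 6), ('I3', 6), ('I4', 2), ('I5', 2)]
print(sort_transaction(["I1", "I2", "I5"], L))  # ['I2', 'I1', 'I5']
```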


FP-growth: Example (Constructing the FP-tree)
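The figure for this slide is not available in the text, so the following is a minimal sketch of how an FP-tree could be built, not the slide's own illustration. It assumes the transactions have already been sorted in L-order (the list below is the assumed nine-transaction example after sorting), and the class `FPNode`, the function `build_fp_tree`, and the `header` table are hypothetical names.

```python
class FPNode:
    """One node of the FP-tree: an item, its count, and links to parent and children."""
    def __init__(self, item, parent):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}

def build_fp_tree(sorted_transactions):
    """Second scan: insert each L-ordered transaction as a path from the root."""
    root = FPNode(None, None)
    header = {}  # item -> list of tree nodes holding that item
    for transaction in sorted_transactions:
        node = root
        for item in transaction:
            child = node.children.get(item)
            if child is None:
                # No shared prefix from here on: start a new branch.
                child = FPNode(item, node)
                node.children[item] = child
                header.setdefault(item, []).append(child)
            child.count += 1  # shared prefixes only increment counts
            node = child
    return root, header

# Assumed example transactions, already in L-order (I2, I1, I3, I4, I5).
SORTED_D = [
    ["I2", "I1", "I5"], ["I2", "I4"], ["I2", "I3"],
    ["I2", "I1", "I4"], ["I1", "I3"], ["I2", "I3"],
    ["I1", "I3"], ["I2", "I1", "I3", "I5"], ["I2", "I1", "I3"],
]

root, header = build_fp_tree(SORTED_D)
print(root.children["I2"].count)                 # 7
print(root.children["I2"].children["I1"].count)  # 4
print(len(header["I3"]))                         # 3
```

Because transactions that share a prefix share a path, the tree is usually much smaller than the database; the per-item node lists in `header` are what the mining step follows later.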


FP-growth: Example (Constructing conditional databases)

Next, the FP-tree is mined as follows:

1. Start with the last item in the table and construct its conditional pattern base: a “subdatabase” consisting of the set of prefix paths in the FP-tree that co-occur with the suffix pattern.
2. Construct the conditional FP-tree from this pattern base.
3. Frequent itemsets are found by concatenating the suffix pattern with the frequent patterns generated from the conditional FP-tree.
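The three steps above can be sketched as a small recursive routine. This is an illustrative sketch, not the slides' own code: the database D is the assumed nine-transaction example, the names `build_tree`, `conditional_pattern_base`, and `fp_growth` are hypothetical, tree nodes are plain dictionaries, and each conditional tree re-sorts its items by their conditional support (one common variant).

```python
from collections import defaultdict

def build_tree(weighted_paths, min_sup):
    """Build an FP-tree from (items, count) pairs; return its header table and item supports."""
    support = defaultdict(int)
    for items, count in weighted_paths:
        for item in items:
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_sup}
    root = {"item": None, "count": 0, "parent": None, "children": {}}
    header = defaultdict(list)  # item -> nodes holding that item
    for items, count in weighted_paths:
        node = root
        kept = sorted((i for i in items if i in frequent),
                      key=lambda i: (-frequent[i], i))
        for item in kept:
            child = node["children"].get(item)
            if child is None:
                child = {"item": item, "count": 0, "parent": node, "children": {}}
                node["children"][item] = child
                header[item].append(child)
            child["count"] += count
            node = child
    return header, frequent

def conditional_pattern_base(header, item):
    """Step 1: collect the prefix path of every node that holds `item`."""
    base = []
    for node in header[item]:
        path, parent = [], node["parent"]
        while parent["item"] is not None:
            path.append(parent["item"])
            parent = parent["parent"]
        if path:
            base.append((path[::-1], node["count"]))
    return base

def fp_growth(weighted_paths, min_sup, suffix=()):
    """Steps 2 and 3: build the (conditional) tree, then grow patterns from each item."""
    header, frequent = build_tree(weighted_paths, min_sup)
    patterns = {}
    for item, sup in frequent.items():
        new_suffix = (item,) + suffix
        patterns[tuple(sorted(new_suffix))] = sup
        base = conditional_pattern_base(header, item)
        patterns.update(fp_growth(base, min_sup, new_suffix))
    return patterns

# Assumed transaction database D (the table is missing from this text).
D = [["I1", "I2", "I5"], ["I2", "I4"], ["I2", "I3"], ["I1", "I2", "I4"],
     ["I1", "I3"], ["I2", "I3"], ["I1", "I3"], ["I1", "I2", "I3", "I5"],
     ["I1", "I2", "I3"]]

header, _ = build_tree([(t, 1) for t in D], 2)
print(conditional_pattern_base(header, "I5"))
# [(['I2', 'I1'], 1), (['I2', 'I1', 'I3'], 1)]

result = fp_growth([(t, 1) for t in D], 2)
print(result[("I1", "I2", "I5")])  # 2
```

For the suffix I5, the conditional pattern base contains the two prefix paths (I2, I1) and (I2, I1, I3), each with count 1. I3 falls below min_sup in that base, so the conditional FP-tree keeps only I2 and I1, and the patterns grown from it are {I2, I5}, {I1, I5}, and {I2, I1, I5}, each with support 2.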
