Sequential Patterns The GSP Algorithm

The document discusses sequential patterns and the Generalized Sequential Pattern (GSP) algorithm. Sequential patterns are statistically relevant patterns found in sequentially ordered data, where the order matters. GSP is similar to the Apriori algorithm for finding frequent itemsets but accounts for sequence. It works by finding individual frequent items, then using those to generate candidate sequences of increasing length to find longer frequent sequences. Candidate generation joins the frequent (k-1)-sequences with themselves and prunes any candidate that contains an infrequent subsequence.

Uploaded by

Chu Văn Nam
Copyright
© Attribution Non-Commercial (BY-NC)

SEQUENTIAL PATTERNS & THE GSP ALGORITHM

BY: JOE CASABONA

INTRO
What are Sequential Patterns?
Why don't ARs suffice?
The Generalized Sequential Pattern Algorithm
o Finding Frequent Sets
o Candidate Generation
o Rule Generation

WHAT ARE SEQUENTIAL PATTERNS?
"Finding statistically relevant patterns between data examples where the values are delivered in a sequence." [3]
Very similar to Association Rules (ARs), but in this case the sequence matters.
Association Rules ignore order, and there are times when order is important.

SEQUENTIAL PATTERN EXAMPLES


In Transaction Processing: After buying an Xbox, do customers usually buy a new controller or a game first?
In Text Mining: The order of words is important for finding linguistic or language patterns [1]

OBJECTIVE
Given a set S of input data sequences, find all sequences that have at least a user-specified minimum support.
Each such sequence is called a 'frequent sequence' or sequential pattern. [1]
We will use the Generalized Sequential Pattern Algorithm (GSP)
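The support test in the objective can be sketched in Python. This is a minimal illustration, not code from the slides or from [1]: the names `contains` and `support`, the representation of a sequence as a list of sets, and the small purchase database are all assumptions made for the example.

```python
def contains(data_seq, pattern):
    """Return True if every element of `pattern` is a subset of a distinct
    element of `data_seq`, with the elements matched in order."""
    pos = 0
    for element in pattern:
        # Skip forward to the first remaining element that holds all items.
        while pos < len(data_seq) and not element <= data_seq[pos]:
            pos += 1
        if pos == len(data_seq):
            return False
        pos += 1  # the next pattern element must match strictly later
    return True


def support(pattern, database):
    """Fraction of data sequences in `database` that contain `pattern`."""
    hits = sum(1 for data_seq in database if contains(data_seq, pattern))
    return hits / len(database)


# Hypothetical purchase histories: each sequence is a list of transactions.
database = [
    [{"xbox"}, {"game"}, {"controller"}],
    [{"xbox"}, {"controller", "game"}],
    [{"xbox"}, {"game"}],
    [{"game"}, {"xbox"}],
]

pattern = [{"xbox"}, {"game"}]
print(support(pattern, database))  # 0.75: the last customer bought the game first
```

With a minimum support of 0.5, for instance, this pattern would count as a frequent sequence, while the reversed pattern (game, then Xbox) would not.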

GSP
Similar to the Apriori Algorithm:
o Find individual items with minSupport (1-sequences)
o Use them to find 2-sequences
o Continue using k-sequences to find (k+1)-sequences
o Stop when there are no more frequent sequences
The difference is in Candidate Generation
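The level-wise loop can be sketched as follows. To keep the sketch short it assumes a simplified setting that is not full GSP: every element of a sequence is a single item, so a sequence is just a tuple of items. The function names, the toy database, and the 0.5 threshold are hypothetical. The prune step is skipped here because the support count filters infrequent candidates anyway.

```python
def is_subsequence(pattern, data_seq):
    """True if the items of `pattern` occur in `data_seq` in the same order."""
    remaining = iter(data_seq)
    return all(item in remaining for item in pattern)


def gsp_single_items(database, min_support):
    """Level-wise search, simplified to sequences of single-item elements.
    Returns a list whose entry k-1 is the set of frequent k-sequences."""

    def frequent(candidates):
        return {
            c for c in candidates
            if sum(is_subsequence(c, s) for s in database) / len(database)
            >= min_support
        }

    # Level 1: individual items with minimum support (1-sequences).
    level = frequent({(item,) for seq in database for item in seq})
    levels = []
    while level:  # stop when there are no more frequent sequences
        levels.append(level)
        # Join: a and b combine when a without its first item equals
        # b without its last item; the candidate is a plus b's last item.
        candidates = {
            a + b[-1:] for a in level for b in level if a[1:] == b[:-1]
        }
        level = frequent(candidates)
    return levels


# Hypothetical toy database of item sequences.
database = [
    ("a", "b", "c"),
    ("a", "c"),
    ("a", "b", "c", "d"),
    ("b", "c"),
]

for k, level in enumerate(gsp_single_items(database, 0.5), start=1):
    print(k, sorted(level))
```

On this toy input the loop stops after three levels, because no 4-sequence candidate can be formed from the single frequent 3-sequence.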

GSP: CANDIDATE GENERATION


Input: Frequent set F[k-1] (the frequent (k-1)-sequences)
Output: Candidate set C[k]
How it works:
o Join F[k-1] with F[k-1]
o Prune: get rid of candidates that have an infrequent (k-1)-subsequence
Note: The order of items matters

CANDIDATE EXAMPLE
F[3] = <{1, 2} {4}>, <{1, 2} {5}>, <{1} {4, 5}>, <{1, 4} {6}>, <{2} {4, 5}>, <{2} {4} {6}>
After Join: <{1, 2} {4, 5}>, <{1, 2} {4} {6}>
After Prune: <{1, 2} {4, 5}>
C[4] = <{1, 2} {4, 5}>
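The join and prune steps can be checked against this example with a short Python sketch. It is an illustration under stated assumptions, not the presenter's code: a sequence is a tuple of elements, each element a sorted tuple of items, and all function names are made up for the example. The join rule used is the usual GSP one: s1 joins s2 when s1 without its first item equals s2 without its last item. The special case for building 2-sequences from single items is left out.

```python
def drop_first(seq):
    """Remove the first item of the first element."""
    head = seq[0][1:]
    return ((head,) if head else ()) + seq[1:]


def drop_last(seq):
    """Remove the last item of the last element."""
    tail = seq[-1][:-1]
    return seq[:-1] + ((tail,) if tail else ())


def join(s1, s2):
    """Return the candidate built from s1 and s2, or None if they do not join."""
    if drop_first(s1) != drop_last(s2):
        return None
    last_item = s2[-1][-1]
    if len(s2[-1]) == 1:
        # The last item was its own element in s2: append it as a new element.
        return s1 + ((last_item,),)
    # Otherwise it joins the last element of s1.
    return s1[:-1] + (tuple(sorted(s1[-1] + (last_item,))),)


def drop_one_item(seq):
    """Yield every subsequence obtained by deleting exactly one item."""
    for i, element in enumerate(seq):
        for j in range(len(element)):
            rest = element[:j] + element[j + 1:]
            yield seq[:i] + ((rest,) if rest else ()) + seq[i + 1:]


def generate_candidates(frequent):
    """Return (joined, pruned): candidates after the join, then after pruning."""
    joined = set()
    for s1 in frequent:
        for s2 in frequent:
            candidate = join(s1, s2)
            if candidate is not None:
                joined.add(candidate)
    # Keep a candidate only if all of its (k-1)-subsequences are frequent.
    pruned = {
        c for c in joined if all(sub in frequent for sub in drop_one_item(c))
    }
    return joined, pruned


# F[3] from the slide.
F3 = {
    ((1, 2), (4,)), ((1, 2), (5,)), ((1,), (4, 5)),
    ((1, 4), (6,)), ((2,), (4, 5)), ((2,), (4,), (6,)),
}

joined, C4 = generate_candidates(F3)
print(sorted(joined))  # the two joined candidates from the slide
print(sorted(C4))      # only <{1, 2} {4, 5}> survives pruning
```

In this run <{1, 2} {4} {6}> is pruned because deleting item 2 leaves <{1} {4} {6}>, which is not in F[3].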

RULE GENERATION
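The objective is to find frequent sequences, not to generate rules, but rules can be generated.
Sequential Rule: X -> Y, where X is a proper subsequence of Y; apply confidence to the frequent sequences
Label Sequential Rule: X -> Y, where X is produced from Y by replacing some of its items with *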
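Applying confidence to a sequential rule X -> Y can be sketched as below. The sketch assumes the usual definition, in which confidence is the share of data sequences containing X that also contain Y, and it assumes X is a subsequence of Y. The helper names and the purchase data are hypothetical, and label sequential rules (with the * wildcard) are not covered.

```python
def contains(data_seq, pattern):
    """True if each element of `pattern` is a subset of a distinct element
    of `data_seq`, with the elements matched in order."""
    pos = 0
    for element in pattern:
        while pos < len(data_seq) and not element <= data_seq[pos]:
            pos += 1
        if pos == len(data_seq):
            return False
        pos += 1
    return True


def confidence(x, y, database):
    """Confidence of the sequential rule x -> y.
    Assumes x is a subsequence of y, so every sequence containing y
    also contains x and the ratio of the two counts is the confidence."""
    count_x = sum(1 for s in database if contains(s, x))
    count_y = sum(1 for s in database if contains(s, y))
    return count_y / count_x if count_x else 0.0


# Hypothetical purchase histories.
database = [
    [{"xbox"}, {"game"}, {"controller"}],
    [{"xbox"}, {"controller", "game"}],
    [{"xbox"}, {"game"}],
    [{"game"}, {"xbox"}],
    [{"controller"}],
]

x = [{"xbox"}]
y = [{"xbox"}, {"game"}]
print(confidence(x, y, database))  # 0.75: 3 of the 4 Xbox buyers bought a game later
```

A rule would then be reported only if its confidence meets a user-chosen threshold, in the same way that a sequence is reported only if it meets the minimum support.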

REFERENCES
[1] Liu, Bing. Web Data Mining, Chapter 2: Association Rules and Sequential Patterns. Springer, December 2006. (The book I am using.)
[2] Wikipedia. "GSP Algorithm." http://en.wikipedia.org/wiki/GSP_Algorithm June 3, 2008
[3] Wikipedia. "Sequence Mining." http://en.wikipedia.org/wiki/Sequence_mining Oct. 30, 2008
