Abstract: Association Rule Mining is a data mining technique that aims to extract interesting patterns, frequent correlations or associations among sets of items in transaction databases. Particle Swarm Optimization (PSO) is one of several methods for mining association rules. It is a technique used to explore the search space of a given problem to find the settings or parameters required to maximize a particular objective. However, avoiding local optima while improving the convergence speed is still a tedious task. In this work, an Evolutionary Quantum-behaved Particle Swarm Optimization (QPSO) algorithm is proposed by combining the classical PSO philosophy with quantum mechanics to improve the performance of PSO. It is a global-convergence guaranteed algorithm and has better search ability than the traditional PSO. The performance of QPSO is compared with basic PSO and three other variants of PSO, and the experimental results show that the proposed algorithm outperforms PSO quite significantly and is also found to be as effective as the PSO variants with marginally better computational efficiency.
Keywords: Particle Swarm Optimization, Quantum behavior, Association Rule Mining
1. Introduction
Data Mining, the analysis step of the Knowledge Discovery in Databases process, is the practice of examining large pre-existing databases in order to generate new information. However, with the continuous increase in the amount of data stored in databases and in the types of databases, the extraction of critical hidden information from these databases has become tedious. Several methods such as classification, clustering and association rules have been used to deduce inferences from large databases. Among these methods, association rule mining is the most widely used.
Association rules are usually required to satisfy a user-specified minimum support and a user-specified minimum confidence at the same time. Association rule generation is usually split into two separate steps: first, minimum support is applied to find all frequent itemsets in a database; second, these frequent itemsets and the minimum confidence constraint are used to form rules.
Apriori, a very influential association rule mining algorithm, was developed for rule mining in large transaction databases, and many later algorithms are derivatives or extensions of it. A major step forward in improving the performance of these algorithms was the introduction of a compact data structure, referred to as the frequent pattern tree or FP-tree, and the associated mining algorithm, FP-growth. Later, an alternative to Apriori itemset generation called Dynamic Itemset Counting was introduced. In this algorithm, itemsets are dynamically added and deleted as transactions are read. It relies on the fact that for an itemset to be frequent, all of its subsets must also be frequent, so only those itemsets whose subsets are all frequent need be examined.
Genetic Algorithms and Particle Swarm Optimization are both population-based search methods: they move from one set of points (population) to another in a single iteration, with likely improvement, using a set of control operators. However, the fundamental problem with both algorithms is the imbalance between their local and global search; while trying to reduce the convergence time they get trapped in locally optimal regions when solving multimodal problems. These weaknesses have restricted their wider application. They are addressed in Quantum-behaved Particle Swarm Optimization (QPSO), which performs a diversified search over the entire search space for better convergence.
The paper is organized as follows. Section 2 discusses the various input and output parameters. Section 3 gives a brief introduction to basic PSO, and sections 4, 5 and 6 describe the proposed system in detail. Section 7 presents the results and discussion, and section 8 gives the conclusion.
2. Methodology
2.1. Association Rule Mining
Association rule mining is a data mining technique that is very widely used to deduce inferences from large databases. Typically the relationship is expressed as a rule of the form IF {X} THEN {Y}, where the X part is called the antecedent and the Y part the consequent. The rule should be such that X ∩ Y = ∅.
The two parameters that indicate the importance of an association rule are support and confidence. The support indicates how often the rule holds in a set of data. It is given by
support(X) = (number of transactions containing X) / (total number of transactions)
The confidence for a given rule is a measure of how often the consequent is true, given that the antecedent is true. If the consequent is false while the antecedent is true, then the rule is also false. If the antecedent is not matched by a given data item, then this item does not contribute to the determination of the confidence of the rule. It is given by
confidence(X → Y) = support(X ∪ Y) / support(X)
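As a concrete illustration, the sketch below computes support and confidence over a list of transactions in Java (the language used for the experiments in section 7). The class and method names are illustrative, not part of the proposed system.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative helpers for support(X) and confidence(X -> Y) over a transaction
// database, with each transaction represented as a set of item identifiers.
public class RuleMetrics {

    // support(X): fraction of transactions containing every item of X.
    static double support(List<Set<String>> transactions, Set<String> items) {
        long matches = transactions.stream()
                .filter(t -> t.containsAll(items))
                .count();
        return (double) matches / transactions.size();
    }

    // confidence(X -> Y) = support(X U Y) / support(X).
    static double confidence(List<Set<String>> transactions,
                             Set<String> x, Set<String> y) {
        Set<String> xy = new HashSet<>(x);
        xy.addAll(y);
        return support(transactions, xy) / support(transactions, x);
    }
}
```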
2.2. Output Parameters
Much work on association rule mining has been done focusing on efficiency, effectiveness and redundancy. Focus is also needed on the quality of rules mined. In this paper association rule mining using QPSO is treated as a multi-objective problem where rules are evaluated quantitatively and qualitatively based on the following measures.
2.2.1. Predictive Accuracy
Predictive Accuracy (PA) measures the effectiveness of the rules mined. The mined rules must have high predictive accuracy. The formula is given by:
Predictive accuracy = |X & Y| / |X|
where |X & Y| is the number of records that satisfy both the antecedent X and the consequent Y, and |X| is the number of records satisfying the antecedent X.
2.2.2. Number of rules generated
The count of the rules generated above a certain PA.
2.2.3. Laplace
It is a confidence estimator that takes support into account, becoming more pessimistic as the support of X decreases. It is useful to detect spurious rules that may occur by chance. It is defined as
lapl(X → Y) = (support(X ∪ Y) + 1) / (support(X) + 2)
2.2.4. Fitness
Fitness value is utilized to evaluate the importance of each particle. It is defined as a function of the support and confidence of the rule encoded by the particle, and the objective of the fitness function is maximization: the larger the support and confidence of a particle, the greater the strength of the association, meaning that it is an important association rule.
2.2.5. Conviction
Conviction measures how strongly a rule's antecedent implies its consequent, addressing a weakness of confidence. It is defined as
conviction(X → Y) = (1 − support(Y)) / (1 − confidence(X → Y))
Conviction is infinite for logical implications (confidence 1) and equals 1 if X and Y are independent.
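For reference, a minimal sketch of these rule-quality measures is given below; it assumes predictive accuracy and Laplace are computed on record counts and conviction on the support and confidence values defined in section 2.1. The class and method names are illustrative.

```java
// Illustrative rule-quality measures used as output parameters; counts are
// assumed to be taken over the same transaction database as above.
final class RuleQuality {

    // Predictive accuracy: |X & Y| / |X|, on record counts.
    static double predictiveAccuracy(int countXandY, int countX) {
        return (double) countXandY / countX;
    }

    // Laplace estimator: (count(X & Y) + 1) / (count(X) + 2).
    static double laplace(int countXandY, int countX) {
        return (countXandY + 1.0) / (countX + 2.0);
    }

    // Conviction: (1 - support(Y)) / (1 - confidence(X -> Y)).
    // Returns +Infinity when confidence is 1 (a logical implication).
    static double conviction(double supportY, double confidence) {
        if (confidence >= 1.0) {
            return Double.POSITIVE_INFINITY;
        }
        return (1.0 - supportY) / (1.0 - confidence);
    }
}
```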
2.2.6. Convergence Rate
The generation in which the global optimum is obtained indicates the convergence rate of the population.
3. PSO Algorithm
PSO, introduced by Kennedy and Eberhart, simulates the behavior of bird flocking. In PSO, candidate solutions to the problem are represented as particles in the search space. PSO is initialized with a group of random particles (solutions) and then searches for the optimum by updating the particles in successive generations. In each iteration, every particle is updated by following two best values. The first is the best solution (fitness) it has achieved so far; this value is called pbest. The other "best" value tracked by the particle swarm optimizer is the best value obtained so far by any particle in the population; this is a global best, called gbest.
Particle Swarm has two primary operations: velocity update and position update. During each generation each particle is accelerated toward its previous best position and the global best position. At each iteration, a new velocity value for each particle is calculated based on its current velocity, the distance from its previous best position, and the distance from the global best position. The new velocity value is then used to calculate the next position of the particle in the search space. This process is iterated a set number of times or until a minimum error is achieved.
After finding the two best values, the particle updates its velocity and position using the following formulae:

v[] = v[] + c1 * rand() * (pbest[] − present[]) + c2 * rand() * (gbest[] − present[])
present[] = present[] + v[]

where v[] is the particle velocity and present[] is the current particle position, pbest[] and gbest[] are the local best and global best positions of the particle, rand() is a random number in (0,1), and c1 and c2 are acceleration factors, usually c1 = c2 = 2.05. [4]
The outline of the basic Particle Swarm Optimizer is as follows:
Step 1: Initialize the population
Step 2: Evaluate the fitness of each individual particle (update pbest)
Step 3: Keep track of the individual with the highest fitness (gbest)
Step 4: Modify velocities based on pbest and gbest
Step 5: Update the particle positions
Step 6: Terminate if the stopping condition is met
Step 7: Go to step 2
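The velocity and position update at the heart of this loop can be sketched in Java as follows; the array names mirror the notation used above, and the class name is illustrative.

```java
import java.util.Random;

// Minimal sketch of the standard PSO velocity and position update for one
// particle; v, present, pbest and gbest mirror the notation used above.
public class PsoUpdate {
    static final double C1 = 2.05;   // cognitive acceleration factor
    static final double C2 = 2.05;   // social acceleration factor
    static final Random RAND = new Random();

    static void update(double[] v, double[] present,
                       double[] pbest, double[] gbest) {
        for (int d = 0; d < present.length; d++) {
            v[d] = v[d]
                 + C1 * RAND.nextDouble() * (pbest[d] - present[d])
                 + C2 * RAND.nextDouble() * (gbest[d] - present[d]);
            present[d] = present[d] + v[d];
        }
    }
}
```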
4. Quantum PSO
The Quantum-inspired Particle Swarm Optimization (QPSO) is one of the recent optimization methods based on quantum mechanics. Like any other evolutionary algorithm, a quantum-inspired particle swarm algorithm relies on the representation of the individual, the evaluation function and the population dynamics. The particularity of the quantum particle swarm algorithm stems from the quantum representation it adopts, which allows representing the superposition of all potential solutions for a given problem; the position of a particle depends on the probability amplitudes a and b of the wave function ψ. It also stems from the quantum operators used to evolve the entire population through generations. Together these constitute a powerful strategy to diversify the population and enhance QPSO's ability to avoid premature convergence to local optima.
In terms of classical mechanics, a particle is depicted by its position vector xi and velocity vector vi, which determine its trajectory. The particle moves along a determined trajectory following Newtonian mechanics, but this is not the case in quantum mechanics. In the quantum world, the term trajectory is meaningless, because xi and vi of a particle cannot be determined simultaneously according to the uncertainty principle. Therefore, if the individual particles in a PSO system have quantum behavior, the PSO algorithm is bound to work in a different fashion.
In the quantum model of PSO, the state of a particle is depicted by the wave function ψ(x, t) instead of position and velocity. The dynamic behavior of the particle is widely divergent from that of a particle in traditional PSO systems. The particles move according to the following iterative equations:
x(t+1) = p + β * |mbest − x(t)| * ln(1/u), if k ≥ 0.5
x(t+1) = p − β * |mbest − x(t)| * ln(1/u), if k < 0.5
where:
local attractor p = c * pid + (1 − c) * pgd
c = (c1 * r1) / (c1 * r1 + c2 * r2)
u, k, r1, r2 are uniformly distributed random numbers in the interval (0,1)
β is the contraction-expansion coefficient in the interval (0,1)
c1 and c2 are the acceleration factors (usually c1 = 1.82 and c2 = 1.97) [5]
pid is the pbest of the i-th particle
pgd is the global best particle
The mean best position or Mainstream Thought point (mbest) of the population is defined as the mean of the best positions of all particles.
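A minimal sketch of the position update defined by the iterative equations above is given below; beta stands for the contraction-expansion coefficient and the class and method names are illustrative.

```java
import java.util.Random;

// Hedged sketch of the QPSO position update for one dimension of one particle,
// following the iterative equations above; mbest is the (weighted) mean best
// position for that dimension.
public class QpsoUpdate {
    static final Random RAND = new Random();

    static double updatePosition(double x, double pid, double pgd,
                                 double mbest, double beta,
                                 double c1, double c2) {
        double r1 = RAND.nextDouble();
        double r2 = RAND.nextDouble();
        double c = (c1 * r1) / (c1 * r1 + c2 * r2);   // mixing coefficient
        double p = c * pid + (1 - c) * pgd;           // local attractor
        double u = 1.0 - RAND.nextDouble();           // uniform in (0,1], avoids log(1/0)
        double k = RAND.nextDouble();
        double step = beta * Math.abs(mbest - x) * Math.log(1.0 / u);
        return (k >= 0.5) ? p + step : p - step;
    }
}
```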
5. Adaptive QPSO
QPSO is mainly governed by four key parameters that are important for the convergence and efficiency of the algorithm: the contraction-expansion coefficient (β), the mean best position (mbest) and two positive acceleration factors (c1 and c2). The contraction-expansion coefficient is a convergence factor used to balance exploration and exploitation by using the previous flying experience of the particles. The mean best position is replaced with a weighted mean best position to account for whether a particle is an elitist or not. The acceleration parameters are typically two positive constants, called the cognitive parameter c1 and the social parameter c2.
Adaptive Quantum Particle Swarm Optimization updates the algorithmic parameters dynamically by identifying the evolutionary state that each particle belongs to during each generation. Earlier QPSO alternatives for parameter adaptation do not consider the particle state, whereas AQPSO takes into account the state of the particle obtained from diversity information. The Evolutionary State Estimation (ESE) approach is adopted to identify the evolutionary states that the particle undergoes in each generation. The parameters are adjusted according to the estimated state in order to provide a better balance between global exploration and local exploitation. Additionally, an Elitist Learning Strategy (ELS) is developed for the best particle to jump out of possible local optima.
5.1. Evolutionary State Estimation (ESE)
The Evolutionary State Estimation approach uses the population distribution information of the particles to identify the state of a particle. Initially, the particles are dispersed throughout the search space. During the evolutionary process the particles tend to crowd towards a globally optimum region, hence the population distribution information is needed to identify the state a particle undergoes. The distribution information can be calculated by finding the mean distance between every pair of particles in the population. The particles are closer to the globally best particle during the convergence state than in the other states. The ESE approach is detailed in the following steps:
Step 1: At the current position, calculate the mean distance of each particle i to all the other particles, measured with a Euclidean metric:
d_i = (1 / (N − 1)) * Σ_{j=1, j≠i}^{N} sqrt( Σ_{k=1}^{D} (x_i^k − x_j^k)^2 )
Where, N and D are population size and number of dimensions respectively.
Step 2: Denote the mean distance of the globally best particle as dg. Compare all the distances and determine the maximum (dmax) and minimum (dmin) distances. Determine the evolutionary factor f as
f = (dg − dmin) / (dmax − dmin), which lies in [0, 1].
A code sketch of this computation and of the parameter-control strategies appears after Strategy 4 below.
Step 3: Classify f into one of the four sets S1, S2, S3, and S4, which represent the states of exploration, exploitation, convergence, and jumping out respectively.
Fig 5.1.1 Fuzzy membership functions for the four evolutionary states
Step 4: Adaptation of the acceleration factors. The acceleration factors c1 and c2 should be controlled dynamically depending on the identified evolutionary state of the particle. Adaptive control of the acceleration coefficients can be designed on the following basis. Parameter c1 represents the self-cognition that pulls the particle to its own historical best position, helping explore local niches and maintain the diversity of the swarm. Parameter c2 represents the social influence that pushes the swarm to converge to the current globally best region, helping with fast convergence. Initialize c1 = 1.82 and c2 = 1.97, which satisfies the convergence condition of the particles, (c1 + c2)/2 ≥ 1. Since c2 > c1, the particles converge faster towards the global best position than towards their own local best positions. [5]
These are two different learning mechanisms and should be treated differently in different evolutionary states. In this paper, the acceleration factors are initialized to 1.82 and 1.97 respectively and are adaptively controlled according to the evolutionary state, using the strategies listed in the table below. The values of c1 and c2 are varied in the range of 0.15 [5].
State          Strategy     c1                  c2
Exploration    Strategy 1   Increase            Decrease
Exploitation   Strategy 2   Increase slightly   Decrease slightly
Convergence    Strategy 3   Increase slightly   Increase slightly
Jumping out    Strategy 4   Decrease            Increase

Table 5.1.1 Strategies for the control of c1 and c2
Strategy 1: In the exploration state c1 is increased so that the particles search as many possible solutions as they can instead of crowding in the same region. Decreasing c2 helps each particle refine its search around its own historical best position.
Strategy 2: In the exploitation state the particles make use of local information and group around the current pbest of each particle. Increasing c1 slightly therefore promotes exploitation around pbest. Since the global optimum may not yet have been found at this stage, decreasing c2 slightly and keeping it at a low value helps avoid the deception of possible local optima.
Strategy 3: In the convergence state the swarm is about to find the globally optimum region. Hence, increasing c2 slightly guides the other particles to this region, and c1 is likewise increased only slightly so that superfluous local searches do not prevent the particles from converging fast.
Strategy 4: In the jumping-out state the globally best particle shifts towards a new optimal region as soon as it finds that the particles are crowded in a locally optimal region in the convergence state. The whole swarm should follow it to the new region as soon as possible, so maintaining a large value of c2 with a smaller c1 helps achieve this.
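The following sketch illustrates the evolutionary factor computation of steps 1 and 2 and the c1/c2 control of Table 5.1.1. The crisp state thresholds and the 0.1 adjustment step are illustrative assumptions standing in for the fuzzy classification and the adjustment range used in the paper.

```java
// Illustrative sketch of the evolutionary factor f and the c1/c2 control of
// Table 5.1.1. The thresholds on f and the 0.1 adjustment step are assumptions.
public class EvolutionaryStateEstimation {

    // Step 1: mean Euclidean distance of particle i to all other particles.
    static double meanDistance(double[][] x, int i) {
        int n = x.length, dims = x[0].length;
        double sum = 0.0;
        for (int j = 0; j < n; j++) {
            if (j == i) continue;
            double sq = 0.0;
            for (int k = 0; k < dims; k++) {
                double diff = x[i][k] - x[j][k];
                sq += diff * diff;
            }
            sum += Math.sqrt(sq);
        }
        return sum / (n - 1);
    }

    // Step 2: f = (dg - dmin) / (dmax - dmin), in [0, 1].
    static double evolutionaryFactor(double[][] x, int gBestIndex) {
        double dg = meanDistance(x, gBestIndex);
        double dmin = Double.MAX_VALUE, dmax = -Double.MAX_VALUE;
        for (int i = 0; i < x.length; i++) {
            double d = meanDistance(x, i);
            dmin = Math.min(dmin, d);
            dmax = Math.max(dmax, d);
        }
        return (dmax == dmin) ? 0.0 : (dg - dmin) / (dmax - dmin);
    }

    // Steps 3-4: crisp classification of f and adjustment of c = {c1, c2}
    // following the strategies of Table 5.1.1.
    static void adjustAcceleration(double f, double[] c) {
        double step = 0.1;                          // assumed adjustment magnitude
        if (f > 0.75) {                             // jumping out: decrease c1, increase c2
            c[0] -= step;       c[1] += step;
        } else if (f > 0.5) {                       // exploration: increase c1, decrease c2
            c[0] += step;       c[1] -= step;
        } else if (f > 0.25) {                      // exploitation: slight increase / decrease
            c[0] += 0.5 * step; c[1] -= 0.5 * step;
        } else {                                    // convergence: slight increase of both
            c[0] += 0.5 * step; c[1] += 0.5 * step;
        }
    }
}
```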
5.2. Adaptation of Contraction-Expansion Coefficient (β)
The contraction-expansion coefficient β is used to keep the balance between exploration and exploitation by using the previous flying experience of the particles. The particles change their paths according to their best positions and also by using information from their neighbors. In addition, the contraction-expansion coefficient is an important convergence factor; results of stochastic simulations show that QPSO performs relatively better when the value of β is varied from 1.0 at the beginning of the search to 0.5 at the end, balancing exploration and exploitation. [6]
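A minimal sketch of this linear schedule, assuming β is recomputed once per iteration; the class and method names are illustrative.

```java
// Beta decreases linearly from 1.0 at the first iteration to 0.5 at the last.
public final class ContractionExpansion {
    static double beta(int iteration, int maxIterations) {
        return 1.0 - 0.5 * ((double) iteration / maxIterations);
    }
}
```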
5.3. Elitist Learning Strategy (ELS)
The most important problem is to determine whether a particle is an elitist or not, or more precisely, how to evaluate its importance when calculating the value of the mean best position (m). It is natural, as in other evolutionary algorithms, to associate elitism with the particle's fitness value: the greater the fitness, the more important the particle. Formally, the particles are first ranked in descending order according to their fitness values. Each particle is then assigned a weight coefficient αi that decreases linearly with the particle's rank, so the nearer a particle is to the best solution, the larger its weight coefficient.
The weighted mean best position (or Mainstream Thought point) m is calculated as
m = (1/M) * Σ_{i=1}^{M} α_i * p_i
where α_i is the weight coefficient of the i-th ranked particle (decreasing linearly from 1.5 to 0.5) [8], p_i is its personal best position, and M is the population size.
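A hedged sketch of this weighted mean, assuming the personal best positions have already been ranked in descending order of fitness; the class and method names are illustrative.

```java
// Weighted mean best position m; weights decrease linearly from 1.5 (best)
// to 0.5 (worst), following [8].
public final class WeightedMeanBest {
    static double[] compute(double[][] pbestRankedByFitness) {
        int m = pbestRankedByFitness.length;          // population size M
        int dims = pbestRankedByFitness[0].length;
        double[] mbest = new double[dims];
        for (int i = 0; i < m; i++) {
            // Linear weight: 1.5 for rank 0, 0.5 for rank M-1.
            double alpha = (m == 1) ? 1.0 : 1.5 - (double) i / (m - 1);
            for (int d = 0; d < dims; d++) {
                mbest[d] += alpha * pbestRankedByFitness[i][d];
            }
        }
        for (int d = 0; d < dims; d++) {
            mbest[d] /= m;
        }
        return mbest;
    }
}
```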
The steps involved in the Adaptive Quantum Particle Swarm Optimizer are as follows:
Step 1: Initialize the population
Step 2: Set c1 to 1.82 and c2 to 1.97
Step 3: Evaluate the fitness of each individual particle (update pbest)
Step 4: Keep track of the individual with the highest fitness (gbest)
Step 5: Update the weighted mean best position
Step 6: Compute the evolutionary factor using the distribution information of the particles in the search space
Step 7: Estimate the evolutionary state using fuzzy classification and adaptively control the algorithmic parameters as described in ESE
Step 8: Perform ELS once the particles enter the convergence state
Step 9: Update the particle positions
Step 10: Terminate if the stopping condition is met
Step 11: Go to step 3
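The skeleton below ties these steps together; it reuses the helper sketches given earlier (WeightedMeanBest, EvolutionaryStateEstimation, ContractionExpansion, QpsoUpdate) and takes the problem-specific, support/confidence-based fitness function as a parameter. The Elitist Learning Strategy of step 8 is omitted for brevity.

```java
import java.util.Arrays;
import java.util.function.ToDoubleFunction;

// Skeleton of the adaptive QPSO loop following the steps above.
public class AdaptiveQpso {

    static void run(double[][] x, int maxIterations, ToDoubleFunction<double[]> fitness) {
        int n = x.length, dims = x[0].length;
        double[] c = {1.82, 1.97};                            // Step 2: c1 and c2
        double[][] pbest = new double[n][];
        for (int i = 0; i < n; i++) pbest[i] = x[i].clone();
        int gBest = 0;

        for (int t = 0; t < maxIterations; t++) {
            for (int i = 0; i < n; i++) {                     // Steps 3-4: update pbest, gbest
                if (fitness.applyAsDouble(x[i]) > fitness.applyAsDouble(pbest[i])) {
                    pbest[i] = x[i].clone();
                }
                if (fitness.applyAsDouble(pbest[i]) > fitness.applyAsDouble(pbest[gBest])) {
                    gBest = i;
                }
            }

            // Step 5: weighted mean best position over pbest ranked by fitness.
            double[][] ranked = pbest.clone();
            Arrays.sort(ranked, (a, b) -> Double.compare(
                    fitness.applyAsDouble(b), fitness.applyAsDouble(a)));
            double[] mbest = WeightedMeanBest.compute(ranked);

            // Steps 6-7: evolutionary state estimation and parameter control.
            double f = EvolutionaryStateEstimation.evolutionaryFactor(x, gBest);
            EvolutionaryStateEstimation.adjustAcceleration(f, c);
            double beta = ContractionExpansion.beta(t, maxIterations);

            // Step 9: quantum position update (Step 8, ELS, omitted in this sketch).
            for (int i = 0; i < n; i++) {
                for (int d = 0; d < dims; d++) {
                    x[i][d] = QpsoUpdate.updatePosition(
                            x[i][d], pbest[i][d], pbest[gBest][d], mbest[d], beta, c[0], c[1]);
                }
            }
        }
    }
}
```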
6. Memetic QPSO
A memetic algorithm incorporates local search techniques into an existing algorithm, performing a more refined search around potential solutions of the problem at hand. Here we introduce the Shuffled Frog Leaping (SFL) algorithm to improve the local search ability of the Quantum Particle Swarm Optimization algorithm.
6.1 Shuffled Frog Leaping Algorithm
In the SFLA, the population consists of a set of frogs (solutions) that is partitioned into subsets called memeplexes. The different memeplexes are considered as different cultures of frogs, each performing a local search. Within each memeplex, the individual frogs hold ideas that can be influenced by the ideas of other frogs, and these ideas evolve through a process of memetic evolution. After a defined number of memetic evolution steps, ideas are passed among memeplexes in a shuffling process. The local search and shuffling processes continue until the defined convergence criteria are satisfied.
An initial population of P frogs is created randomly. The frogs are then sorted in descending order according to their fitness, and the entire population is divided into m memeplexes, each containing n frogs. In this process, the first frog goes to the first memeplex, the second frog to the second memeplex, frog m to the m-th memeplex, frog m+1 back to the first memeplex, and so on. Within each memeplex, the frogs with the best and the worst fitness are identified as Xb and Xw respectively, and the frog with the globally best fitness is identified as Xg. Then a process similar to PSO is applied to improve only the frog with the worst fitness (not all frogs) in each cycle.
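The sorting and round-robin partitioning into memeplexes can be sketched as follows; frogs are represented simply as real-valued position vectors and the class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Minimal sketch of the sorting and round-robin memeplex partitioning
// described above; m is the number of memeplexes.
public class MemeplexPartition {

    static List<List<double[]>> partition(double[][] frogs, int m,
                                          ToDoubleFunction<double[]> fitness) {
        // Sort the population in descending order of fitness.
        double[][] sorted = frogs.clone();
        Arrays.sort(sorted, (a, b) -> Double.compare(
                fitness.applyAsDouble(b), fitness.applyAsDouble(a)));

        // Frog 1 -> memeplex 1, frog 2 -> memeplex 2, ..., frog m+1 -> memeplex 1.
        List<List<double[]>> memeplexes = new ArrayList<>();
        for (int i = 0; i < m; i++) memeplexes.add(new ArrayList<>());
        for (int i = 0; i < sorted.length; i++) {
            memeplexes.get(i % m).add(sorted[i]);
        }
        return memeplexes;
    }
}
```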
Accordingly, the position of the frog with the worst fitness is adjusted as follows:
Change in frog position: D_i = rand() * (X_b − X_w)
New position: X_w(new) = X_w(current) + D_i
where rand() is a random number between 0 and 1, X_b is the position of the best frog in the memeplex and X_w is the position of the worst frog. If this process produces a better solution, it replaces the worst frog; otherwise, the calculations above are repeated with respect to the global best frog (i.e. Xg replaces Xb).
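A minimal sketch of this worst-frog update, with the improvement test supplied by a caller-provided fitness function; the class and method names are illustrative.

```java
import java.util.Random;
import java.util.function.ToDoubleFunction;

// Hedged sketch of the worst-frog update within one memeplex; Xb, Xw and Xg
// follow the notation above.
public class FrogLeap {
    static final Random RAND = new Random();

    // Returns the improved position of the worst frog: first leap towards the
    // memeplex best Xb; if that does not improve fitness, leap towards the
    // global best Xg instead.
    static double[] updateWorstFrog(double[] xw, double[] xb, double[] xg,
                                    ToDoubleFunction<double[]> fitness) {
        double[] candidate = leap(xw, xb);
        if (fitness.applyAsDouble(candidate) > fitness.applyAsDouble(xw)) {
            return candidate;
        }
        return leap(xw, xg);          // Xg replaces Xb when the first leap fails
    }

    // D_i = rand() * (X_b - X_w); new position = X_w + D_i, per dimension.
    private static double[] leap(double[] xw, double[] xb) {
        double[] next = new double[xw.length];
        for (int d = 0; d < xw.length; d++) {
            double di = RAND.nextDouble() * (xb[d] - xw[d]);
            next[d] = xw[d] + di;
        }
        return next;
    }
}
```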
The steps involved in the Memetic Quantum Particle Swarm Optimizer are as follows:
Step 1: Initialize the population
Step 2: Set c1 to 1.82 and c2 to 1.97
Step 3: Evaluate the fitness of each individual particle (update pbest)
Step 4: Keep track of the individual with the highest fitness (gbest)
Step 5: Update the weighted mean best position
Step 6: Sort the population in descending order of fitness value
Step 7: Distribute the frogs into m memeplexes
Step 8: Iteratively update the worst frog in each memeplex
Step 9: Combine all frogs to form a new population
Step 10: Update the particle positions
Step 11: Terminate if the stopping condition is met, else go to step 3
7. Results and Discussion
The datasets used and the comparison of the general PSO and QPSO are presented in this section. To confirm the effectiveness of PSO and QPSO, both algorithms were coded in Java. The Lenses, Car Evaluation, Post-Operative Patient Care, Haberman's Survival and Zoo datasets were taken from the UCI Machine Learning Repository for the experiments. The details of the datasets are listed in Table 6.1. Association rule mining on the above-mentioned datasets was performed by both PSO and QPSO, and the efficiency of the two algorithms was compared using the output parameters mentioned earlier.
The conviction value obtained for both PSO and QPSO is found to be infinite, which implies that the confidence of the mined rules is maximal.
8. Conclusion
The QPSO algorithm is superior to the standard PSO mainly in three aspects. First, the quantum model introduces uncertainty, allowing the particles to take on a wider range of states and giving the algorithm a wider search space. Second, the introduction of mbest is a distinctive feature of QPSO: standard PSO converges fast, but this fast convergence often occurs in the first few iterations and the swarm then falls easily into a local optimum in the iterations that follow. With mbest, no particle can converge fast without taking its colleagues into account, which lowers the average error and reduces the frequency of premature convergence compared with PSO. Lastly, QPSO has fewer parameters than standard PSO and is easier to implement and run. Hence the performance of the algorithm is significantly improved by QPSO.
References
1. K. Indira, S. Kanmani, R. Jagan, G. Balaji, F. Milton Joseph, "Comparative Study on the Association Rule Mining Algorithms", in 2nd National Conference on Information Technology (NCIT 2013), pp. 255-261.
2. K. Indira, S. Kanmani, P. Prashanth, V. Harish, Konda Ramcharan Teja, "Population Based Search Methods in Mining Association Rules", in Third International Conference on Advances in Communication, Network, and Computing (CNC 2012), LNCS, 2012, pp. 255-261.
3. K. Indira, S. Kanmani, Gaurav Sethia D., Kumaran S., Prabhakar J., "Rule Acquisition in Data Mining using a Self Adaptive Genetic Algorithm", Communications in Computer and Information Science, Volume 204, Part I, pp. 171-178, 2011.
4. Z-H. Zhan, J. Zhang, Y. Li, H.S-H. Chung, "Adaptive Particle Swarm Optimization", IEEE Transactions on Systems, Man, and Cybernetics Part B: Cybernetics, 39(6), 2009, pp. 1362-1381.
5. Yourui Huang, Liguo Qo, Chaoli Tang, "Optimal Coverage Scheme based on QPSO in Wireless Sensor Networks", Journal of Networks, Vol. 7, No. 9, September 2012.
6. L.D.S. Coelho, "A Quantum Particle Swarm Optimizer with Chaotic Mutation Operator", Chaos, Solitons and Fractals, Vol. 37, No. 5, 2008, pp. 1409-1418.
7. A. Layeb, D.E. Saidouni, "Quantum Genetic Algorithm for Binary Decision Diagram Ordering Problem", International Journal of Computer Science and Network Security, Vol. 7, No. 9, 2007, pp. 130-135.
8. J. Sun, W.B. Xu, W. Fang, "Quantum-behaved particle swarm optimization algorithm with controlled diversity", in International Conference on Computational Science (3), 2006, pp. 847-854.
9. J. Sun, W.B. Xu, W. Fang, "Enhancing Global Search Ability of Quantum-Behaved Particle Swarm Optimization by Maintaining Diversity of the Swarm", in RSCTC 2006, 2006, pp. 736-745.