
Discovering Closed Items Based on Logic Pattern

2013, IAEME PUBLICATION

ABSTRACT

In data mining, association rules can be used to discover the relations among closed items. We propose a method for finding closed and dependent items using the propositional-logic connective of bi-implication. By generating association rules, computing the confidence of each rule, and applying propositional logic subject to a minimum-confidence threshold, we can identify closed items and regard those items as closed and mutually dependent.

International Journal of Research in Computer Applications & Information Technology, Volume 1, Issue 1, July-September 2013, pp. 32-37, © IASTER 2013, www.iaster.com, ISSN Online: 2347-5099, Print: 2348-0009

K.V. Kalyan, A. Muralidhar
School of Computing Science and Engineering, VIT University, Chennai

Keywords: Association rules, Bi-implication, Closed and dependent items, Minimum confidence, Propositional logic.

1. INTRODUCTION

In data mining, association rule mining reveals relations among items, and it typically generates many rules over a given set of items. Some algorithms generate association rules using background knowledge, while others generate them without any background knowledge. Background knowledge is usually expressed as threshold values, and the algorithms that use it generate rules with respect to those thresholds: frequently occurring itemsets are found from the given set of items according to a threshold value. The well-known Apriori algorithm uses two such thresholds, minimum support and minimum confidence. Minimum support restricts generation to itemsets whose support is greater than the threshold; association rules are then generated from the resulting frequent itemsets. Minimum confidence identifies the strong rules: a rule is strong if its confidence is greater than the given minimum confidence.
However, such prediction works in one direction only. For example, if item I1 is associated with item I2 (I1 → I2), then I2 is expected to occur whenever I1 occurs. Prediction runs only from I1 to I2, not from I2 back to I1, so it says nothing about closed itemsets. Some algorithms generate association rules without any background knowledge, using logical propositions and no threshold values; there, too, prediction runs only from one item to the other. Other algorithms generate closed frequent itemsets: for example, the CLOSET algorithm generates frequent closed itemsets using a frequent-pattern tree structure with a minimum support threshold, and depending on that value produces the closed frequent items. Again, prediction is taken from a single side only, and closed frequent itemsets are identified on that basis.

Here we present a method to predict closed itemsets from both sides of two different items using propositional logic, where the two sides are I1 → I2 and I2 → I1. In this way we can predict that the two items are closed and dependent on each other. The propositional connective used is bi-implication.

2. DEFINITION

Let I = {I1, I2, I3, ..., In} be a set of items. An itemset P is a non-empty subset of I. A pair <tid, P> is called a transaction, where tid is the transaction identifier and P is an itemset, i.e. P = {Ij1, Ij2, Ij3, ..., Ijn}. A transaction database TDB is a set of transactions. An itemset P is contained in a transaction <tid, Q> if P ⊆ Q.
Given a transaction database TDB, the support of an itemset P, denoted sup(P), is the number of transactions in TDB that contain P. An association rule R: P ⇒ Q is an implication between two itemsets P and Q, where P, Q ⊂ I and P ∩ Q = ∅. The support of the rule, denoted sup(P ⇒ Q), is defined as sup(P ∪ Q). The confidence of the rule, denoted conf(P ⇒ Q), is defined as sup(P ∪ Q) / sup(P). Mining association rules can be divided into two problems:
1. Find all frequent itemsets in the transaction database with respect to the given support threshold. An itemset is called frequent if its support is no less than min_sup.
2. For each frequent itemset Q found, generate all association rules P ⇒ Q - P, where P ⊂ Q, whose confidence is no less than min_conf.

3. PROPOSED WORK

Using minimum support, minimum confidence, and bi-implication logic, we can derive the closed and dependent items. The Apriori algorithm generates association rules using minimum support and minimum confidence: frequent itemsets are generated from candidate sets, checking the minimum support for every candidate set, until the itemsets satisfying the minimum support remain. Applying the confidence formula to the obtained itemsets generates the association rules, and the rules whose confidence exceeds the minimum confidence are the strong rules. Each such rule specifies only a single direction between items. After the association rules are obtained, bi-implication is applied to their confidences: if the confidences of the two opposite rules over the same items, i.e. I1 → I2 and I2 → I1, are both greater than the minimum confidence, then the two items are closed and dependent.

3.1 Implication and Bi-Implication

An implication consists of a hypothesis and a conclusion, written A → B (A implies B), where A is the hypothesis and B is the conclusion. An implication also states that a true hypothesis can only imply a true conclusion.
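As an illustration, the implication connective can be checked mechanically. The helper below is our own sketch for this paper's terminology, not part of the proposed method:

```python
def implies(a: bool, b: bool) -> bool:
    """Material implication A -> B: false only when A is true and B is false."""
    return (not a) or b

# Enumerating the four combinations of A and B reproduces
# the implication truth table given below.
for a in (True, False):
    for b in (True, False):
        print(a, b, implies(a, b))
```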
The truth table of implication is as follows:

A	B	A → B
T	T	T
T	F	F
F	T	T
F	F	T

Bi-implication is a biconditional ("if and only if") statement: A ↔ B holds when A → B and B → A are both true. A ↔ B also satisfies the relation that (A → B) ∧ (B → A) is true. Bi-implication can be stated as follows:
- If A is observed then B is also observed, and
- If B is observed then A is also observed.
Then A and B are strongly associated and dependent on each other. The truth table of bi-implication is as follows:

A	B	A → B	B → A	(A → B) ∧ (B → A)	A ↔ B
T	T	T	T	T	T
T	F	F	T	F	F
F	T	T	F	F	F
F	F	T	T	T	T

3.2 Mapping of Association Rules to Bi-Implication

Implication can be mapped to association rules: A → B as an association rule says that if A occurs then B also occurs. The truth values of implication apply accordingly, but since a false hypothesis implying a true conclusion still yields a true implication, a rule predicting one item from another must be treated as valid only when the prediction is a positive one. When a positive prediction is considered, it should be a true prediction only. For example, if an association rule I1 → I2 is a positive prediction, then the result of the rule should be true; if the result is false, it is considered a negative rule. Therefore implication alone is not completely applicable to prediction. Bi-implication, however, can be mapped to association rules: by the truth values above, if item A occurs then item B definitely occurs, and likewise for B → A. Bi-implication states that (A → B) ∧ (B → A) gives the truth value: if A → B and B → A are both true, the bi-implication is true. Therefore, under association rules, a positive prediction must be a true prediction.
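The mapping from rule confidences to bi-implication can be sketched as two small predicates. The function names and signatures are illustrative assumptions of ours, not the authors' code:

```python
def bi_implication(a: bool, b: bool) -> bool:
    """Truth value of A <-> B: true exactly when (A -> B) and (B -> A) both hold."""
    return ((not a) or b) and ((not b) or a)

def closed_and_dependent(conf_pq: float, conf_qp: float, min_conf: float) -> bool:
    """P and Q are closed and dependent when both directed rules
    P -> Q and Q -> P are strong, i.e. each meets the minimum confidence."""
    return conf_pq >= min_conf and conf_qp >= min_conf
```

For the numbers used in the experiment of Section 4, `closed_and_dependent(75, 75, 70)` holds (the pair of rules {5} → {2, 3} and {2, 3} → {5}), while `closed_and_dependent(100, 60, 70)` fails, since only one direction is strong.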
From bi-implication, if A → B is one association and B → A is another and both are true, then (A → B) ∧ (B → A), i.e. A ↔ B, is true. From this we can conclude that A and B are closed and dependent on each other.

Example: bi-implication can be stated with the following statements: if a mobile is observed then a SIM is also observed, and if a SIM is observed then a mobile is also observed. From these statements, SIM and mobile are closed and dependent on each other.

4. EXPERIMENT

Given a numerical dataset, we apply the Apriori algorithm to generate frequent itemsets using minimum support, and then generate association rules using minimum confidence. The numerical data are as follows:

Transaction ID	Items
100	1, 3, 4
200	2, 3, 5
300	1, 2, 3, 5
400	2, 5
500	2, 3, 5, 6
600	2, 3

Here the minimum support is 2 and the minimum confidence is 70%. To find the closed and dependent items in this dataset, the following steps are carried out.
1. Using the Apriori algorithm, generate the frequent itemsets subject to the minimum support. From the given dataset, the frequent itemset obtained is {2, 3, 5}, whose support is greater than or equal to 2.
2. From the frequent itemset, generate the association rules. For this dataset the rules obtained are:
Rule 1: {2} → {3, 5}
Rule 2: {3} → {2, 5}
Rule 3: {5} → {2, 3}
Rule 4: {2, 3} → {5}
Rule 5: {3, 5} → {2}
Rule 6: {2, 5} → {3}
3. After generating the association rules, find the confidence of each rule. The confidence of a rule P → Q is defined as sup(P ∪ Q) / sup(P).
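The confidences in Step 3 can be reproduced directly from the transaction table. Below is a minimal sketch of that computation (the variable and function names are our own, not from the paper):

```python
# The six transactions from the experiment's dataset.
transactions = {
    100: {1, 3, 4},
    200: {2, 3, 5},
    300: {1, 2, 3, 5},
    400: {2, 5},
    500: {2, 3, 5, 6},
    600: {2, 3},
}

def sup(itemset):
    """Support: number of transactions containing every item of the itemset."""
    return sum(1 for items in transactions.values() if itemset <= items)

def conf(p, q):
    """Confidence of rule P -> Q as a percentage: sup(P ∪ Q) / sup(P) * 100."""
    return 100 * sup(p | q) / sup(p)

rules = [({2}, {3, 5}), ({3}, {2, 5}), ({5}, {2, 3}),
         ({2, 3}, {5}), ({3, 5}, {2}), ({2, 5}, {3})]
for p, q in rules:
    print(f"{p} -> {q}: {conf(p, q):.0f}%")
# Prints 60, 60, 75, 75, 100 and 75 for the six rules, in order.
```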
The confidences of the rules are:

Rule	Confidence (%)
{2} → {3, 5}	60
{3} → {2, 5}	60
{5} → {2, 3}	75
{2, 3} → {5}	75
{3, 5} → {2}	100
{2, 5} → {3}	75

4. Find the strong rules, whose confidence is greater than the minimum confidence. The strong rules obtained with confidence greater than 70% are:
Rule 3: {5} → {2, 3}
Rule 4: {2, 3} → {5}
Rule 5: {3, 5} → {2}
Rule 6: {2, 5} → {3}
5. Check the biconditional (bi-implication) over the strong rules: for a pair A → B and B → A, both confidences must be greater than the minimum confidence. Applying bi-implication to the strong rules Rule 3, Rule 4, Rule 5, and Rule 6:
For Rule 3: {5} → {2, 3} has confidence 75 and {2, 3} → {5} has confidence 75, so the bi-implication ({5} → {2, 3}) ∧ ({2, 3} → {5}) > 70 holds. --- (1)
For Rule 5: {3, 5} → {2} has confidence 100 but {2} → {3, 5} has confidence 60, so the bi-implication ({3, 5} → {2}) ∧ ({2} → {3, 5}) > 70 does not hold. --- (2)
For Rule 6: {2, 5} → {3} has confidence 75 but {3} → {2, 5} has confidence 60, so the bi-implication ({2, 5} → {3}) ∧ ({3} → {2, 5}) > 70 does not hold. --- (3)
6. Keep the rules that satisfy the bi-implication truth value with both confidences greater than the minimum confidence; the items in those rules are the closed and dependent items. From (1), (2), and (3), Rule 3 (together with its converse, Rule 4) satisfies the true condition of bi-implication with confidence greater than the minimum confidence of 70. Therefore items 2, 3, and 5 are closed and dependent on each other.

5. CONCLUSION

Using Apriori, we generate association rules based on minimum support and minimum confidence, each of which speaks about only one side of a rule. Logical bi-implication, which can be mapped onto association rules, is then used to find the closed and dependent items based on the confidences of rule pairs, so that both sides of a rule satisfy the truth values of bi-implication.