Abstract
Numerical data (e.g., DNA micro-array data, sensor data) pose a challenging problem to existing frequent pattern mining methods, which handle them poorly. In this context, gradual patterns have recently been proposed to extract covariations of attributes, such as: “When X increases, Y decreases”. Some algorithms exist for mining frequent gradual patterns, but they do not scale to real-world databases. In this paper we present GLCM, the first algorithm for mining closed frequent gradual patterns, which offers strong complexity guarantees: the mining time is linear in the number of closed frequent gradual itemsets. Our experimental study shows that GLCM is two orders of magnitude faster than the state of the art, with constant, low memory usage. We also present PGLCM, a parallelization of GLCM capable of exploiting multicore processors, with good scale-up properties on complex datasets. These are the first algorithms capable of mining large real-world datasets to discover gradual patterns.
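To make the notion of a gradual pattern such as “When X increases, Y decreases” concrete, the following minimal sketch measures how well a small table of (X, Y) records follows that covariation. It assumes, as an illustration only, that the support of a gradual pattern is the length of the longest chain of records in which X strictly increases while Y strictly decreases, divided by the number of records; the abstract does not state this definition, and the function name `longest_gradual_chain` and the sample data are hypothetical. It is not the GLCM algorithm itself.

```python
from typing import List, Tuple


def longest_gradual_chain(rows: List[Tuple[float, float]]) -> int:
    """Length of the longest chain of rows where X strictly increases
    and Y strictly decreases (assumed support measure, see lead-in)."""
    # Any valid chain has strictly increasing X, so it respects sorted order.
    ordered = sorted(rows)
    # best[i] = length of the longest valid chain ending at ordered[i].
    best = [1] * len(ordered)
    for i, (xi, yi) in enumerate(ordered):
        for j in range(i):
            xj, yj = ordered[j]
            if xj < xi and yj > yi:
                best[i] = max(best[i], best[j] + 1)
    return max(best, default=0)


# Hypothetical records: (X, Y) pairs.
rows = [(1, 9), (2, 7), (3, 8), (4, 5), (5, 6)]
chain = longest_gradual_chain(rows)
support = chain / len(rows)
print(f"longest chain: {chain}, support: {support:.2f}")
```

This quadratic-time check handles a single fixed pattern; enumerating all closed frequent gradual patterns efficiently is the problem the paper addresses.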
Notes
Source: Yahoo Finance! http://finance.yahoo.com/.
Cite this article
Do, T.D.T., Termier, A., Laurent, A. et al. PGLCM: efficient parallel mining of closed frequent gradual itemsets. Knowl Inf Syst 43, 497–527 (2015). https://doi.org/10.1007/s10115-014-0749-8