Abstract
Obtaining an accurate class prediction for a query object is an important component of supervised classification. However, it can also be important to understand the classification in terms of the application domain, especially when the prediction disagrees with the expected result. Many accurate classifiers cannot explain their results in terms that an application expert can understand. Classifiers based on emerging patterns, on the other hand, are accurate and easy to understand. The goal of this article is to review the state-of-the-art methods for mining emerging patterns, classify them according to different taxonomies, and identify new trends. In this survey, we present the most important emerging pattern miners, categorizing them by the mining paradigm, the use of discretization, and the stage at which mining occurs. We provide detailed descriptions of the mining paradigms with their pros and cons, which helps researchers and users select the appropriate algorithm for a given application.
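The abstract does not define an emerging pattern. As a minimal sketch, assuming the usual formulation in the literature (a pattern is a set of items, and it is emerging for a class when its support there is much higher than in the other class), the growth rate of a pattern can be computed as below. The toy instances, the `attribute=value` item encoding, and the function names `support` and `growth_rate` are illustrative assumptions, not taken from the article.

```python
from typing import FrozenSet, List

Instance = FrozenSet[str]  # an object described as a set of "attribute=value" items


def support(pattern: FrozenSet[str], instances: List[Instance]) -> float:
    """Fraction of instances that contain every item of the pattern."""
    if not instances:
        return 0.0
    return sum(pattern <= inst for inst in instances) / len(instances)


def growth_rate(pattern: FrozenSet[str],
                target: List[Instance],
                other: List[Instance]) -> float:
    """Ratio of the pattern's support in the target class to its support
    in the other class; infinite when the pattern never occurs in the
    other class but does occur in the target class."""
    s_target = support(pattern, target)
    s_other = support(pattern, other)
    if s_other == 0.0:
        return float("inf") if s_target > 0.0 else 0.0
    return s_target / s_other


# Hypothetical two-class toy data.
positive = [
    frozenset({"outlook=sunny", "wind=weak"}),
    frozenset({"outlook=sunny", "wind=strong"}),
    frozenset({"outlook=rain", "wind=weak"}),
    frozenset({"outlook=sunny", "wind=weak"}),
]
negative = [
    frozenset({"outlook=rain", "wind=strong"}),
    frozenset({"outlook=sunny", "wind=strong"}),
    frozenset({"outlook=rain", "wind=strong"}),
    frozenset({"outlook=rain", "wind=weak"}),
]

single = growth_rate(frozenset({"outlook=sunny"}), positive, negative)
pair = growth_rate(frozenset({"outlook=sunny", "wind=weak"}), positive, negative)
print(single, pair)
```

A pattern with a high growth rate reads directly as a domain statement ("sunny outlook is three times more frequent in the positive class"), which is the kind of understandability the abstract refers to. A pattern with infinite growth rate, occurring in one class only, is commonly called a jumping emerging pattern.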
Cite this article
García-Borroto, M., Martínez-Trinidad, J.F. & Carrasco-Ochoa, J.A. A survey of emerging patterns for supervised classification. Artif Intell Rev 42, 705–721 (2014). https://doi.org/10.1007/s10462-012-9355-x
DOI: https://doi.org/10.1007/s10462-012-9355-x