tutorial

Unsupervised Rare Pattern Mining: A Survey

Published: 24 May 2016

Abstract

Association rule mining was first introduced to examine patterns among frequent items. The original motivation for seeking these rules arose from the need to examine customer purchasing behaviour in supermarket transaction data. It seeks to identify combinations of items, or itemsets, whose presence in a transaction affects the likelihood that another specific item or itemset is also present. In recent years, there has been an increasing demand for rare association rule mining. Detecting rare patterns in data is a vital task, with numerous high-impact applications in medicine, finance, and security. This survey aims to provide a general, comprehensive, and structured overview of the state-of-the-art methods for rare pattern mining. We investigate the problems in finding rare rules using traditional association rule mining. As rare association rule mining has not been well explored, there is still specific groundwork that needs to be established. We discuss some of the major issues in rare association rule mining and also look at current algorithms. As a contribution, we give a general framework that categorizes algorithms as either Apriori based or tree based, and we highlight the differences between these methods. Finally, we present several real-world applications of rare pattern mining in diverse domains. We conclude our survey with a discussion of open and practical challenges in the field.
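To make the abstract's central point concrete, the following is a minimal sketch of how support and confidence are conventionally computed, and of why a single minimum-support threshold tends to discard rare but reliable rules. It is not taken from the survey: the toy transactions, the function names (`support`, `confidence`, `frequent_pairs`), and the threshold values are all assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical toy transactions; these are NOT data from the survey.
transactions = [
    {"bread", "milk"},
    {"bread", "milk", "butter"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk"},
    {"bread", "milk"},
    {"caviar", "vodka"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"caviar", "vodka", "bread"},
]


def support(itemset, db):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in db) / len(db)


def confidence(antecedent, consequent, db):
    """Estimate of P(consequent | antecedent) from the transactions."""
    both = set(antecedent) | set(consequent)
    return support(both, db) / support(antecedent, db)


def frequent_pairs(db, min_support):
    """All 2-itemsets whose support reaches `min_support` (brute force)."""
    items = sorted(set().union(*db))
    return [p for p in combinations(items, 2) if support(p, db) >= min_support]


# The rare pair occurs in only 2 of 10 transactions, yet whenever
# "caviar" appears, "vodka" appears too.
rare_support = support({"caviar", "vodka"}, transactions)
rare_confidence = confidence({"caviar"}, {"vodka"}, transactions)

# A threshold tuned for frequent items (assumed 0.25) drops the rare pair;
# a lower threshold (assumed 0.15) recovers it.
high_threshold_pairs = frequent_pairs(transactions, 0.25)
low_threshold_pairs = frequent_pairs(transactions, 0.15)
```

In this toy data the rule caviar → vodka holds every time caviar is bought, but the pair falls below the higher threshold and is pruned before any rule is generated. Lowering the threshold recovers it, although on realistic data doing so typically also admits a large number of uninteresting itemsets, which is the tension that rare pattern mining methods try to resolve.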



Published In

ACM Transactions on Knowledge Discovery from Data  Volume 10, Issue 4
Special Issue on SIGKDD 2014, Special Issue on BIGCHAT and Regular Papers
July 2016
417 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/2936311
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 24 May 2016
Accepted: 01 February 2016
Revised: 01 October 2015
Received: 01 December 2014
Published in TKDD Volume 10, Issue 4


Author Tags

  1. Association rule mining
  2. infrequent patterns
  3. rare rules

Qualifiers

  • Tutorial
  • Research
  • Refereed

Funding Sources

  • University of Auckland (FRDF)
  • University of Malaya and Ministry of Education (High Impact Research)

