research-article

Mining Profitable and Concise Patterns in Large-Scale Internet of Things Environments

Authors: Jerry Chun-Wei Lin, Youcef Djenouri, Gautam Srivastava, Philippe Fournier-Viger

Academic Editor: Xingsi Xue

Wireless Communications and Mobile Computing, Volume 2021

https://doi.org/10.1155/2021/6653816

Published: 01 January 2021

Abstract

High-utility itemset mining (HUIM) has been studied extensively in recent years and applied in many domains, most notably market-basket analysis and related applications. Because current market-basket scenarios also rely on IoT equipment such as sensors and smart devices to collect information, it is necessary to consider mining high-utility itemsets (HUIs) in large-scale databases, particularly in IoT settings. This work presents GMR-Miner, a GA-based MapReduce model for mining closed high-utility patterns in large-scale databases. The k-means model is first used to group transactions by their correlation, based on the frequency factor. A genetic algorithm (GA) is then applied within the developed MapReduce framework to explore promising candidates in a limited time. The developed 3-tier MapReduce model can also be deployed readily on Spark to handle large-scale databases for knowledge discovery of closed high-utility patterns. Extensive experiments were conducted to compare GMR-Miner with the well-known, state-of-the-art CLS-Miner. The results show that GMR-Miner outperforms CLS-Miner in memory usage, scalability, and runtime.
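To make the target of the mining task concrete, the following is a minimal brute-force sketch of what a closed high-utility itemset is. It is not GMR-Miner and uses no k-means, GA, or MapReduce step. It assumes the definition commonly used in the HUIM literature (an itemset is high-utility if its total utility meets a threshold, and closed if no proper superset has the same support); the abstract does not restate the definition. The toy transactions, unit profits, function names, and the threshold of 15 are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 2},
    {"a": 1, "b": 1, "c": 2},
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 4}


def support(itemset):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if all(i in t for i in itemset))


def total_utility(itemset):
    """Sum of quantity * unit profit over all transactions containing the itemset."""
    total = 0
    for t in transactions:
        if all(i in t for i in itemset):
            total += sum(t[i] * unit_profit[i] for i in itemset)
    return total


def closed_high_utility_itemsets(min_utility):
    """Enumerate all itemsets (exponential; only viable on toy data)."""
    items = sorted(unit_profit)
    candidates = [
        frozenset(c)
        for k in range(1, len(items) + 1)
        for c in combinations(items, k)
    ]
    # Keep only itemsets that occur in at least one transaction.
    present = [c for c in candidates if support(c) > 0]
    result = {}
    for x in present:
        u = total_utility(x)
        if u < min_utility:
            continue
        # Closed: no proper superset occurs in exactly the same number of transactions.
        if not any(x < y and support(y) == support(x) for y in present):
            result[x] = u
    return result


if __name__ == "__main__":
    for itemset, u in closed_high_utility_itemsets(15).items():
        print(sorted(itemset), u)
```

On this toy database with a threshold of 15, the itemset {a} has utility 20 but is not closed, because {a, c} appears in the same three transactions; the sketch therefore reports {a, c} and {a, b, c} only. Exhaustive enumeration grows exponentially with the number of items, which is the cost that heuristic search and distributed processing aim to avoid.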

Index Terms

Information systems > Information systems applications > Data mining

Published In

Wireless Communications & Mobile Computing, Volume 2021, 14355 pages

ISSN: 1530-8669

Copyright © 2021 Jerry Chun-Wei Lin et al.

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Publisher

John Wiley and Sons Ltd.

United Kingdom
