
Mining Closed High Utility Itemsets based on Propositional Satisfiability

Authors:

Badran Raddaoui,

Boutheina Ben Yaghlane

Volume 136, Issue C

https://doi.org/10.1016/j.datak.2021.101927

Published: 01 November 2021

Abstract

High utility itemset mining is the problem of finding the sets of items whose utility exceeds a user-specified threshold. It generalizes the classical problem of frequent itemset mining and is a well-known task in data analysis and data mining, used in a wide range of real applications. In this paper, we first propose a symbolic Artificial Intelligence approach for computing the set of all closed high utility itemsets from transaction databases. Our approach reduces the task to enumeration problems in propositional satisfiability (SAT). We then improve the efficiency of this SAT-based approach using the weighted clique cover problem. Next, to improve scalability, we apply a decomposition technique that derives smaller, independent sub-problems which together capture all the closed high utility itemsets. Our SAT-based encoding can also benefit continually from the latest improvements in SAT solvers and model enumeration algorithms. Finally, empirical evaluations on different real-world datasets show that the proposed approach is very competitive with state-of-the-art specialized algorithms for high utility itemset mining, while remaining flexible enough to accommodate additional constraints when finding closed high utility itemsets.
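
To make the notion of a closed high utility itemset concrete, the following is a minimal brute-force sketch in Python. It is not the SAT-based encoding proposed in the paper; it only illustrates the definitions on a toy example. The database `DB`, the threshold value, and all function names are hypothetical. The sketch assumes the usual notion of closedness (an itemset is closed if no proper superset occurs in exactly the same transactions) and, following the wording above, treats an itemset as high utility when its utility is strictly greater than the threshold.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to its utility
# in that transaction (e.g., quantity times unit profit).
DB = [
    {"a": 4, "b": 2},
    {"a": 4, "b": 2, "c": 6},
    {"b": 2, "c": 3},
    {"a": 8, "b": 1, "c": 3},
]


def cover(itemset, db):
    """Return the indices of the transactions that contain every item."""
    return [i for i, t in enumerate(db) if all(x in t for x in itemset)]


def utility(itemset, db):
    """Sum the utilities of the itemset's items over its covering transactions."""
    return sum(sum(db[i][x] for x in itemset) for i in cover(itemset, db))


def is_closed(itemset, db):
    """An itemset is closed if it equals the intersection of its covering transactions."""
    tids = cover(itemset, db)
    if not tids:
        return False
    closure = set.intersection(*(set(db[i]) for i in tids))
    return closure == set(itemset)


def closed_high_utility_itemsets(db, min_util):
    """Enumerate all itemsets by brute force (exponential; for illustration only)."""
    items = sorted(set().union(*db))
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            x = frozenset(combo)
            if is_closed(x, db):
                u = utility(x, db)
                if u > min_util:  # strict, following the abstract's wording
                    result[x] = u
    return result


if __name__ == "__main__":
    for itemset, u in closed_high_utility_itemsets(DB, 20).items():
        print(sorted(itemset), u)
```

On this toy database, `{a}` is not closed because every transaction containing `a` also contains `b`, so `{a, b}` has the same cover. Exhaustive enumeration like this grows exponentially with the number of items, which is the kind of cost that dedicated algorithms and the SAT-based approach described above aim to avoid.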

Published In

Data & Knowledge Engineering, Volume 136, Issue C, Nov 2021, 93 pages

ISSN: 0169-023X

Elsevier B.V.

Publisher: Elsevier Science Publishers B. V., Netherlands
