DOI: 10.1145/3487553.3524708

Mining with Rarity for Web Intelligence

Published: 16 August 2022

    Abstract

    Mining with rarity is a useful way to apply data mining to Web intelligence. In some scenarios, rare patterns are the meaningful ones in data-intelligent systems. Interesting pattern discovery plays an important role in real-world applications, and a great deal of work has been done in this field. In general, a high-utility pattern may include both frequent and rare items. Rare pattern discovery has gradually emerged and helps policy-makers devise related marketing strategies. However, the existing Apriori-like methods for discovering high-utility rare itemsets (HURIs) are not efficient. In this paper, we address the problem of mining with rarity and propose an efficient algorithm, named HURI-Miner, which uses a data structure called the revised utility-list to find HURIs in a transaction database. Furthermore, we apply several pruning strategies to reduce the search space and the computational cost. During mining, the HURIs are generated directly, without a generate-and-test step. Finally, a series of experimental results shows that the proposed method is both effective and efficient.
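
    To make the notion of a HURI concrete, the following is a minimal sketch of the problem definition only, not of HURI-Miner. It assumes the common formulation in which an itemset's utility is the sum of quantity times unit profit over the transactions that contain it, and an itemset counts as rare when its support count does not exceed a maximum-support threshold. The toy database, the profit table, and the names `mine_huris_bruteforce`, `min_util`, and `max_support` are all hypothetical. The sketch enumerates every itemset by brute force, which is the kind of exhaustive search the abstract says the proposed algorithm avoids.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
    {"a": 1, "b": 1, "d": 1},
]
# Hypothetical unit profit per item.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 20}


def mine_huris_bruteforce(transactions, unit_profit, min_util, max_support):
    """Return {itemset: (support, utility)} for itemsets that are both
    rare (0 < support <= max_support) and high-utility (utility >= min_util).

    Assumed definitions, not taken from the paper's text:
    support = number of transactions containing the whole itemset;
    utility = sum of quantity * unit profit over those transactions.
    """
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            support = 0
            utility = 0
            for t in transactions:
                if all(item in t for item in itemset):
                    support += 1
                    utility += sum(t[item] * unit_profit[item] for item in itemset)
            if 0 < support <= max_support and utility >= min_util:
                result[itemset] = (support, utility)
    return result


huris = mine_huris_bruteforce(transactions, unit_profit, min_util=20, max_support=1)
for itemset, (support, utility) in huris.items():
    print(itemset, "support =", support, "utility =", utility)
```

    With these assumed thresholds, item "d" appears in only one transaction yet contributes a large profit, so every reported itemset contains it. The brute-force loop visits all 2^n - 1 itemsets, which is why a practical miner needs pruning.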

    Cited By

    • (2023) Mining Rare Utility Patterns within Target Items. 2023 IEEE International Conference on Big Data (BigData), 6015-6024. DOI: 10.1109/BigData59044.2023.10386702. Online publication date: 15-Dec-2023
    • (2022) Fast Mining RFM Patterns for Behavioral Analytics. 2022 IEEE 9th International Conference on Data Science and Advanced Analytics (DSAA), 1-10. DOI: 10.1109/DSAA54385.2022.10032434. Online publication date: 13-Oct-2022
    • (2022) Targeted Mining of Rare High-Utility Patterns. 2022 IEEE International Conference on Big Data (Big Data), 6271-6280. DOI: 10.1109/BigData55660.2022.10020226. Online publication date: 17-Dec-2022

    Published In

    WWW '22: Companion Proceedings of the Web Conference 2022
    April 2022
    1338 pages
    ISBN: 9781450391306
    DOI: 10.1145/3487553
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Web of Things
    2. artificial intelligence
    3. data analytics
    4. rarity

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    WWW '22: The ACM Web Conference 2022
    April 25 - 29, 2022
    Virtual Event, Lyon, France

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
