Abstract
As technology advances, applications such as wireless sensor networks and Web click streams can produce and share floods of data. This calls for efficient techniques for mining useful information and knowledge from data streams. In this paper, we propose a novel algorithm for mining frequent itemsets from streams in a limited-memory environment. The algorithm uses a compact tree structure to capture the important contents of the data streams. The properties of this tree structure make it easy to maintain and allow it to be used for mining frequent itemsets, as well as other patterns such as constrained itemsets, even when the available memory is small.
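To make the idea of a compact tree over transactions concrete, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a plain prefix tree in which items are inserted in sorted order so that transactions with a common prefix share nodes, and it counts supports by brute-force subset enumeration. The names Node, PrefixTree, frequent_itemsets, and minsup are hypothetical, chosen only for illustration.

```python
from itertools import combinations


class Node:
    """One node of a prefix tree: an item, a count, and child nodes."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


class PrefixTree:
    """Hypothetical compact tree: transactions sharing a prefix share nodes."""

    def __init__(self):
        self.root = Node()

    def insert(self, transaction):
        # Sort items into a fixed order so that equal prefixes merge (assumption).
        node = self.root
        for item in sorted(set(transaction)):
            node = node.children.setdefault(item, Node(item))
            node.count += 1

    def paths(self):
        """Yield (itemset, n) where n transactions end exactly at that node."""
        stack = [(self.root, ())]
        while stack:
            node, prefix = stack.pop()
            ended_here = node.count - sum(c.count for c in node.children.values())
            if prefix and ended_here > 0:
                yield prefix, ended_here
            for child in node.children.values():
                stack.append((child, prefix + (child.item,)))


def frequent_itemsets(tree, minsup):
    """Brute-force support counting over the stored transactions."""
    support = {}
    for items, count in tree.paths():
        for k in range(1, len(items) + 1):
            for subset in combinations(items, k):
                support[subset] = support.get(subset, 0) + count
    return {s: c for s, c in support.items() if c >= minsup}


# Illustrative data only: five transactions arriving from a stream.
stream = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
tree = PrefixTree()
for transaction in stream:
    tree.insert(transaction)
result = frequent_itemsets(tree, minsup=3)
```

The subset enumeration is exponential in transaction length, so this sketch only illustrates how shared prefixes keep the stored data small. It does not address windowing, memory limits, or constraints, which the abstract names as goals of the paper.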
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Leung, C.K.-S., Brajczuk, D.A. (2008). Efficient Mining of Frequent Itemsets from Data Streams. In: Gray, A., Jeffery, K., Shao, J. (eds) Sharing Data, Information and Knowledge. BNCOD 2008. Lecture Notes in Computer Science, vol 5071. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70504-8_2
DOI: https://doi.org/10.1007/978-3-540-70504-8_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70503-1
Online ISBN: 978-3-540-70504-8
eBook Packages: Computer Science, Computer Science (R0)