Abstract
Recent improvements in storage devices and networks have caused the volume of data to grow rapidly, and the knowledge embedded in a data stream must be found as quickly as possible. A data stream changes over time, so itemsets that were not frequent can later become frequent. The volume of a data stream is also so large that it can hardly be stored in finite memory. Existing research does not offer an appropriate method for finding frequent itemsets that reflect the flow of time; it provides only frequent items based on total aggregate values. In this paper we propose a novel algorithm for finding the relatively frequent itemsets over time in a data stream. We also propose a method for storing frequent and sub-frequent items that takes limited memory into account, and a method for updating time-variant frequent items. The proposed technique improves the accuracy of detecting changes in the frequent itemsets over time in a data stream. Moreover, it uses the limited memory space efficiently while still storing all frequent itemsets.
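The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea it describes, not the authors' method. It assumes exponential decay of old counts as the way time is reflected, counts single items rather than full itemsets, and uses hypothetical names and parameters (`DecayedItemCounter`, `decay`, `frequent`, `sub_frequent`). Items whose decayed support falls below the sub-frequent threshold are dropped to bound memory.

```python
class DecayedItemCounter:
    """Illustrative time-sensitive item counter (assumed design, not from the paper)."""

    def __init__(self, decay=0.5, frequent=0.5, sub_frequent=0.2):
        self.decay = decay                # weight applied to old counts per batch
        self.frequent = frequent          # minimum support of a frequent item
        self.sub_frequent = sub_frequent  # minimum support kept in memory
        self.counts = {}                  # item -> decayed occurrence count
        self.total = 0.0                  # decayed number of transactions

    def add_batch(self, transactions):
        # Age everything seen so far, so recent batches weigh more.
        self.total *= self.decay
        for item in self.counts:
            self.counts[item] *= self.decay
        # Count the new batch; each item counts once per transaction.
        for transaction in transactions:
            self.total += 1.0
            for item in set(transaction):
                self.counts[item] = self.counts.get(item, 0.0) + 1.0
        # Keep only frequent and sub-frequent items to limit memory use.
        self.counts = {
            item: count
            for item, count in self.counts.items()
            if count / self.total >= self.sub_frequent
        }

    def support(self, item):
        if self.total == 0.0:
            return 0.0
        return self.counts.get(item, 0.0) / self.total

    def frequent_items(self):
        return sorted(i for i in self.counts if self.support(i) >= self.frequent)

    def sub_frequent_items(self):
        # Stored items that are not (yet) frequent but may become so later.
        return sorted(i for i in self.counts if self.support(i) < self.frequent)


if __name__ == "__main__":
    counter = DecayedItemCounter()
    counter.add_batch([["a", "b"], ["a"], ["a", "c"], ["a", "b"]])
    print(counter.frequent_items(), counter.sub_frequent_items())
    counter.add_batch([["c", "d"], ["c", "d"], ["c"], ["c", "d"]])
    print(counter.frequent_items(), counter.sub_frequent_items())
```

In this toy run, item `c` is only sub-frequent after the first batch but becomes frequent after the second, while `a` moves the other way. Keeping sub-frequent items in memory is what allows such a change to be noticed without storing the whole stream.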
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Park, TS., Lee, JH., Park, SH., Choi, B., Kim, DH. (2006). Search Method of Time Sensitive Frequent Itemsets in Data Streams. In: Martínez-Trinidad, J.F., Carrasco Ochoa, J.A., Kittler, J. (eds) Progress in Pattern Recognition, Image Analysis and Applications. CIARP 2006. Lecture Notes in Computer Science, vol 4225. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11892755_53
DOI: https://doi.org/10.1007/11892755_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-46556-0
Online ISBN: 978-3-540-46557-7
eBook Packages: Computer Science, Computer Science (R0)