Variable Support Mining of Frequent Itemsets over Data Streams Using Synopsis Vectors

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3918)

Abstract

Mining frequent itemsets over data streams has emerged as a research topic in recent years. Previous approaches generally use a fixed support threshold to discover the patterns in the stream. In practice, however, the threshold changes to meet the needs of the users and the characteristics of the incoming data. In a non-streaming environment, changing the threshold implies re-mining all the transactions. In a stream, the "look-once" property means that discarded transactions cannot be recovered, so re-mining the stream is impossible. We therefore propose a method for variable support mining of frequent itemsets over data streams. A synopsis vector is constructed to maintain statistics of past transactions and is invoked only when necessary. Experimental results show that our approach is efficient and scalable for variable support mining in data streams.
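
The problem setting can be illustrated with a minimal Python sketch. It is not the authors' synopsis vector, whose construction the abstract does not describe. It only shows why some retained statistics are needed if the support threshold is to change after transactions have been discarded. The class `ItemsetCounter`, its `max_size` parameter, and the toy stream are all hypothetical, and the exact counting used here would not scale to a real stream.

```python
from collections import Counter
from itertools import combinations


class ItemsetCounter:
    """Hypothetical helper: exact counts of small itemsets seen so far.

    Assumption: this is an illustrative stand-in, not the paper's
    synopsis vector.
    """

    def __init__(self, max_size=2):
        self.max_size = max_size   # largest itemset size tracked (assumption)
        self.counts = Counter()    # itemset (sorted tuple) -> occurrences
        self.n_transactions = 0

    def add(self, transaction):
        """Consume one transaction; the transaction itself is not stored."""
        items = sorted(set(transaction))
        self.n_transactions += 1
        for k in range(1, self.max_size + 1):
            for itemset in combinations(items, k):
                self.counts[itemset] += 1

    def frequent(self, min_support):
        """Return itemsets whose relative support is at least min_support."""
        threshold = min_support * self.n_transactions
        return {s: c for s, c in self.counts.items() if c >= threshold}


# Toy stream, read exactly once.
stream = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]
counter = ItemsetCounter(max_size=2)
for t in stream:
    counter.add(t)

# The threshold can be varied afterwards without re-reading the stream.
high = counter.frequent(0.75)
low = counter.frequent(0.5)
```

Both queries are answered from the retained counts alone, which is the property a variable-support method needs. A practical summary structure would have to bound its memory, which this sketch does not attempt.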

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lin, MY., Hsueh, SC., Hwang, SK. (2006). Variable Support Mining of Frequent Itemsets over Data Streams Using Synopsis Vectors. In: Ng, WK., Kitsuregawa, M., Li, J., Chang, K. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2006. Lecture Notes in Computer Science, vol 3918. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11731139_84

  • DOI: https://doi.org/10.1007/11731139_84

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33206-0

  • Online ISBN: 978-3-540-33207-7

  • eBook Packages: Computer Science, Computer Science (R0)
