DOI: 10.1145/1276958.1277340

A genetic algorithm for dynamic modelling and prediction of activity in document streams

Published: 07 July 2007

Abstract

This paper presents an evolutionary algorithm for modeling the arrival dates of document streams, where a document stream is any time-stamped collection of documents, such as newscasts, e-mails, scientific journal archives and weblog postings. The goal is to find a frequency curve that fits the data while circumventing the unavoidable noise. Classical dynamic programming algorithms are limited by memory and efficiency requirements, which can be a problem when dealing with long streams. This suggests exploring alternative search methods which, although they do not guarantee optimality, are far more efficient. Experiments have shown that the designed evolutionary algorithm is able to reach high-quality solutions in a short time. We have also explored different approaches to infer whether new arrivals increase or decrease interest in the topic the document stream is about. In particular, we present a variant of the evolutionary algorithm which is able to very quickly fit a stream extended with new data by taking advantage of the fit obtained for the original substream. These mechanisms can be used for real-time detection of changes in the trend of interest in a topic, an important application of this kind of model.
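
The abstract does not give the encoding, operators or fitness function used in the paper, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes a two-state model in which each gap between consecutive arrivals is exponentially distributed at either a base rate or a higher "burst" rate, and each move into the burst state is charged a fixed penalty. The names `fit_cost` and `evolve_fit` and the parameters `scale`, `penalty` and `warm_start` are hypothetical. The `warm_start` argument shows one possible way to reuse the fit of an original substream when the stream is extended.

```python
import math
import random

def fit_cost(states, gaps, rates, penalty):
    """Negative log-likelihood of the gaps plus a charge per move into the burst state."""
    cost, prev = 0.0, 0  # assumption: the stream starts in the base state
    for s, x in zip(states, gaps):
        cost += -math.log(rates[s]) + rates[s] * x  # -ln(rate * exp(-rate * x))
        if s > prev:
            cost += penalty
        prev = s
    return cost

def evolve_fit(gaps, scale=3.0, penalty=2.0, pop_size=40, generations=200,
               mutation_rate=0.02, warm_start=None, seed=0):
    """Search for a low-cost 0/1 state sequence; needs at least two positive gaps."""
    rng = random.Random(seed)
    n = len(gaps)
    base = n / sum(gaps)          # base rate: arrivals per unit time
    rates = (base, scale * base)  # state 1 is the higher-rate burst state

    def score(ind):
        return fit_cost(ind, gaps, rates, penalty)

    # Seed with the all-base fit so the result is never worse than it.
    seeds = [[0] * n]
    if warm_start:
        # Assumed reuse scheme: pad the earlier fit with its last state.
        seeds.append((list(warm_start) + [warm_start[-1]] * n)[:n])
    pop = seeds + [[rng.randint(0, 1) for _ in range(n)]
                   for _ in range(pop_size - len(seeds))]

    for _ in range(generations):
        pop.sort(key=score)
        nxt = [pop[0][:]]  # elitism: the best individual always survives
        while len(nxt) < pop_size:
            a = min(rng.sample(pop, 3), key=score)  # tournament selection
            b = min(rng.sample(pop, 3), key=score)
            cut = rng.randrange(1, n)               # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            nxt.append(child)
        pop = nxt

    best = min(pop, key=score)
    return best, score(best), rates

# Made-up gaps: slow arrivals, a short burst, then slow arrivals again.
gaps = [5.0, 6.0, 4.0, 0.5, 0.4, 0.6, 0.5, 5.5, 6.0]
states, cost, rates = evolve_fit(gaps)

# Extend the stream with new arrivals and reuse the earlier fit.
longer = gaps + [0.3, 0.4, 0.5]
states2, cost2, rates2 = evolve_fit(longer, warm_start=states, generations=20)
print(states, states2)
```

Unlike a dynamic programming fit, this search keeps only a fixed-size population in memory and can be stopped at any generation, at the price of giving up any guarantee of optimality.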

Published In

GECCO '07: Proceedings of the 9th annual conference on Genetic and evolutionary computation
July 2007
2313 pages
ISBN:9781595936974
DOI:10.1145/1276958

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. buzz detection
  2. event stream modelling
  3. evolutionary algorithms
  4. online text streams

Qualifiers

  • Article

Conference

GECCO07

Acceptance Rates

GECCO '07 Paper Acceptance Rate 266 of 577 submissions, 46%;
Overall Acceptance Rate 1,669 of 4,410 submissions, 38%
