DOI: 10.1145/1076034.1076055
Article

A probabilistic model for retrospective news event detection

Published: 15 August 2005

Abstract

Retrospective news event detection (RED) is defined as the discovery of previously unidentified events in a historical news corpus. Although both the content and the time information of news articles are helpful to RED, most research focuses on content, and few works have sought better ways to use time information. In this paper, we explore both directions, based on two characteristics of news articles. On the one hand, news articles are always triggered by events; on the other hand, similar articles reporting the same event often appear redundantly across many news sources. The former suggests a generative model of news articles, and the latter provides a data-rich environment in which to perform RED. Taking these characteristics into account, we propose a probabilistic model that incorporates both content and time information in a unified framework. This model gives new representations of both news articles and news events. Furthermore, based on this approach, we build an interactive RED system, HISCOVERY, which provides two additional functions for presenting events: Photo Story and Chronicle.
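
To make the idea of a generative model over content and time concrete, the following is a minimal sketch, not the paper's actual model. It assumes each event emits words from a multinomial distribution and timestamps from a Gaussian, and it fits the mixture by expectation maximization. The abstract does not state these distribution choices; they, along with the name `fit_events`, the bag-of-words input format, and the toy data, are illustrative assumptions.

```python
import math
import random

def fit_events(docs, times, k, vocab_size, iters=50, seed=0):
    """Toy EM for an event mixture (assumed form, not the paper's model):
    each event has a multinomial over words and a Gaussian over time."""
    rng = random.Random(seed)
    n = len(docs)
    # Start from random soft assignments of articles to events.
    resp = []
    for _ in range(n):
        row = [rng.random() + 1e-3 for _ in range(k)]
        total = sum(row)
        resp.append([r / total for r in row])
    for _ in range(iters):
        # M-step: re-estimate event priors, word distributions, time Gaussians.
        params = []
        for j in range(k):
            weight = sum(resp[i][j] for i in range(n)) + 1e-12
            word = [1.0] * vocab_size  # add-one smoothing
            for i in range(n):
                for w, c in enumerate(docs[i]):
                    word[w] += resp[i][j] * c
            norm = sum(word)
            log_word = [math.log(x / norm) for x in word]
            mean = sum(resp[i][j] * times[i] for i in range(n)) / weight
            var = sum(resp[i][j] * (times[i] - mean) ** 2
                      for i in range(n)) / weight + 1e-3  # variance floor
            params.append((math.log(weight / n), log_word, mean, var))
        # E-step: posterior over events for each article, in log space.
        for i in range(n):
            scores = []
            for log_prior, log_word, mean, var in params:
                s = log_prior
                s += sum(c * log_word[w] for w, c in enumerate(docs[i]))
                s += -0.5 * math.log(2 * math.pi * var)
                s += -0.5 * (times[i] - mean) ** 2 / var
                scores.append(s)
            top = max(scores)
            exps = [math.exp(s - top) for s in scores]
            total = sum(exps)
            resp[i] = [e / total for e in exps]
    labels = [max(range(k), key=lambda j: row[j]) for row in resp]
    return labels, [p[2] for p in params]

# Hypothetical toy corpus: word-count vectors over a 4-word vocabulary,
# with publication times in days. Two bursts, two vocabularies.
docs = [[5, 5, 0, 0]] * 3 + [[0, 0, 5, 5]] * 3
times = [1, 2, 3, 50, 51, 52]
labels, means = fit_events(docs, times, k=2, vocab_size=4)
```

Here `labels` holds the most probable event for each article and `means` the estimated time centre of each event. Because the time term enters the same likelihood as the word term, articles with similar vocabulary but distant publication dates can still be assigned to different events.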



Information & Contributors

Information

Published In

SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
August 2005
708 pages
ISBN: 1595930345
DOI: 10.1145/1076034
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. clustering
  2. expectation maximization
  3. maximum likelihood
  4. retrospective news event detection

Qualifiers

  • Article

Conference

SIGIR05
Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 36
  • Downloads (last 6 weeks): 8
Reflects downloads up to 30 Aug 2024

Citations

Cited By

  • (2024) Global News Synchrony and Diversity During the Start of the COVID-19 Pandemic. Proceedings of the ACM Web Conference 2024 (2639-2650). DOI: 10.1145/3589334.3645645. Online publication date: 13-May-2024
  • (2024) Disease outbreak prediction using natural language processing: a review. Knowledge and Information Systems. DOI: 10.1007/s10115-024-02192-6. Online publication date: 6-Aug-2024
  • (2023) Machine Learning Based Representative Spatio-Temporal Event Documents Classification. Applied Sciences 13:7 (4230). DOI: 10.3390/app13074230. Online publication date: 27-Mar-2023
  • (2023) An Approach for Detecting Dangerous Events from Online Text Using Transformers. 2023 IEEE International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT) (256-262). DOI: 10.1109/WI-IAT59888.2023.00040. Online publication date: 26-Oct-2023
  • (2022) A Survey of Data Representation for Multi-Modality Event Detection and Evolution. Applied Sciences 12:4 (2204). DOI: 10.3390/app12042204. Online publication date: 20-Feb-2022
  • (2022) Enhancement of Twitter event detection using news streams. Natural Language Engineering (1-20). DOI: 10.1017/S1351324921000462. Online publication date: 24-Jan-2022
  • (2022) GeoClust. Expert Systems with Applications: An International Journal 210:C. DOI: 10.1016/j.eswa.2022.118461. Online publication date: 30-Dec-2022
  • (2022) Detection of dangerous events on social media: a critical review. Social Network Analysis and Mining 12:1. DOI: 10.1007/s13278-022-00980-y. Online publication date: 22-Oct-2022
  • (2022) Event Detection in Live Twitter Streams Using Tf-Idf and Clustering Algorithms. Data, Engineering and Applications (469-480). DOI: 10.1007/978-981-19-4687-5_36. Online publication date: 12-Oct-2022
  • (2022) Efficacy of Online Event Detection with Contextual and Structural Biases. Emerging Technologies in Computer Engineering: Cognitive Computing and Intelligent IoT (419-430). DOI: 10.1007/978-3-031-07012-9_36. Online publication date: 26-May-2022
