D2S: Document-to-sentence framework for novelty detection

Published: 01 November 2011

Abstract

Novelty detection aims to identify novel information in an incoming stream of documents. In this paper, we propose a new framework for document-level novelty detection using document-to-sentence (D2S) annotations and discuss the applicability of this method. D2S first segments a document into sentences, then determines the novelty of each sentence, and finally computes the document-level novelty score based on a fixed threshold. Experimental results on APWSJ data show that D2S outperforms standard document-level novelty detection in terms of redundancy-precision (RP) and redundancy-recall (RR). We also applied D2S to the document-level data from the TREC 2003 and TREC 2004 Novelty Tracks and found that D2S is useful in detecting novel information in data with a high percentage of novel documents. However, D2S shows a strong capability to detect redundant information regardless of the percentage of novel documents. D2S has been successfully integrated into a real-world novelty detection system.
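
The three D2S steps named in the abstract (segment, score each sentence, aggregate against a fixed threshold) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify the sentence splitter, the sentence-level novelty measure, or the aggregation rule, so the regex splitter, the Jaccard word-overlap test, the "fraction of novel sentences" score, and every function name and threshold value below are assumptions.

```python
import re
from typing import List, Set


def split_sentences(text: str) -> List[str]:
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokens(sentence: str) -> Set[str]:
    """Lowercased set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def sentence_is_novel(sentence: str, history: List[Set[str]],
                      sim_threshold: float = 0.5) -> bool:
    """Assumed sentence-level test: novel if no previously seen sentence
    reaches the Jaccard similarity threshold."""
    words = tokens(sentence)
    if not words:
        return False
    for past in history:
        if len(words & past) / len(words | past) >= sim_threshold:
            return False
    return True


def d2s_novelty_score(document: str, history: List[Set[str]],
                      sim_threshold: float = 0.5) -> float:
    """Assumed document-level score: fraction of sentences judged novel."""
    sentences = split_sentences(document)
    if not sentences:
        return 0.0
    flags = [sentence_is_novel(s, history, sim_threshold) for s in sentences]
    return sum(flags) / len(flags)


def process_stream(documents: List[str], doc_threshold: float = 0.5) -> List[bool]:
    """Label each incoming document as novel (True) or redundant (False),
    then add its sentences to the history of seen sentences."""
    history: List[Set[str]] = []
    labels = []
    for doc in documents:
        labels.append(d2s_novelty_score(doc, history) >= doc_threshold)
        history.extend(tokens(s) for s in split_sentences(doc))
    return labels


stream = [
    "The cat sat on the mat. Dogs bark loudly.",
    "The cat sat on the mat. Dogs bark loudly.",
    "The cat sat on the mat. Stocks fell sharply today.",
]
labels = process_stream(stream)
```

In this toy stream the second document repeats the first verbatim and is labeled redundant, while the third has one of its two sentences unseen and so just reaches the assumed document threshold of 0.5. Comparing sentences only against earlier documents, rather than against other sentences in the same document, is likewise a simplification chosen for the sketch.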

Published In

Knowledge and Information Systems, Volume 29, Issue 2
November 2011
242 pages

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. Document novelty
  2. Novelty dataset
  3. Redundancy
  4. Sentence segmentation
  5. Text mining

Qualifiers

  • Article

Cited By

  • (2024) Predicting document novelty: an unsupervised learning approach. Knowledge and Information Systems 66:3 (1709-1728). DOI: 10.1007/s10115-023-01989-1. Online publication date: 1-Mar-2024
  • (2017) Personalized News Article Recommendation with Novelty Using Collaborative Filtering Based Rough Set Theory. Mobile Networks and Applications 22:4 (719-729). DOI: 10.1007/s11036-017-0842-9. Online publication date: 1-Aug-2017
  • (2014) Second order probabilistic models for within-document novelty detection in academic articles. Proceedings of the 37th international ACM SIGIR conference on Research & development in information retrieval (1103-1106). DOI: 10.1145/2600428.2609520. Online publication date: 3-Jul-2014
  • (2014) Clustering web documents using hierarchical representation with multi-granularity. World Wide Web 17:1 (105-126). DOI: 10.1007/s11280-012-0197-x. Online publication date: 1-Jan-2014
  • (2013) A segment-based approach to clustering multi-topic documents. Knowledge and Information Systems 34:3 (563-595). DOI: 10.1007/s10115-012-0556-z. Online publication date: 1-Mar-2013
  • (2012) Modeling collective blogging dynamics of popular incidental topics. Knowledge and Information Systems 31:2 (371-387). DOI: 10.5555/3225628.3225727. Online publication date: 1-May-2012
  • (2012) A data-centric approach to feed search in blogs. International Journal of Web Engineering and Technology 7:3 (228-249). DOI: 10.1504/IJWET.2012.048519. Online publication date: 1-Aug-2012
  • (2011) Chinese categorization and novelty mining. Proceedings of the 15th Pacific-Asia conference on Advances in knowledge discovery and data mining - Volume Part II (284-295). DOI: 10.5555/2022850.2022874. Online publication date: 24-May-2011
  • (2011) Multilingual novelty detection. Expert Systems with Applications: An International Journal 38:1 (652-658). DOI: 10.1016/j.eswa.2010.07.016. Online publication date: 1-Jan-2011
