Rating news documents for similarity

Published: 01 July 2000

Abstract

No abstract available.

Cited By

  • (2019) Capturing the change in topical interests of personas over time. Proceedings of the Association for Information Science and Technology 56:1 (127-136). DOI: 10.1002/pra2.11. Online publication date: 18-Oct-2019
  • (2004) A graph model for E-commerce recommender systems. Journal of the American Society for Information Science and Technology 55:3 (259-274). DOI: 10.1002/asi.10372. Online publication date: 1-Feb-2004
  • (2003) Topic detection and interest tracking in a dynamic online news source. Proceedings of the 3rd ACM/IEEE-CS joint conference on Digital libraries (122-124). DOI: 10.5555/827140.827157. Online publication date: 27-May-2003

      Reviews

      Nagiza F. Samatova

      Readers of electronic news on the Web invariably seek answers to the six classic questions: who, what, where, when, why, and how. The authors address the challenge of extracting these attributes automatically. This paper describes a system that recognizes dates and names of people, locations, and organizations, and rates news items based on their similarity to user-specified attribute phrases. The recognizer works by spotting capitalized name-phrases. The classifier places them into date, person, location, or organization categories by matching them against names in category-specific dictionaries. For analysis, the authors use clean data with XML-like markup for the title, author, content, etc., unlike most Web news sources, which are full of extraneous information (e.g., advertisements). The accuracy of the classifier depends largely on the completeness of the domain-specific dictionaries, but efficiency deteriorates as they grow. It is unclear what trials were selected to achieve 90% classification accuracy and how efficient the recognizer and classifier are. To retrieve and rate news items, the authors propose a similarity measure that depends on the similarity of attribute lists. Performance is measured by precision only; a precision of over 90% is demonstrated on a small set of 78 news items. Since the attribute lists are of indeterminate length, the efficiency of calculating the similarity is unclear. Unfortunately, this research is neither placed in context with the current literature on rating structured documents nor compared with traditional keyword retrieval systems. Thus, this work is somewhat preliminary, and its applicability in the real world is questionable.
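The pipeline the review describes (spot capitalized name-phrases, classify them by dictionary lookup, rate items by attribute-list similarity, evaluate by precision) can be illustrated with a minimal sketch. This is not the authors' implementation. The dictionary contents, the function names, and the regular expression are all hypothetical, and because the review does not state the paper's similarity formula, a mean Jaccard overlap per category is used here purely as a stand-in. Date recognition is omitted.

```python
import re

# Hypothetical category dictionaries; the paper's actual dictionaries are not given.
DICTIONARIES = {
    "person": {"Kofi Annan", "Nelson Mandela"},
    "location": {"Geneva", "South Africa"},
    "organization": {"United Nations", "World Bank"},
}

# A run of one or more capitalized words, e.g. "United Nations" (assumed pattern).
NAME_PHRASE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")


def recognize(text):
    """Return every capitalized name-phrase found in the text."""
    return NAME_PHRASE.findall(text)


def classify(phrases):
    """Assign each phrase to the categories whose dictionary contains it."""
    attrs = {category: set() for category in DICTIONARIES}
    for phrase in phrases:
        for category, names in DICTIONARIES.items():
            if phrase in names:
                attrs[category].add(phrase)
    return attrs


def similarity(item_attrs, query_attrs):
    """Mean Jaccard overlap over the categories the query specifies.

    Stand-in measure: the review does not give the paper's formula.
    """
    scores = []
    for category, wanted in query_attrs.items():
        if not wanted:
            continue
        found = item_attrs.get(category, set())
        scores.append(len(found & wanted) / len(found | wanted))
    return sum(scores) / len(scores) if scores else 0.0


def precision(retrieved, relevant):
    """Fraction of retrieved items that are relevant."""
    if not retrieved:
        return 0.0
    return len(set(retrieved) & set(relevant)) / len(retrieved)


if __name__ == "__main__":
    story = "Kofi Annan addressed the United Nations in Geneva on Monday."
    item = classify(recognize(story))
    query = {"person": {"Kofi Annan"}, "location": {"Geneva", "South Africa"}}
    print(similarity(item, query))
```

The sketch also makes two of the reviewer's concerns concrete: a phrase that is missing from every dictionary (here "Monday") is silently dropped, so accuracy tracks dictionary completeness, and each lookup loops over all categories, so cost grows with the number and size of the dictionaries. A capitalized sentence-initial word such as "The" would also be absorbed into the following name-phrase, which a real recognizer would need to handle.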

      Published In

      Journal of the American Society for Information Science, Volume 51, Issue 9
      July 2000
      89 pages
      ISSN:0002-8231

      Publisher

      John Wiley & Sons, Inc.

      United States

      Qualifiers

      • Article
