DOI: 10.1145/1835449.1835672

Graphical models for text: a new paradigm for text representation and processing

Published: 19 July 2010

Abstract

Almost all text applications use the well-known vector-space model for text representation and analysis. While the vector-space model has proven to be an effective and efficient representation for mining purposes, it does not preserve information about the ordering of the words. In this paper, we introduce the concept of distance graph representations of text data. Such representations preserve distance and ordering information between the words, and provide a much richer representation of the underlying text. Because they lose much less information about word ordering, they enable knowledge discovery from text that is not possible with a pure vector-space representation. Furthermore, this representation does not require the development of new mining and management techniques, because it can also be converted into a structural version of the vector-space representation, which allows the use of all existing tools for text. In addition, existing techniques for graph and XML data can be directly leveraged with this new representation. Thus, a much wider spectrum of algorithms is available for processing this representation.
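
The abstract does not spell out how a distance graph is built, so the following is only a minimal sketch under a stated assumption: a directed edge (u, v) is counted each time word v occurs between 1 and k positions after word u, where the window size `k` is a parameter chosen for illustration. The function names `distance_graph` and `edge_features` are hypothetical and do not come from the paper. The second function illustrates one possible reading of the "structural version of the vector-space representation": each edge becomes a pseudo-term with a frequency, so bag-of-words tools could consume it.

```python
from collections import Counter


def distance_graph(tokens, k):
    """Count directed edges (u, v) where v occurs 1..k positions after u.

    Assumed definition for illustration only; the abstract gives no formula.
    """
    edges = Counter()
    for i, u in enumerate(tokens):
        # Look ahead at most k positions; slicing stops at the end of the text.
        for v in tokens[i + 1 : i + 1 + k]:
            edges[(u, v)] += 1
    return edges


def edge_features(edges):
    """Flatten edges into 'u->v' pseudo-terms with frequencies,
    a vector-space-style view of the graph (an illustrative assumption)."""
    return {f"{u}->{v}": count for (u, v), count in edges.items()}


tokens = "mary had a little lamb little lamb".split()
graph = distance_graph(tokens, k=2)
features = edge_features(graph)

# Word order matters: reversing two words yields a different graph,
# whereas a plain bag-of-words vector would be identical.
forward = distance_graph("a b".split(), k=1)
backward = distance_graph("b a".split(), k=1)
```

Here `graph[("little", "lamb")]` is 2, because "lamb" follows "little" within two positions twice, and `forward` differs from `backward` even though both texts contain the same words.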

    Published In

    SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
    July 2010
    944 pages
    ISBN: 9781450301534
    DOI: 10.1145/1835449
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tag

    1. text representations

    Qualifiers

    • Poster

    Conference

    SIGIR '10

    Acceptance Rates

    SIGIR '10 Paper Acceptance Rate 87 of 520 submissions, 17%;
    Overall Acceptance Rate 792 of 3,983 submissions, 20%

    Article Metrics

    • Downloads (last 12 months): 3
    • Downloads (last 6 weeks): 1
    Reflects downloads up to 03 Oct 2024.

    Cited By

    • (2023) Deep Learning for Natural Language Processing: A Survey. Journal of Mathematical Sciences 273:4 (533-582). DOI: 10.1007/s10958-023-06519-6. Online publication date: 26-Jun-2023
    • (2017) Hierarchical document clustering based on cosine similarity measure. 2017 1st International Conference on Intelligent Systems and Information Management (ICISIM) (153-159). DOI: 10.1109/ICISIM.2017.8122166. Online publication date: Oct-2017
    • (2013) E-VSM: Novel text representation model to capture contex-based closeness between two text documents. 2013 7th International Conference on Intelligent Systems and Control (ISCO) (345-348). DOI: 10.1109/ISCO.2013.6481176. Online publication date: Jan-2013
    • (2013) Towards graphical models for text processing. Knowledge and Information Systems 36:1 (1-21). DOI: 10.1007/s10115-012-0552-3. Online publication date: 1-Jul-2013
    • (2012) Searching by corpus with fingerprints. Proceedings of the 15th International Conference on Extending Database Technology (348-359). DOI: 10.1145/2247596.2247638. Online publication date: 27-Mar-2012
