DOI: 10.1145/1871437.1871446
research-article

Online annotation of text streams with structured entities

Published: 26 October 2010

Abstract

We propose a framework and an algorithm for annotating unbounded text streams with entities from a structured database. The algorithm correlates unstructured, dirty text streams from sources such as emails, chats, and blogs with entities stored in structured databases. In contrast to previous work on entity extraction, our emphasis is on performing entity annotation in a completely online fashion. The algorithm continuously extracts important phrases and assigns the top-k relevant entities to each of them. It does so with guaranteed constant time and space complexity for each additional word in the text stream, so unbounded text streams can be annotated. The framework lets the online annotation algorithm adapt to a changing stream rate by self-adjusting multiple run-time parameters, reducing annotation quality for fast streams and improving it for slow ones. The framework also lets the algorithm incorporate query feedback to learn user preferences and personalize the annotation for individual users.
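
As a rough illustration of what per-word online annotation can look like, the sketch below keeps only a fixed-size window of recent words and, for each new word, ranks the database entities whose names share tokens with that window. It is not the algorithm from the paper: the names `build_index` and `annotate_stream`, the token-overlap score, and the `window` and `k` parameters are all assumptions made for this example. The point it tries to show is that memory is bounded by the window size and that the work per incoming word does not grow with the length of the stream.

```python
from collections import deque
import heapq


def build_index(entities):
    """Map each lowercase token to the ids of entities whose name contains it."""
    index = {}
    for eid, name in entities.items():
        for tok in name.lower().split():
            index.setdefault(tok, set()).add(eid)
    return index


def annotate_stream(words, entities, window=3, k=2):
    """Yield (position, phrase, top-k entities) as each word arrives.

    Hypothetical sketch: the score is the number of distinct window tokens
    that occur in the entity name. Only the last `window` words are kept,
    so memory use is independent of how long the stream runs.
    """
    index = build_index(entities)
    recent = deque(maxlen=window)  # old words fall off automatically
    for pos, word in enumerate(words):
        recent.append(word.lower())
        scores = {}
        for tok in set(recent):
            for eid in index.get(tok, ()):
                scores[eid] = scores.get(eid, 0) + 1
        if scores:
            # Highest score first; ties broken by entity id for determinism.
            top = heapq.nsmallest(
                k, scores.items(), key=lambda kv: (-kv[1], kv[0])
            )
            yield pos, " ".join(recent), [(entities[e], s) for e, s in top]


if __name__ == "__main__":
    # Toy data, invented for this example.
    catalog = {1: "Apple iPhone", 2: "Apple MacBook Pro", 3: "Samsung Galaxy"}
    stream = "i bought an apple iphone yesterday".split()
    for annotation in annotate_stream(stream, catalog):
        print(annotation)
```

Because `annotate_stream` is a generator over any iterable, it can in principle be fed an endless source such as a socket reader. In this sketch, `window` and `k` are the kind of run-time parameters that could be lowered for fast streams and raised for slow ones, which is how the adaptation described in the abstract might be approximated.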

    Published In

    CIKM '10: Proceedings of the 19th ACM international conference on Information and knowledge management
    October 2010
    2036 pages
    ISBN:9781450300995
    DOI:10.1145/1871437

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. annotation
    2. entity
    3. online
    4. text stream

    Qualifiers

    • Research-article

    Conference

    CIKM '10

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Cited By

    • (2019) Deep Entity Linking via Eliminating Semantic Ambiguity With BERT. IEEE Access, 7, pages 169434-169445. DOI: 10.1109/ACCESS.2019.2955498. Online publication date: 2019
    • (2016) Graph-Based Jointly Modeling Entity Detection and Linking in Domain-Specific Area. Knowledge Graph and Semantic Computing: Semantic, Knowledge, and Linked Big Data, pages 146-159. DOI: 10.1007/978-981-10-3168-7_15. Online publication date: 23-Nov-2016
    • (2016) Domain-Specific Entity Linking via Fake Named Entity Detection. Database Systems for Advanced Applications, pages 101-116. DOI: 10.1007/978-3-319-32025-0_7. Online publication date: 25-Mar-2016
    • (2015) Entity Linking with a Knowledge Base: Issues, Techniques, and Solutions. IEEE Transactions on Knowledge and Data Engineering, 27:2, pages 443-460. DOI: 10.1109/TKDE.2014.2327028. Online publication date: 1-Feb-2015
    • (2011) Tool support for technology scouting using online sources. Proceedings of the 30th international conference on Advances in conceptual modeling: recent developments and new directions, pages 371-376. DOI: 10.5555/2075202.2075266. Online publication date: 31-Oct-2011
    • (2011) EnBlogue. Proceedings of the 2011 ACM SIGMOD International Conference on Management of data, pages 1271-1274. DOI: 10.1145/1989323.1989473. Online publication date: 12-Jun-2011
    • (2011) Tool Support for Technology Scouting Using Online Sources. Advances in Conceptual Modeling. Recent Developments and New Directions, pages 371-376. DOI: 10.1007/978-3-642-24574-9_53. Online publication date: 2011
