DOI: 10.1145/1242572.1242778

Altering document term vectors for classification: ontologies as expectations of co-occurrence

Published: 08 May 2007

    Abstract

    In this paper we extend the state of the art in utilizing background knowledge for supervised classification by exploiting the semantic relationships between terms made explicit in ontologies. Preliminary evaluations indicate that the new approach generally improves precision and recall, more so for hard-to-classify cases, and reveals patterns indicating the usefulness of such background knowledge.

        Published In

        WWW '07: Proceedings of the 16th international conference on World Wide Web
        May 2007
        1382 pages
        ISBN: 9781595936547
        DOI: 10.1145/1242572

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Author Tags

        1. background domain knowledge
        2. ranking semantic relationships
        3. supervised document classification
        4. vector space models

        Qualifiers

        • Article

        Conference

        WWW'07: 16th International World Wide Web Conference
        May 8 - 12, 2007
        Banff, Alberta, Canada

        Acceptance Rates

        Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

        Cited By

        • (2022) Framework for the Analysis of Resilient Performance Conditionings in Integrated Operations of the Oil and Gas Industry. Resilience in a Digital Age, pp. 71-92. DOI: 10.1007/978-3-030-85954-1_6. Online publication date: 11-Mar-2022
        • (2018) Document Clustering Using an Ontology-Based Vector Space Model. Information Retrieval and Management, pp. 1860-1883. DOI: 10.4018/978-1-5225-5191-1.ch085. Online publication date: 2018
        • (2017) Semantic enrichment of product data supported by machine learning techniques. 2017 International Conference on Engineering, Technology and Innovation (ICE/ITMC), pp. 1472-1479. DOI: 10.1109/ICE.2017.8280056. Online publication date: Jun-2017
        • (2017) Mathematical Method of Translation into Ukrainian Sign Language Based on Ontologies. Advances in Intelligent Systems and Computing II, pp. 89-100. DOI: 10.1007/978-3-319-70581-1_7. Online publication date: 22-Nov-2017
        • (2016) Improving ontology-based text classification. Journal of Applied Logic 17:C, pp. 48-58. DOI: 10.1016/j.jal.2015.09.008. Online publication date: 1-Sep-2016
        • (2015) Document Clustering Using an Ontology-Based Vector Space Model. International Journal of Information Retrieval Research 5:3, pp. 39-60. DOI: 10.4018/IJIRR.2015070103. Online publication date: Jul-2015
        • (2015) Management of Knowledge Sources Supported by Domain Ontologies. International Journal of Intelligent Systems in Accounting and Finance Management 22:1, pp. 29-64. DOI: 10.1002/isaf.1361. Online publication date: 1-Jan-2015
        • (2014) Text Classification Techniques in Oil Industry Applications. International Joint Conference SOCO’13-CISIS’13-ICEUTE’13, pp. 211-220. DOI: 10.1007/978-3-319-01854-6_22. Online publication date: 2014
        • (2013) Distributional term representations for short-text categorization. Proceedings of the 14th international conference on Computational Linguistics and Intelligent Text Processing - Volume 2, pp. 335-346. DOI: 10.1007/978-3-642-37256-8_28. Online publication date: 24-Mar-2013
        • (2009) Exploiting term relationship to boost text classification. Proceedings of the 18th ACM conference on Information and knowledge management, pp. 1637-1640. DOI: 10.1145/1645953.1646192. Online publication date: 2-Nov-2009
