DOI: 10.5555/2770897.2770902

A language independent approach for aligning subject heading systems with geographic ontologies

Published: 21 September 2011

Abstract

Subject heading systems are knowledge organization tools that libraries have developed over many years. The Simple Knowledge Organization System (SKOS) provides a practical way to represent subject heading systems, and several libraries have taken the initiative to publish these systems as open linked data. Each individual subject heading describes a concept. In the majority of cases, however, one subject heading is actually a combination of several concepts, such as a topic bounded by geographical and temporal scopes. In these cases, the label of the concept contains several concepts that are not represented in structured form. This paper addresses the alignment of the geographic concepts described in subject heading systems with their corresponding concepts in geographic ontologies. Our approach first recognizes the place names in the subject headings using entity recognition techniques and then resolves those place names against a target geographic ontology. The system is based on machine learning and was designed to be language independent so that it can be applied to the many existing subject heading systems. Our approach was evaluated on a subset of the Library of Congress Subject Headings, achieving an F1 score of 93%.
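The two stages named in the abstract (recognize place names in a heading, then resolve them against a geographic ontology) can be illustrated with a minimal sketch. This is not the paper's method: the paper uses machine learning for recognition, whereas the sketch below substitutes a plain dictionary lookup so that it stays short and runnable. The gazetteer, its identifiers (`geo:0001` and so on), the function names, and the assumption that heading parts are separated by `--` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical toy gazetteer standing in for a target geographic ontology.
# The identifiers are invented for this example.
TOY_GAZETTEER = {
    "portugal": "geo:0001",
    "lisbon": "geo:0002",
    "france": "geo:0003",
}


@dataclass
class PlaceMatch:
    label: str        # place name as it appears in the heading
    ontology_id: str  # identifier in the toy ontology


def recognize_places(heading: str) -> List[str]:
    """Stage 1 (stand-in): pick out the parts of a heading that name places.

    A dictionary lookup replaces the learned entity recognizer here,
    so this stage is NOT language independent the way the paper's is.
    """
    parts = [part.strip() for part in heading.split("--")]
    return [part for part in parts if part.lower() in TOY_GAZETTEER]


def resolve_place(name: str) -> Optional[PlaceMatch]:
    """Stage 2 (stand-in): map a recognized name to an ontology entry."""
    ontology_id = TOY_GAZETTEER.get(name.lower())
    if ontology_id is None:
        return None
    return PlaceMatch(label=name, ontology_id=ontology_id)


def align_heading(heading: str) -> List[PlaceMatch]:
    """Run recognition, then resolution, on one subject heading."""
    matches = []
    for name in recognize_places(heading):
        match = resolve_place(name)
        if match is not None:
            matches.append(match)
    return matches
```

Calling `align_heading("Architecture -- Portugal -- History")` returns a single match for "Portugal", while a heading with no known place name returns an empty list. Keeping recognition and resolution as separate functions mirrors the two-step structure described in the abstract, so either stand-in could be swapped for a learned component without changing the other.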

Published In

DCMI'11: Proceedings of the 2011 International Conference on Dublin Core and Metadata Applications
September 2011
205 pages

Publisher

Dublin Core Metadata Initiative

Author Tags

1. SKOS
2. entity recognition
3. entity resolution
4. linked data
5. machine learning
6. subject headings

Qualifiers

• Article
