DOI: 10.1145/2505515.2505597
Research article

Towards an enhanced and adaptable ontology by distilling and assembling online encyclopedias

Published: 27 October 2013

Abstract

In this paper, we investigate how to make better use of semantic knowledge obtained from different encyclopedia sources. We propose a framework that integrates different encyclopedias and reorganizes their information. We also use Learning to Rank models to distill more functional knowledge from the encyclopedic information, and then align that knowledge with a WordNet-like ontology. Finally, as a demonstration, we construct a Chinese semantic knowledge repository named JNet based on this framework. Experiments show that the proposed methods work well and that the three steps reinforce each other, yielding a more powerful ontology.
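
The abstract does not say which Learning to Rank model or which features are used, so the following is only a minimal sketch of the general idea, assuming a pairwise approach with a linear scoring function. The two features (cross-source agreement and a noise indicator), the relevance labels, and all function names are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def train_pairwise_ranker(items, epochs=50, lr=0.1):
    """Learn a linear scoring function from graded items.

    items: list of (feature_vector, relevance_label) tuples.
    Returns a weight vector w such that higher-labelled items
    tend to receive a higher score w . x.
    """
    dim = len(items[0][0])
    w = [0.0] * dim

    # Build preference pairs (better, worse) wherever the labels differ.
    pairs = []
    for (xa, ya), (xb, yb) in combinations(items, 2):
        if ya > yb:
            pairs.append((xa, xb))
        elif yb > ya:
            pairs.append((xb, xa))

    for _ in range(epochs):
        for better, worse in pairs:
            diff = [b - c for b, c in zip(better, worse)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Hinge-style update: push the pair apart until the margin is at least 1.
            if margin < 1.0:
                w = [wi + lr * di for wi, di in zip(w, diff)]
    return w


def score(w, x):
    """Linear score of one feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))


def rank(w, candidates):
    """Sort (name, feature_vector) candidates by descending score."""
    return sorted(candidates, key=lambda c: score(w, c[1]), reverse=True)


# Hypothetical training data: [cross_source_agreement, noise_indicator] -> relevance.
training = [
    ([1.0, 0.2], 2),  # highly useful statement
    ([0.5, 0.5], 1),  # somewhat useful
    ([0.0, 0.9], 0),  # not useful
]
w = train_pairwise_ranker(training)

# Hypothetical candidate statements extracted from encyclopedia entries.
candidates = [
    ("statement_c", [0.0, 0.9]),
    ("statement_a", [1.0, 0.2]),
    ("statement_b", [0.5, 0.5]),
]
ordered = [name for name, _ in rank(w, candidates)]
```

In a pipeline like the one the abstract describes, only the top-ranked statements would then be passed on to the ontology alignment step; where to cut the ranked list is a design choice this sketch leaves open.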

Cited By

  • (2018) Active instance matching with pairwise constraints and its application to Chinese knowledge base construction. Knowledge and Information Systems, 55:1 (171-214). DOI: 10.1007/s10115-017-1076-7. Online publication date: 1-Apr-2018
  • (2015) Adaptive Concept Resolution for document representation and its applications in text mining. Knowledge-Based Systems, 74:1 (1-13). DOI: 10.1016/j.knosys.2014.10.003. Online publication date: 1-Jan-2015

Published In

CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management
October 2013
2612 pages
ISBN:9781450322638
DOI:10.1145/2505515
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. ccd
  2. encyclopedia
  3. ontology

Qualifiers

  • Research-article

Conference

CIKM'13: 22nd ACM International Conference on Information and Knowledge Management
October 27 - November 1, 2013
San Francisco, California, USA

Acceptance Rates

CIKM '13 Paper Acceptance Rate: 143 of 848 submissions, 17%
Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
