DOI: 10.1145/3357384.3358167
short-paper
Open access

Approximate Definitional Constructs as Lightweight Evidence for Detecting Classes Among Wikipedia Articles

Published: 03 November 2019

Abstract

A lightweight method applies a few extraction patterns to distinguish Wikipedia articles that are classes ("Walled garden", "Garden") from other articles ("High Hazels Park"). The method acquires a set of classes using patterns that target phrases likely to refer either to concepts being introduced or defined ("a *walled garden* is a garden [.]") or to concepts used to introduce or define other concepts ("a walled garden is a *garden* [.]"). Over multiple evaluation sets, experimental results are better when relying on defined phrases alone than on defining phrases alone, and improve further when complementary evidence from both is combined.
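
The abstract does not list the actual extraction patterns, so the following is only a minimal sketch of the general idea, assuming a single hypothetical pattern of the form "a/an X is a/an Y". The regular expression, the small boundary-word list, the function names `collect_evidence` and `is_class`, and the sample sentences are all illustrative assumptions, not the paper's method. The sketch collects defined phrases (X) and defining phrases (Y), then treats an article title as a class if it appears in either set.

```python
import re
from collections import Counter

# Hypothetical pattern (an assumption, not taken from the paper):
# "a/an <defined phrase> is a/an <defining phrase>", where the defining
# phrase ends at a small set of boundary words, punctuation, or end of text.
PATTERN = re.compile(
    r"\ban?\s+(?P<defined>[a-z]+(?:\s[a-z]+)*?)\s+is\s+an?\s+"
    r"(?P<defining>[a-z]+(?:\s[a-z]+)*?)"
    r"(?=\s+(?:that|which|with|of|in|for)\b|\s*[.,;]|\s*$)",
    re.IGNORECASE,
)


def collect_evidence(sentences):
    """Count how often each phrase occurs as a defined or a defining phrase."""
    defined, defining = Counter(), Counter()
    for sentence in sentences:
        for match in PATTERN.finditer(sentence):
            defined[match.group("defined").lower()] += 1
            defining[match.group("defining").lower()] += 1
    return defined, defining


def is_class(title, defined, defining):
    """Combine both kinds of evidence: either one marks the title as a class."""
    key = title.lower()
    return defined[key] > 0 or defining[key] > 0


# Illustrative sentences (made up for this sketch).
sentences = [
    "A walled garden is a garden that is enclosed by high walls.",
    "A garden is a planned space for the display of plants.",
    "High Hazels Park is a park in Sheffield.",
]
defined, defining = collect_evidence(sentences)
for title in ["Walled garden", "Garden", "High Hazels Park"]:
    print(title, is_class(title, defined, defining))
```

In this toy setting "Garden" is supported by both kinds of evidence, while "High Hazels Park" matches neither because its sentence does not open with an indefinite article. A real system would need broader patterns and a way to weigh noisy matches.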


Published In

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management
November 2019
3373 pages
ISBN: 9781450369763
DOI: 10.1145/3357384
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. classes
  2. knowledge acquisition
  3. open-domain information extraction
  4. semantics
  5. topic classification

Qualifiers

  • Short-paper

Conference

CIKM '19
Acceptance Rates

CIKM '19 Paper Acceptance Rate 202 of 1,031 submissions, 20%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

Article Metrics

  • Total Citations: 0
  • Total Downloads: 363
  • Downloads (Last 12 months): 101
  • Downloads (Last 6 weeks): 21

Reflects downloads up to 07 Mar 2025
