Large-scale learning of word relatedness with constraints

Authors: Evgeniy Gabrilovich, Yehuda Koren

KDD '12: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining

Pages 1406 - 1414

https://doi.org/10.1145/2339530.2339751

Published: 12 August 2012

Abstract

Prior work on computing semantic relatedness of words focused on representing their meaning in isolation, effectively disregarding inter-word affinities. We propose a large-scale data mining approach to learning word-word relatedness, where known pairs of related words impose constraints on the learning process. We learn for each word a low-dimensional representation, which strives to maximize the likelihood of a word given the contexts in which it appears. Our method, called CLEAR, is shown to significantly outperform previously published approaches. The proposed method is based on first principles, and is generic enough to exploit diverse types of text corpora, while having the flexibility to impose constraints on the derived word similarities. We also make publicly available a new labeled dataset for evaluating word relatedness algorithms, which we believe to be the largest such dataset to date.
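The abstract describes the objective only in words, so the following is a minimal sketch of the general idea rather than the CLEAR formulation itself. It assumes a softmax likelihood of a word given a context vector and a squared-distance penalty on known related pairs; the function names, the penalty form, and the `weight` parameter are all illustrative assumptions and do not come from the paper.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def log_likelihood(word_vecs, context_vecs, observations):
    """Sum of log p(word | context) under a softmax over the vocabulary.

    Assumption: p(word | context) is proportional to exp(word . context).
    """
    total = 0.0
    for word, context in observations:
        c = context_vecs[context]
        scores = {w: dot(v, c) for w, v in word_vecs.items()}
        log_z = math.log(sum(math.exp(s) for s in scores.values()))
        total += scores[word] - log_z
    return total


def constraint_penalty(word_vecs, related_pairs):
    """Squared distance between the vectors of words known to be related.

    Assumption: this is one simple way to encode a relatedness constraint.
    """
    return sum(
        sum((a - b) ** 2 for a, b in zip(word_vecs[u], word_vecs[v]))
        for u, v in related_pairs
    )


def objective(word_vecs, context_vecs, observations, related_pairs, weight=1.0):
    """Likelihood term minus a weighted constraint penalty (to be maximized)."""
    return (log_likelihood(word_vecs, context_vecs, observations)
            - weight * constraint_penalty(word_vecs, related_pairs))
```

In this sketch, a learner would adjust the low-dimensional vectors to increase `objective`, so that words fit their contexts while known related pairs are kept close together.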

Supplementary Material

JPG File (306_w_talk_7.jpg), 17.19 KB

MP4 File (306_w_talk_7.mp4), 344.43 MB

Index Terms

Large-scale learning of word relatedness with constraints
1. Information systems
  1. Information systems applications
    1. Data mining


Published In

KDD '12: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining

August 2012

1616 pages

ISBN:9781450314626

DOI:10.1145/2339530

General Chair: Qiang Yang (Hong Kong University of Science and Technology)

Program Chairs: Deepak Agarwal (LinkedIn), Jian Pei (Simon Fraser University)

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

KDD '12

KDD '12: The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 12 - 16, 2012

Beijing, China

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%


