DOI: 10.1145/2339530.2339751
research-article

Large-scale learning of word relatedness with constraints

Published: 12 August 2012

Abstract

Prior work on computing semantic relatedness of words focused on representing their meaning in isolation, effectively disregarding inter-word affinities. We propose a large-scale data mining approach to learning word-word relatedness, where known pairs of related words impose constraints on the learning process. We learn for each word a low-dimensional representation, which strives to maximize the likelihood of a word given the contexts in which it appears. Our method, called CLEAR, is shown to significantly outperform previously published approaches. The proposed method is based on first principles, and is generic enough to exploit diverse types of text corpora, while having the flexibility to impose constraints on the derived word similarities. We also make publicly available a new labeled dataset for evaluating word relatedness algorithms, which we believe to be the largest such dataset to date.

Supplementary Material

JPG File (306_w_talk_7.jpg)
MP4 File (306_w_talk_7.mp4)


    Published In

    KDD '12: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
    August 2012
    1616 pages
    ISBN: 9781450314626
    DOI: 10.1145/2339530
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. semantic similarity
    2. word relatedness

    Qualifiers

    • Research-article

    Conference

    KDD '12
    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

    Cited By

    • (2024) Latent Relations at Steady‐state with Associative Nets. Cognitive Science, 48:9. DOI: 10.1111/cogs.13494. Online publication date: 16-Sep-2024
    • (2024) Learning Semantic Information from Raw Audio Signal Using Both Contextual and Phonetic Representations. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (12366-12370). DOI: 10.1109/ICASSP48485.2024.10445745. Online publication date: 14-Apr-2024
    • (2024) How direct is the link between words and images? The Mental Lexicon. DOI: 10.1075/ml.22010.sha. Online publication date: 11-Jan-2024
    • (2024) Advancing language models through domain knowledge integration: a comprehensive approach to training, evaluation, and optimization of social scientific neural word embeddings. Journal of Computational Social Science, 7:2 (1753-1793). DOI: 10.1007/s42001-024-00286-3. Online publication date: 22-May-2024
    • (2024) PrimeNet: A Framework for Commonsense Knowledge Representation and Reasoning Based on Conceptual Primitives. Cognitive Computation, 16:6 (3429-3456). DOI: 10.1007/s12559-024-10345-6. Online publication date: 30-Aug-2024
    • (2024) A Fistful of Vectors: A Tool for Intrinsic Evaluation of Word Embeddings. Cognitive Computation, 16:3 (949-963). DOI: 10.1007/s12559-023-10235-3. Online publication date: 22-Jan-2024
    • (2023) Language with vision: A study on grounded word and sentence embeddings. Behavior Research Methods, 56:6 (5622-5646). DOI: 10.3758/s13428-023-02294-z. Online publication date: 19-Dec-2023
    • (2023) Algorithm for the Accelerated Calculation of Conceptual Distances in Large Knowledge Graphs. Mathematics, 11:23 (4806). DOI: 10.3390/math11234806. Online publication date: 28-Nov-2023
    • (2023) Disjointness axioms between top-level ontology concepts as a heuristic for word similarity evaluation. 2023 IEEE 35th International Conference on Tools with Artificial Intelligence (ICTAI) (538-545). DOI: 10.1109/ICTAI59109.2023.00086. Online publication date: 6-Nov-2023
    • (2023) A Causal Graph for Learning Gender Debiased Word Embedding. 2023 9th International Conference on Big Data and Information Analytics (BigDIA) (1-8). DOI: 10.1109/BigDIA60676.2023.10429107. Online publication date: 15-Dec-2023
