Gaussian Word Embedding with a Wasserstein Distance Loss

Sun, Chi; Yan, Hang; Qiu, Xipeng; Huang, Xuanjing

arXiv:1808.07016 (cs)

[Submitted on 21 Aug 2018 (v1), last revised 1 Sep 2018 (this version, v7)]

Abstract: Compared with word embeddings based on point representations, distribution-based word embeddings are more flexible in expressing uncertainty and can therefore encode richer semantic information when representing words. The Wasserstein distance provides a natural notion of dissimilarity between probability measures and has a closed-form solution when measuring the distance between two Gaussian distributions. Therefore, with the aim of representing words efficiently, we propose to train a Gaussian word embedding model with a loss function based on the Wasserstein distance. In addition, external information from ConceptNet is used to semi-supervise the Gaussian word embeddings. To test our hypothesis, we evaluate on thirteen datasets from the word similarity task, one from the word entailment task, and six from the downstream document classification task.
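The closed form mentioned in the abstract can be illustrated with a small sketch. For two Gaussians N(mu1, S1) and N(mu2, S2), the squared 2-Wasserstein distance is ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S2^{1/2} S1 S2^{1/2})^{1/2}); when both covariances are diagonal this reduces to ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2, where sigma holds the per-dimension standard deviations. The code below assumes diagonal covariances, and the function name `w2_squared_diag` is hypothetical. It shows only the distance itself, not the loss function or training procedure used in the paper, which this page does not specify.

```python
import numpy as np

def w2_squared_diag(mu1, sigma1, mu2, sigma2):
    """Squared 2-Wasserstein distance between two diagonal Gaussians.

    mu*: mean vectors; sigma*: per-dimension standard deviations
    (not variances). Diagonal covariance is an assumption made here
    for illustration.
    """
    mu1, sigma1, mu2, sigma2 = (
        np.asarray(a, dtype=float) for a in (mu1, sigma1, mu2, sigma2)
    )
    # Distance between the means plus distance between the standard deviations.
    mean_term = np.sum((mu1 - mu2) ** 2)
    spread_term = np.sum((sigma1 - sigma2) ** 2)
    return float(mean_term + spread_term)

# Two hypothetical 2-dimensional "word" distributions.
d = w2_squared_diag([0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [1.0, 1.0])
print(d)
```

Unlike the KL divergence, this quantity is symmetric and stays finite even when one distribution collapses to a point (sigma = 0), which is one reason it is convenient as a dissimilarity between Gaussian embeddings.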

Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1808.07016 [cs.CL]
	(or arXiv:1808.07016v7 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1808.07016

Submission history

From: Chi Sun
[v1] Tue, 21 Aug 2018 16:59:39 UTC (152 KB)
[v2] Wed, 22 Aug 2018 14:06:14 UTC (152 KB)
[v3] Thu, 23 Aug 2018 11:15:28 UTC (152 KB)
[v4] Fri, 24 Aug 2018 13:32:51 UTC (152 KB)
[v5] Mon, 27 Aug 2018 14:03:29 UTC (152 KB)
[v6] Tue, 28 Aug 2018 06:30:09 UTC (152 KB)
[v7] Sat, 1 Sep 2018 12:20:26 UTC (153 KB)
