On the Role of Supervision in Unsupervised Constituency Parsing

Shi, Haoyue; Livescu, Karen; Gimpel, Kevin


arXiv:2010.02423v1 (cs)

[Submitted on 6 Oct 2020 (this version), latest version 7 Oct 2020 (v2)]



Abstract: We analyze several recent unsupervised constituency parsing models, which are tuned with respect to the parsing $F_1$ score on the Wall Street Journal (WSJ) development set (1,700 sentences). We introduce strong baselines for them, by training an existing supervised parsing model (Kitaev and Klein, 2018) on the same labeled examples they access. When training on the 1,700 examples, or even when using only 50 examples for training and 5 for development, such a few-shot parsing approach can outperform all the unsupervised parsing methods by a significant margin. Few-shot parsing can be further improved by a simple data augmentation method and self-training. This suggests that, in order to arrive at fair conclusions, we should carefully consider the amount of labeled data used for model development. We propose two protocols for future work on unsupervised parsing: (i) use fully unsupervised criteria for hyperparameter tuning and model selection; (ii) use as few labeled examples as possible for model development, and compare to few-shot parsing trained on the same labeled examples.
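
The parsing $F_1$ score mentioned in the abstract is commonly computed over constituent spans (brackets). The following is a minimal sketch of unlabeled bracketing $F_1$ for a single sentence, not the paper's evaluation code. The function name `bracket_f1`, the span representation, and the toy spans are assumptions made for illustration; real evaluation setups differ in details such as whether trivial spans are discarded and whether scores are averaged per sentence or over the corpus.

```python
from typing import Set, Tuple

# A span is (start, end) in token indices, end exclusive (an assumed convention).
Span = Tuple[int, int]


def bracket_f1(predicted: Set[Span], gold: Set[Span]) -> float:
    """Unlabeled bracketing F1 between two sets of constituent spans."""
    matched = len(predicted & gold)
    if matched == 0:
        # Covers empty inputs and the no-overlap case.
        return 0.0
    precision = matched / len(predicted)
    recall = matched / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy 5-token sentence; both span sets are made up for illustration.
gold_spans = {(0, 5), (0, 2), (2, 5), (3, 5)}
pred_spans = {(0, 5), (0, 2), (2, 5), (2, 4)}
score = bracket_f1(pred_spans, gold_spans)
```

Here three of the four predicted spans also appear in the gold set, so precision and recall are both 3/4. Selecting hyperparameters by this score on a labeled development set is the use of supervision the abstract calls attention to.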

Comments:	EMNLP 2020. Project page: this https URL
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:2010.02423 [cs.CL]
	(or arXiv:2010.02423v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2010.02423

Submission history

From: Haoyue Shi
[v1] Tue, 6 Oct 2020 01:34:58 UTC (7,142 KB)
[v2] Wed, 7 Oct 2020 01:38:38 UTC (7,138 KB)
