Nearly-Unsupervised Hashcode Representations for Biomedical Relation Extraction

Sahil Garg; Aram Galstyan; Greg Ver Steeg; Guillermo A. Cecchi

doi:10.18653/v1/D19-1414

Abstract

Recently, kernelized locality sensitive hashcodes have been successfully employed as representations of natural language text, especially showing high relevance to biomedical relation extraction tasks. In this paper, we propose to optimize the hashcode representations in a nearly unsupervised manner, in which we only use data points, but not their class labels, for learning. The optimized hashcode representations are then fed to a supervised classifier following the prior work. This nearly unsupervised approach allows fine-grained optimization of each hash function, which is particularly suitable for building hashcode representations generalizing from a training set to a test set. We empirically evaluate the proposed approach for biomedical relation extraction tasks, obtaining significant accuracy improvements w.r.t. state-of-the-art supervised and semi-supervised approaches.
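
To make the pipeline in the abstract concrete, the following is a minimal sketch of kernel-based hashcodes built from unlabeled points only. It is not the authors' method and does not implement the per-function optimization the paper proposes. Everything here is an assumption made for illustration: the names `rbf_kernel`, `build_hash_functions`, and `hashcode` are hypothetical, the inputs are toy 2-D vectors rather than text, the RBF kernel is a stand-in for whatever kernel suits the data, and each hash function is a plain random split of a random reference subset.

```python
import math
import random


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) similarity between two equal-length vectors (stand-in kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def build_hash_functions(reference_points, num_bits, subset_size, seed=0):
    """Build hash functions from data points only; no class labels are used.

    Each function is a random subset of the reference points, split into
    two groups. This random split is an illustrative assumption, not the
    optimized construction described in the paper.
    """
    rng = random.Random(seed)
    functions = []
    for _ in range(num_bits):
        subset = rng.sample(reference_points, subset_size)
        half = subset_size // 2
        functions.append((subset[:half], subset[half:]))
    return functions


def hashcode(x, functions, kernel=rbf_kernel):
    """Return a binary code: bit j is 1 if x's nearest reference point
    (under the kernel) in function j lies in the first group."""
    bits = []
    for group_a, group_b in functions:
        sim_a = max(kernel(x, r) for r in group_a)
        sim_b = max(kernel(x, r) for r in group_b)
        bits.append(1 if sim_a >= sim_b else 0)
    return bits


# Toy unlabeled data: two loose clusters in 2-D (illustrative only).
data_rng = random.Random(1)
unlabeled = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(20)]
unlabeled += [(data_rng.gauss(5, 1), data_rng.gauss(5, 1)) for _ in range(20)]

hash_functions = build_hash_functions(unlabeled, num_bits=16, subset_size=6)
codes = [hashcode(x, hash_functions) for x in unlabeled]
```

Each entry of `codes` is a 16-bit feature vector. In the setup the abstract describes, vectors of this kind would then be passed, together with class labels, to a separate supervised classifier; labels play no part in building the hash functions themselves.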

Anthology ID: D19-1414
Volume: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Month: November
Year: 2019
Address: Hong Kong, China
Editors: Kentaro Inui, Jing Jiang, Vincent Ng, Xiaojun Wan
Venues: EMNLP | IJCNLP
SIG: SIGDAT
Publisher: Association for Computational Linguistics
Pages: 4026–4036
URL: https://aclanthology.org/D19-1414
DOI: 10.18653/v1/D19-1414
Cite (ACL): Sahil Garg, Aram Galstyan, Greg Ver Steeg, and Guillermo Cecchi. 2019. Nearly-Unsupervised Hashcode Representations for Biomedical Relation Extraction. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 4026–4036, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal): Nearly-Unsupervised Hashcode Representations for Biomedical Relation Extraction (Garg et al., EMNLP-IJCNLP 2019)
PDF: https://aclanthology.org/D19-1414.pdf
