Abstract
By mining rich semantic information from large-scale unlabeled text and incorporating it into pre-trained models, BERT and RoBERTa have achieved impressive performance on many natural language processing tasks. However, these pre-trained models rely on task-specific fine-tuning, and the embeddings produced by native BERT or RoBERTa are poorly suited to Semantic Textual Similarity (STS) tasks.
In this paper, we present CoSBERT, a cosine-based siamese BERT network modified from pre-trained BERT or RoBERTa models to derive semantically meaningful embeddings. Its key feature is that it directly optimizes the cosine similarity between the embeddings of the input texts during training, which improves both the efficiency and the accuracy of STS computation at prediction time. Experiments on multiple STS tasks verify the effectiveness of CoSBERT. In addition, deploying CoSBERT in the SCHOLAT user recommendation system has improved the system's efficiency and accuracy.
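Since the abstract gives only a high-level description of the architecture, the following is a minimal PyTorch sketch of the general idea rather than the authors' implementation: a shared BERT encoder embeds both input texts, training directly optimizes the cosine similarity of the two embeddings against a gold similarity score (here with a simple MSE regression loss as a stand-in for the paper's actual objective), and prediction reduces to a single cosine computation over precomputable embeddings. The model name, the mean pooling, and the helper names embed and train_step are illustrative assumptions.

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # illustrative; any BERT/RoBERTa checkpoint works

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

def embed(texts):
    # Mean-pool the token states of each text into one sentence embedding.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state           # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # (B, T, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)   # masked mean pooling

# Training: both texts pass through the *same* encoder (siamese weights),
# and the loss pushes the predicted cosine similarity toward the gold score.
optimizer = torch.optim.AdamW(encoder.parameters(), lr=2e-5)

def train_step(texts_a, texts_b, gold_scores):
    encoder.train()
    cos = F.cosine_similarity(embed(texts_a), embed(texts_b))  # values in [-1, 1]
    loss = F.mse_loss(cos, torch.tensor(gold_scores))          # regression to gold
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Prediction: embeddings can be precomputed and cached, so scoring a pair
# reduces to a single cosine computation, which is what makes STS efficient.
encoder.eval()
with torch.no_grad():
    score = F.cosine_similarity(embed(["a new method"]), embed(["a novel approach"]))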
Acknowledgment
This work was supported in part by the National Natural Science Foundation of China under Grant U1811263.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Yu, W., Weng, Y., Lin, R., Tang, Y. (2023). CoSBERT: A Cosine-Based Siamese BERT-Networks Using for Semantic Textual Similarity. In: Sun, Y., et al. Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2022. Communications in Computer and Information Science, vol 1681. Springer, Singapore. https://doi.org/10.1007/978-981-99-2356-4_30
Print ISBN: 978-981-99-2355-7
Online ISBN: 978-981-99-2356-4