Abstract
By mining rich semantic information from large-scale unlabeled text and incorporating it into pre-trained models, BERT and RoBERTa have achieved impressive performance on many natural language processing tasks. However, these pre-trained models rely on task-specific fine-tuning, and the embeddings produced by native BERT or RoBERTa are poorly suited to Semantic Textual Similarity (STS) tasks.
In this paper, we present CoSBERT, a cosine-based siamese BERT network modified from pre-trained BERT or RoBERTa models to derive semantically meaningful embeddings. Its key feature is that it directly optimizes the cosine similarity between the embeddings of the input texts during training, which improves both the efficiency and the accuracy of STS computation at prediction time. Experiments on multiple STS tasks verify the effectiveness of CoSBERT. In addition, deploying CoSBERT in the SCHOLAT user recommendation system has improved the system's efficiency and accuracy.
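Since the abstract gives only a high-level description of the architecture, the following is a minimal PyTorch sketch of the general idea rather than the authors' implementation: a shared BERT encoder embeds both input texts, training directly optimizes the cosine similarity of the two embeddings against a gold similarity score (here with a simple MSE regression loss as a stand-in for the paper's actual objective), and prediction reduces to a single cosine computation over precomputable embeddings. The model name, the mean pooling, and the helper names embed and train_step are illustrative assumptions.

import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # illustrative; any BERT/RoBERTa checkpoint works

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

def embed(texts):
    # Mean-pool the token states of each text into one sentence embedding.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    hidden = encoder(**batch).last_hidden_state           # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()  # (B, T, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)   # masked mean pooling

# Training: both texts pass through the *same* encoder (siamese weights),
# and the loss pushes the predicted cosine similarity toward the gold score.
optimizer = torch.optim.AdamW(encoder.parameters(), lr=2e-5)

def train_step(texts_a, texts_b, gold_scores):
    encoder.train()
    cos = F.cosine_similarity(embed(texts_a), embed(texts_b))  # values in [-1, 1]
    loss = F.mse_loss(cos, torch.tensor(gold_scores))          # regression to gold
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Prediction: embeddings can be precomputed and cached, so scoring a pair
# reduces to a single cosine computation, which is what makes STS efficient.
encoder.eval()
with torch.no_grad():
    score = F.cosine_similarity(embed(["a new method"]), embed(["a novel approach"]))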
Acknowledgment
This work was supported in part by the National Natural Science Foundation of China under Grant U1811263.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Yu, W., Weng, Y., Lin, R., Tang, Y. (2023). CoSBERT: A Cosine-Based Siamese BERT-Networks Using for Semantic Textual Similarity. In: Sun, Y., et al. Computer Supported Cooperative Work and Social Computing. ChineseCSCW 2022. Communications in Computer and Information Science, vol 1681. Springer, Singapore. https://doi.org/10.1007/978-981-99-2356-4_30
Print ISBN: 978-981-99-2355-7
Online ISBN: 978-981-99-2356-4