Nested Semisupervised Learning for Cross-Note Abbreviation Detection in Vietnamese Clinical Texts

Chau, Vo Thi Ngoc; Phung, Nguyen Hua

doi:10.1007/978-3-031-42430-4_49

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1863)

Included in the following conference series:

Asian Conference on Intelligent Information and Database Systems

Abstract

Abbreviation detection in clinical texts is a widely studied and important task because it improves the readability and shareability of electronic medical records (EMRs). Nonetheless, work on it remains limited for low-resource languages like Vietnamese because no labeled dataset is available for the task. More development is thus needed to handle this task on Vietnamese clinical texts. In addition, abbreviations are created and used in many different note types by various groups of physicians, nurses, and other stakeholders, so abbreviation detection must cope with a wide diversity of clinical texts. At present, no existing work considers the setting where abbreviations must be detected in clinical texts of one note type but labeled clinical texts are available only for another note type. We call this challenge the cross-note abbreviation detection task. We address this task on Vietnamese clinical texts by proposing nested semisupervised learning. The resulting Nested-SSL method detects abbreviations in real Vietnamese clinical texts effectively. It builds on an existing semisupervised learning method and boosts the core semisupervised learning process with a fold-based enhancement scheme that favors the F-measure of the minority class. In an empirical evaluation with real EMRs, Nested-SSL always outperforms its base semisupervised learning method and some existing ones. Its better performance lays the foundations for effectively preprocessing Vietnamese clinical texts in other tasks on EMRs.
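
The abstract names the F-measure of the minority class as the quantity that the enhancement scheme favors. The short sketch below is not the authors' implementation; it only illustrates why that metric is informative when one class is rare. It assumes, for illustration, that tokens are labeled 1 for an abbreviation (taken here to be the minority class) and 0 otherwise. The function names and the toy labels are hypothetical.

```python
def minority_f_measure(y_true, y_pred, positive=1):
    """F-measure (F1) of the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of tokens whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (assumed): 10 tokens, 2 of which are abbreviations (label 1).
y_true = [0, 0, 0, 1, 0, 0, 0, 0, 1, 0]
all_negative = [0] * 10                        # never predicts an abbreviation
detector = [0, 0, 0, 1, 0, 1, 0, 0, 0, 0]      # one hit, one false alarm, one miss

for name, y_pred in [("all_negative", all_negative), ("detector", detector)]:
    print(name, accuracy(y_true, y_pred), minority_f_measure(y_true, y_pred))
```

Both toy predictors reach the same accuracy, yet only the second one finds any abbreviation, and only the minority-class F-measure separates them. This is one plausible reason to select models by that measure rather than by accuracy.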

Acknowledgment

This research is funded by Vietnam National University – Ho Chi Minh City (VNU-HCM) under grant number C2022-20-11.

In addition, our sincere thanks go to Dr. Nguyen Thi Minh Huyen and her team at University of Science, Vietnam National University, Hanoi, Vietnam, for the helpful resources. We also thank the providers of the Vietnamese electronic medical records very much.

Author information

Authors and Affiliations

Ho Chi Minh City University of Technology, Vietnam National University – Ho Chi Minh City, Ho Chi Minh City, Vietnam
Vo Thi Ngoc Chau & Nguyen Hua Phung

Corresponding authors

Correspondence to Vo Thi Ngoc Chau or Nguyen Hua Phung.

Editor information

Editors and Affiliations

Wrocław University of Technology, Wrocław, Poland
Ngoc Thanh Nguyen
King Mongkut's Institute of Technology Ladkrabang, Bangkok, Thailand
Siridech Boonsang
Iwate Prefectural University, Iwate, Japan
Hamido Fujita
Wrocław University of Science and Technology, Wrocław, Poland
Bogumiła Hnatkowska
National University of Kaohsiung, Kaohsiung, Taiwan
Tzung-Pei Hong
King Mongkut's Institute of Technology, Ladkrabang, Thailand
Kitsuchart Pasupa
Malaysia Japan International Institute of Technology, Kuala Lumpur, Malaysia
Ali Selamat

Cite this paper

Chau, V.T.N., Phung, N.H. (2023). Nested Semisupervised Learning for Cross-Note Abbreviation Detection in Vietnamese Clinical Texts. In: Nguyen, N.T., et al. Recent Challenges in Intelligent Information and Database Systems. ACIIDS 2023. Communications in Computer and Information Science, vol 1863. Springer, Cham. https://doi.org/10.1007/978-3-031-42430-4_49

DOI: https://doi.org/10.1007/978-3-031-42430-4_49
Published: 29 September 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-42429-8
Online ISBN: 978-3-031-42430-4
eBook Packages: Computer Science, Computer Science (R0)
