Application of Swin Transformer Model to Retrieve and Classify Endoscopic Images

Luu, Ngo Duc; Anh, Vo Thai

doi:10.1007/978-981-99-7666-9_13

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1950)

Included in the following conference series:

International Conference on Intelligent Systems and Data Science

Abstract

Image classification and retrieval are active topics in machine learning, particularly in computer vision, where medical image retrieval receives special attention. Many machine learning approaches have been applied to image retrieval, and ongoing developments in techniques such as Convolutional Neural Networks (CNNs) and Vision Transformers have yielded good performance. In this paper, the Swin Transformer model is used to build a specialized medical image retrieval system suited to gastric endoscopic images. The proposed technique uses the Swin Transformer model's classification process to create feature vectors by combining image patches collected from local windows, which makes it easier to compute similarity on the Kvasir dataset, which we extended with some additional images. Empirical results show that the Swin Transformer model achieves a classification accuracy of 90.5% and a mean average precision at top 20 (mAP@20) of 85% for endoscopic image retrieval.
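The retrieval and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that feature vectors have already been extracted (for example by a Swin Transformer backbone), that cosine similarity is the similarity measure (the abstract says only that similarity is calculated), and that average precision at k is normalized by the number of relevant images found in the top k, which is one of several conventions in use. All function names and the toy labels are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve(query_vec, gallery, k=20):
    """Return the labels of the k gallery items most similar to the query.

    gallery is a list of (label, feature_vector) pairs (assumed format).
    """
    ranked = sorted(
        gallery,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [label for label, _ in ranked[:k]]


def average_precision_at_k(query_label, retrieved_labels, k=20):
    """AP@k, normalized by the number of relevant hits in the top k (assumption)."""
    hits = 0
    precision_sum = 0.0
    for rank, label in enumerate(retrieved_labels[:k], start=1):
        if label == query_label:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision_at_k(queries, gallery, k=20):
    """mAP@k over a list of (label, feature_vector) queries."""
    scores = [
        average_precision_at_k(label, retrieve(vec, gallery, k), k)
        for label, vec in queries
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Toy two-dimensional "features" with two hypothetical classes, "a" and "b".
toy_gallery = [
    ("a", [1.0, 0.0]),
    ("b", [0.0, 1.0]),
    ("a", [0.9, 0.1]),
    ("b", [0.1, 0.9]),
]
toy_queries = [("a", [1.0, 0.05]), ("b", [0.05, 1.0])]
toy_map = mean_average_precision_at_k(toy_queries, toy_gallery, k=2)
```

With real data, the gallery would hold one feature vector per indexed endoscopic image and `k` would be 20 to match the reported mAP@20.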

Author information

Authors and Affiliations

Bac Lieu University, Bac Lieu, Vietnam
Ngo Duc Luu
Can Tho University, Can Tho, Vietnam
Vo Thai Anh

Corresponding author

Correspondence to Ngo Duc Luu.

Editor information

Editors and Affiliations

Can Tho University, Can Tho, Vietnam
Nguyen Thai-Nghe
Can Tho University, Can Tho, Vietnam
Thanh-Nghi Do
Mahidol University, Salaya, Thailand
Peter Haddawy

About this paper

Cite this paper

Luu, N.D., Anh, V.T. (2024). Application of Swin Transformer Model to Retrieve and Classify Endoscopic Images. In: Thai-Nghe, N., Do, TN., Haddawy, P. (eds) Intelligent Systems and Data Science. ISDS 2023. Communications in Computer and Information Science, vol 1950. Springer, Singapore. https://doi.org/10.1007/978-981-99-7666-9_13

DOI: https://doi.org/10.1007/978-981-99-7666-9_13
Published: 31 October 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7665-2
Online ISBN: 978-981-99-7666-9
eBook Packages: Computer Science, Computer Science (R0)
