DOI: 10.1145/3689096.3689458
research-article

Kvasir-VQA: A Text-Image Pair GI Tract Dataset

Published: 28 October 2024

Abstract

We introduce Kvasir-VQA, an extended dataset derived from the HyperKvasir and Kvasir-Instrument datasets, augmented with question-and-answer annotations to facilitate advanced machine learning tasks in Gastrointestinal (GI) diagnostics. This dataset comprises 6,500 annotated images spanning various GI tract conditions and surgical instruments, and it supports multiple question types including yes/no, choice, location, and numerical count. The dataset is intended for applications such as image captioning, Visual Question Answering (VQA), text-based generation of synthetic medical images, object detection, and classification. Our experiments demonstrate the dataset's effectiveness in training models for three selected tasks, showcasing significant applications in medical image analysis and diagnostics. We also present evaluation metrics for each task, highlighting the usability and versatility of our dataset. The dataset and supporting artifacts are available at https://datasets.simula.no/kvasir-vqa.

Published In

VLM4Bio'24: Proceedings of the First International Workshop on Vision-Language Models for Biomedical Applications
October 2024
53 pages
ISBN:9798400712074
DOI:10.1145/3689096
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. gastrointestinal diagnostics
  2. machine learning in healthcare
  3. medical image analysis
  4. medical image captioning
  5. visual question answering (VQA)

Qualifiers

  • Research-article

Conference

MM '24: The 32nd ACM International Conference on Multimedia
October 28 - November 1, 2024
Melbourne VIC, Australia
