research-article

Multi-Class Latent Concept Pooling for Computer-Aided Endoscopy Diagnosis

Authors:

Haibin Yu

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 13, Issue 2

Article No.: 15, Pages 1 - 18

https://doi.org/10.1145/3051481

Published: 21 March 2017

Abstract

Successful computer-aided diagnosis systems typically rely on training datasets that contain enough richly annotated images. However, detailed image annotation is often time-consuming and subjective, especially for medical images, and this becomes the bottleneck both for collecting large datasets and for building computer-aided diagnosis systems on them. In this article, we design a novel computer-aided endoscopy diagnosis system for the multi-class classification of electronic endoscopy medical records (EEMRs), each of which contains a set of frames. The EEMR labels are mined from the corresponding text records with an automatic text-matching strategy, so no dedicated human labeling is required. Because the EEMR labels are unambiguous but the frame labels are ambiguous, we propose a simple but effective pooling scheme called Multi-class Latent Concept Pooling, which learns a codebook from EEMRs of different classes step by step and encodes each EEMR with a soft weighting strategy. With our method, a computer-aided diagnosis system can easily be extended to new, unseen classes and can be applied to the standard single-instance classification problem even when images with detailed annotations are unavailable. To validate our system, we collect 1,889 EEMRs with more than 59K frames and successfully mine labels for 348 of them. The experimental results show that our proposed system significantly outperforms the state-of-the-art methods. Moreover, we apply the learned latent concept codebook to detect abnormalities in endoscopy images and compare it with a supervised learning classifier; the evaluation shows that our codebook learning method can effectively extract the true prototypes of the different classes from the ambiguous data.
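
As a rough illustration of what encoding a record over a codebook with soft weights can look like, the following is a minimal sketch and not the formulation used in the article. The softmax over negative squared distances, the averaging across frames, the `beta` sharpness parameter, and every name in the code (`soft_weights`, `pool_record`, the toy codebook and frames) are assumptions made for this example only.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def soft_weights(feature: Vector, codebook: Sequence[Vector], beta: float = 1.0) -> List[float]:
    """Softmax over negative squared distances from one feature to each codeword."""
    d2 = [sum((f - c) ** 2 for f, c in zip(feature, word)) for word in codebook]
    d_min = min(d2)  # shift by the minimum distance for numerical stability
    exps = [math.exp(-beta * (d - d_min)) for d in d2]
    total = sum(exps)
    return [e / total for e in exps]


def pool_record(frames: Sequence[Vector], codebook: Sequence[Vector], beta: float = 1.0) -> List[float]:
    """Average the per-frame soft weights to get one fixed-length vector per record."""
    if not frames:
        raise ValueError("record has no frames")
    pooled = [0.0] * len(codebook)
    for frame in frames:
        for k, w in enumerate(soft_weights(frame, codebook, beta)):
            pooled[k] += w
    return [p / len(frames) for p in pooled]


# Hypothetical toy data: three 2-D codewords and one record with three frame features.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
record = [(0.1, 0.0), (0.9, 1.1), (0.0, 0.2)]
encoding = pool_record(record, codebook, beta=2.0)
```

In this sketch a record of any length maps to a vector with one entry per codeword, so records with different frame counts become comparable inputs for a standard classifier. Under the same assumptions, support for a new class could be added by appending that class's codewords to `codebook`, which lengthens the encoding without changing the existing entries' meaning.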

Cited By

Wang S, Cong Y, Zhu H, Chen X, Qu L, Fan H, Zhang Q, Liu M. (2021). Multi-Scale Context-Guided Deep Network for Automated Lesion Segmentation With Endoscopy Images of Gastrointestinal Tract. IEEE Journal of Biomedical and Health Informatics 25:2 (514-525). Online publication date: Feb-2021.
https://doi.org/10.1109/JBHI.2020.2997760

Index Terms

Multi-Class Latent Concept Pooling for Computer-Aided Endoscopy Diagnosis
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision representations
        Image representations
  2. Machine learning
    1. Machine learning approaches
      1. Classification and regression trees
      2. Learning latent representations

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 13, Issue 2

May 2017

226 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/3058792

Editor:
Alberto Del Bimbo
University of Firenze, Italy

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 21 March 2017

Accepted: 01 January 2017

Revised: 01 December 2016

Received: 01 October 2016

Published in TOMM Volume 13, Issue 2

Qualifiers

Research-article
Research
Refereed

Funding Sources

NSFC
