Comparison of machine learning methods for classifying mediastinal lymph node metastasis of non-small cell lung cancer from ¹⁸F-FDG PET/CT images

Hongkai Wang; Zongwei Zhou; Yingci Li; Zhonghua Chen; Peiou Lu; Wenzhi Wang; Wanyu Liu; Lijuan Yu

doi:10.1186/s13550-017-0260-9

EJNMMI Res. 2017 Dec;7(1):11. doi: 10.1186/s13550-017-0260-9. Epub 2017 Jan 28.

Authors

Hongkai Wang¹, Zongwei Zhou², Yingci Li³, Zhonghua Chen¹, Peiou Lu³, Wenzhi Wang³, Wanyu Liu⁴, Lijuan Yu⁵

Affiliations

¹ Department of Biomedical Engineering, Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, No. 2 Linggong Street, Ganjingzi District, Dalian, Liaoning, 116024, China.
² Department of Biomedical Informatics and the College of Health Solutions, Arizona State University, 13212 East Shea Boulevard, Scottsdale, AZ, 85259, USA.
³ Center of PET/CT, The Affiliated Tumor Hospital of Harbin Medical University, 150 Haping Road, Nangang District, Harbin, Heilongjiang Province, 150081, China.
⁴ HIT-INSA Sino French Research Centre for Biomedical Imaging, Harbin Institute of Technology, Harbin, Heilongjiang, 150001, China.
⁵ Center of PET/CT, The Affiliated Tumor Hospital of Harbin Medical University, 150 Haping Road, Nangang District, Harbin, Heilongjiang Province, 150081, China. yulijuan2003@126.com.

Abstract

Background: This study aimed to compare one state-of-the-art deep learning method and four classical machine learning methods for classifying mediastinal lymph node metastasis of non-small cell lung cancer (NSCLC) from ¹⁸F-FDG PET/CT images. A second objective was to compare the discriminative power of the recently popular PET/CT texture features with that of widely used diagnostic features such as tumor size, CT value, SUV, image contrast, and intensity standard deviation. The four classical machine learning methods were random forests, support vector machines, adaptive boosting, and artificial neural networks. The deep learning method was a convolutional neural network (CNN). The five methods were evaluated on 1397 lymph nodes collected from PET/CT images of 168 patients, with the corresponding pathology results as the gold standard. The comparison used ten repetitions of 10-fold cross-validation and four criteria: sensitivity, specificity, accuracy (ACC), and area under the ROC curve (AUC). For each classical method, different input features were compared to select the optimal feature set. Using the optimal feature set, the classical methods were then compared with the CNN and with human doctors from our institute.
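
The evaluation protocol described above (ten repetitions of 10-fold cross-validation) can be sketched as follows. This is a minimal illustration, not the authors' code: the abstract does not say whether folds were stratified by class or grouped by patient, so the stratification here is an assumption, and the function name `repeated_stratified_kfold` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def repeated_stratified_kfold(labels, n_splits=10, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) tuples.

    Assumption: folds are stratified by class label. The source abstract
    does not specify how the folds were formed.
    """
    rng = random.Random(seed)

    # Group sample indices by class so each fold keeps the class ratio.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    for repeat in range(n_repeats):
        folds = [[] for _ in range(n_splits)]
        for indices in by_class.values():
            shuffled = indices[:]
            rng.shuffle(shuffled)  # a fresh shuffle for every repetition
            # Deal the shuffled indices round-robin across the folds.
            for position, index in enumerate(shuffled):
                folds[position % n_splits].append(index)

        for fold in range(n_splits):
            test_idx = sorted(folds[fold])
            train_idx = sorted(
                i for other in range(n_splits) if other != fold
                for i in folds[other]
            )
            yield repeat, fold, train_idx, test_idx
```

Each of the 100 resulting train/test splits would be used to fit a classifier on the training indices and score it on the held-out indices. The per-split scores are then averaged.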

Results: For the classical methods, the diagnostic features yielded 81 to 85% ACC and 0.87 to 0.92 AUC, significantly higher than the results obtained with texture features. The CNN's sensitivity, specificity, ACC, and AUC were 84%, 88%, 86%, and 0.91, respectively. There was no significant difference between the results of the CNN and those of the best classical method. The sensitivity, specificity, and ACC of human doctors were 73%, 90%, and 82%, respectively. All five machine learning methods had higher sensitivities but lower specificities than the human doctors.
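
For reference, the four reported metrics follow their standard definitions and can be computed as in the sketch below. The function names, the 0/1 label encoding (1 = metastatic), and the pairwise formulation of AUC are illustrative choices, not details taken from the study.

```python
def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy from binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "accuracy": (tp + tn) / len(pairs),
    }


def auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive scores higher than a random negative (ties count as one half)."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```

Sensitivity, specificity, and accuracy depend on the decision threshold applied to the classifier output, whereas AUC summarizes performance across all thresholds. This is why the abstract reports no AUC for the human doctors, who give only binary decisions.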

Conclusions: The present study shows that the performance of the CNN is not significantly different from that of the best classical methods and human doctors for classifying mediastinal lymph node metastasis of NSCLC from PET/CT images. Because the CNN does not need tumor segmentation or feature calculation, it is more convenient and more objective than the classical methods. However, the CNN does not make use of the important diagnostic features, which proved more discriminative than the texture features for classifying small lymph nodes. Therefore, incorporating the diagnostic features into the CNN is a promising direction for future research.

Keywords: Computer-aided diagnosis; Deep learning; Machine learning; Non-small cell lung cancer; Positron-emission tomography.