Abstract
Dysplasia is a common pre-cancerous abnormality that can be categorized as mild, moderate or severe. With the advance of digital systems applied in microscopes for histological analysis, specialists can obtain data that allow investigation using computational algorithms. These systems are known as computer-aided diagnosis (CAD) systems, which provide quantitative analysis of large amounts of data and features. This work proposes a method for nuclei segmentation in histopathological images of oral dysplasia based on an artificial neural network model and a post-processing stage. The method employs nuclei masks for training, where objects and bounding boxes are evaluated. In the post-processing step, false positive areas are removed by applying morphological operations, such as dilation and erosion. The approach was applied to a dataset of 296 regions of mice tongue images. The metrics accuracy, sensitivity, specificity, Dice coefficient and correspondence ratio were employed for evaluation and comparison with other methods present in the literature. The results show that the method was able to segment the images with an average accuracy of \(89.52 \pm 0.04\%\) and a Dice coefficient of \(84.03 \pm 0.06\%\). These values indicate that the proposed method can be applied as a tool for nuclei analysis in oral cavity images with precision relevant to the specialist.
1 Introduction
Dysplasia is characterized by alterations in cell features such as size, shape and brightness intensity. In developing countries, this anomaly is a common type of pre-cancerous lesion that can be classified as mild, moderate or severe [5]. Cancer is the second most common cause of death, and severe dysplasias have a 3% to 36% chance of progressing to this type of malignant lesion [12]. Usually, the diagnosis of these lesions is performed by analysing the size of the lesion and the intensity of morphological alterations in tissue nuclei. However, lesions of different sizes may have similar levels of nuclear alterations [12]. Thus, pathologists may assign different levels of dysplasia to lesions with similar levels of alterations [17]. Moreover, there are several malignant lesions that are identified only by the intensity of nuclei alterations [12].
With the advance of digital systems applied in microscopes for histological analysis, specialists can obtain data that allow investigation using computational algorithms. The development of digital analysis tools can assist pathologists in decision making as well as reduce the error rate caused by subjectivity [3]. These systems are known as computer-aided diagnosis (CAD) systems, which provide quantitative analysis of large amounts of data and features [7]. Such a system comprises a set of steps ranging from signal-to-noise ratio improvement through segmentation and feature extraction to data classification. Segmentation is an important step that allows the identification of the objects that will be analyzed by feature descriptors and classified in subsequent steps [7]. This stage is mainly applied through discontinuity or similarity of pixel brightness intensity. Discontinuity is based on the detection of abrupt variations of pixel intensity to delimit region borders. The goal of similarity-based segmentation is to divide an image into regions with similar features based on established criteria [7].
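As an illustration of the similarity-based approach, a global threshold can group pixels by brightness intensity. The sketch below uses hypothetical pixel values and a hand-picked threshold, not values from this paper's pipeline:

```python
import numpy as np

def threshold_segment(image, t):
    """Similarity-based segmentation: group pixels by comparing
    their brightness intensity against a global threshold t."""
    return (image > t).astype(np.uint8)

# Toy grayscale patch: dark nuclei (low values) on a bright background.
patch = np.array([[200,  40, 210],
                  [ 35,  30, 205],
                  [220, 215,  45]])

# Nuclei are dark, so invert the thresholded background mask.
nuclei = 1 - threshold_segment(patch, 128)
print(nuclei)
```

A fixed threshold is only viable when foreground and background intensities are well separated; the dye variation discussed below is precisely what breaks this assumption for histological nuclei.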
In the literature, several studies investigate the segmentation of cellular structures of malignant lesions based on computer vision algorithms that employ thresholding techniques, region merging and semantic information [11, 15]. With the goal of identifying nuclei in epithelial breast tissues, the authors in [10] proposed a method that employs color deconvolution and Otsu thresholding for nuclei segmentation. In [1], the K-means method was used to segment and identify cells in lymphoma images. In [15], the authors proposed a method that segments cell nuclei using neural networks for semantic segmentation, where each pixel is assigned to a class region present in the image. In the context of histological images, segmentation of epithelial nuclei is a complex task due to the irregular characteristics shown by nuclei, such as dye variation. These nuclei may have color and aspects similar to other structures present in the tissue [11]. This shows that it is an area with major challenges related to nuclei segmentation. In the case of dysplasia, this process can be even more complex due to the growth of the connective tissue, which can invade the epithelial tissue and hinder nuclei segmentation [12]. The described proposals have not yet considered a method focused on the segmentation of epithelial nuclei as proposed in this work.
This work proposes a segmentation method based on region-based convolutional neural networks (R-CNN) aiming to identify cell nuclei present in oral histological tissues. For this, in the first step, individual nuclei masks were generated to train the network. In the training step, bounding boxes were defined for candidate objects and combined with the masks. In the segmentation step, the R-CNN classifies each image pixel as nucleus or background based on its neighborhood. In the post-processing step, morphological operations of dilation and erosion were combined with hole-fill operations and small-object exclusion to eliminate false negatives and false positives. A dataset of mice images was used for the evaluation of the method. The results were evaluated in relation to the gold standard and compared with methods used in the literature.
This paper is organized as follows. In Sect. 2, we detail the dataset, the methods used to train the network, the segmentation, the post-processing and the method for quantitative analysis. The obtained results and the comparison with other segmentation methods present in the literature are shown in Sect. 3. Section 4 presents discussions and our conclusions.
2 Methodology
2.1 Image Dataset
The dataset was built from tongue slides of 30 mice previously submitted to a carcinogen during two experiments performed between 2009 and 2010, duly approved by the Ethics Committee on the Use of Animals under protocol number 038/09 at the Federal University of Uberlândia, Brazil.
The histological images were obtained using the LeicaDM500 light microscope with a magnification of 400x and saved in TIFF format using the RGB color model and a resolution of \(2048 \times 1536\) pixels. In total, 43 images were obtained and, with the aid of a pathologist, classified into healthy tissue, mild dysplasia, moderate dysplasia and severe dysplasia. After digitization, these images were cropped into regions of interest (ROI) with a size of \(450 \times 250\) pixels, totalling 296 ROI with 74 ROI for each class. In Fig. 1, some cases of dysplasia and a case of a healthy region of histological images of the buccal cavity are shown. The lesions were manually marked by a specialist (gold standard) and automatically analyzed by the proposed method.
2.2 Segmentation via Mask R-CNN
The Mask R-CNN model is based on the Faster R-CNN model, which has two stages: the first is called the region proposal network (RPN), which proposes bounding boxes for candidate objects [14]; in the second stage, bounding box regression is used to refine the area of the boxes [8].
Then, in the training step, the loss function is applied, defined by

\(L = L_{cls} + L_{box} + L_{mask}.\)

In this context, we have \(L_{cls} = -\log p_u\), where \(p\) is the probability distribution of each ROI over the \(K+1\) possible classes and \(u\) is the gold-standard class of the ROI [6]. The bounding box loss is defined by

\(L_{box}(t^u, v) = \sum_{i \in \{x, y, w, h\}} \mathrm{smooth}_{L_1}(t_i^u - v_i),\)

where

\(\mathrm{smooth}_{L_1}(x) = \begin{cases} 0.5x^2 & \text{if } |x| < 1, \\ |x| - 0.5 & \text{otherwise,} \end{cases}\)

\(t^u\) is the regression of the boxes for class \(u\), \(v\) is the gold standard of the box regression, \(x\) and \(y\) are the coordinates of the upper left corner of each ROI, and \(h\) and \(w\) are the height and width of the region [6].
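The bounding box loss can be checked with a small numpy sketch of the smooth \(L_1\) term from [6]; the offset values below are made up for illustration:

```python
import numpy as np

def smooth_l1(x):
    # Quadratic near zero, linear elsewhere: less sensitive to
    # outliers than a plain squared (L2) penalty.
    x = np.abs(x)
    return np.where(x < 1, 0.5 * x ** 2, x - 0.5)

def box_loss(t_u, v):
    # Sum of smooth L1 terms over the (x, y, w, h) offsets.
    return float(np.sum(smooth_l1(np.asarray(t_u) - np.asarray(v))))

# Per-coordinate terms: 0.5 -> 0.125, 0.0 -> 0.0, 2.0 -> 1.5, -0.2 -> 0.02
loss = box_loss([0.5, 0.0, 2.0, -0.2], [0.0, 0.0, 0.0, 0.0])
print(loss)  # approximately 1.645
```

Note how the large offset (2.0) contributes linearly rather than quadratically, which keeps a single badly regressed coordinate from dominating the loss.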
The output of this step is a set of masks with size \(Km^2\), containing \(K\) binary masks of resolution \(m \times m\), one for each of the \(K\) classes. In this study, the \(L_{mask}\) adopted was the average binary cross-entropy loss, as described in the work of [8]. The main reason for the choice of masks was the advantage of allowing the spatial representation of the objects. Thus, this information was extracted by pixel-by-pixel correspondence through convolution operations.
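A minimal sketch of the average binary cross-entropy used as \(L_{mask}\) [8], computed over the per-pixel probabilities of the ground-truth class mask (toy values, flattened for brevity):

```python
import numpy as np

def mask_bce(pred, gold, eps=1e-7):
    # Average binary cross-entropy over the mask pixels;
    # eps clips the probabilities to avoid log(0).
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(gold * np.log(pred) + (1 - gold) * np.log(1 - pred)))

# Two pixels predicted well: nucleus at 0.9, background at 0.1.
loss = mask_bce(np.array([0.9, 0.1]), np.array([1.0, 0.0]))
print(loss)  # -log(0.9), approximately 0.105
```

Because the loss is applied per pixel and only to the mask of the ground-truth class, mask prediction does not compete across classes, which is the decoupling Mask R-CNN argues for.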
In this work, Mask R-CNN was applied with the Resnet-50 convolutional neural network model [9]. This network has 50 layers arranged in the following structure: an input layer; 16 blocks of convolutional layers organized into 4 groups called B1, B2, B3 and B4; and an output layer. The network structure can be seen in Fig. 2. Each block has 3 layers, with convolutions of sizes \(1 \times 1\), \(3 \times 3\) and \(1 \times 1\), respectively. The \(1 \times 1\) convolutions are responsible for reducing and restoring dimensions, leaving the \(3 \times 3\) convolution with smaller input and output dimensions. Between the first layer and block B1, a max pooling filter with a \(3 \times 3\) dimension and stride of 2 is applied, reducing the size of the input by half. To halve the input size between each group, in the first layer of groups B2, B3 and B4, the convolution is performed using a moving window with a stride of 2. In the final step of the network, an average pooling filter and a fully connected layer are used for object classification. The network uses the rectified linear unit (ReLU) activation function.
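The halving behavior can be verified with the standard convolution output-size formula. The \(224 \times 224\) input below is the usual ImageNet size and an assumption for this sketch, since the ROI in this work have different dimensions:

```python
def conv_out(size, kernel, stride, pad):
    # Output spatial size of a convolution or pooling layer.
    return (size + 2 * pad - kernel) // stride + 1

s = conv_out(224, 7, 2, 3)     # 7x7 stride-2 input conv: 224 -> 112
s = conv_out(s, 3, 2, 1)       # 3x3 stride-2 max pooling: 112 -> 56
for _ in ("B2", "B3", "B4"):   # stride-2 layer at the start of each group
    s = conv_out(s, 3, 2, 1)   # 56 -> 28 -> 14 -> 7
print(s)  # 7
```

Each stride-2 stage halves the spatial resolution, so the feature map entering the final average pooling is 32 times smaller per side than the input.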
Given the set of 160 ROI obtained from the 4 classes of histological images of the oral cavity, 40 ROI were employed for the network training. In this stage, 32 ROI were used to build the weights of the network and 8 ROI to evaluate each epoch. With the help of a pathologist, 1,220 individual nucleus masks were obtained from these ROI. Since the Mask R-CNN maps the masks over the ROI to extract nuclei features, our set consists of 1,220 nucleus masks, 1,027 for the training set and 193 for the test set. According to the authors in [14], since the RPN has relatively few parameters, it has less risk of overfitting on a small dataset. Also, by using the Resnet-50 model instead of Resnet-101, there is an overall overfitting reduction, as explained by the authors in [9]. The network was pre-trained on the ImageNet dataset and fine-tuned using our dataset. For the training, a batch size of 9 masks and a learning rate of 0.001 were defined. The SGD optimizer with a momentum of 0.9 was also used. The fine-tuning of the network was performed using 40 epochs with 142 iterations, which is the number of batches passed to the network in each epoch. These parameters were empirically defined for the analyzed dataset. After this stage, the remaining 120 ROI were employed in the network evaluation stage. The CNN was trained on a computer with an eight-core AMD FX-8320 processor, 8 GB of RAM and an Nvidia GTX-1060 GPU with 6 GB of VRAM, using the TensorFlow library in the Python language.
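The SGD update with momentum 0.9 and learning rate 0.001 can be sketched as below. This is a classical-momentum illustration with a made-up gradient value; TensorFlow's optimizer applies the same rule internally:

```python
import numpy as np

def sgd_momentum_step(w, grad, velocity, lr=0.001, momentum=0.9):
    # Velocity accumulates a decaying sum of past gradients,
    # smoothing the parameter updates across iterations.
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

w, v = np.array([1.0]), np.array([0.0])
w, v = sgd_momentum_step(w, np.array([2.0]), v)  # first step
print(w)  # [0.998]
```

On the second step with the same gradient, the update grows to \(0.9 \times 0.002 + 0.002 = 0.0038\), showing how momentum accelerates descent along a consistent gradient direction.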
2.3 Post-processing
The segmentation step results in a binary image with the regions of the identified nuclei. This image may have incomplete regions and small artifacts. To fill the incomplete nuclei, the concept of morphological closing was applied. First, in order to close the contour of these nuclei, the dilation operation was performed using a cross-shaped structuring element with a size of \(3 \times 3\) pixels. Then, a hole-fill function was used on the resulting binary objects. Next, an erosion operation was employed with the same structuring element used in the dilation operation to eliminate noise still present in the images. Finally, objects with an area smaller than 30 pixels were classified as background.
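This chain can be sketched with scipy's morphology routines. The toy input below mimics an incomplete nucleus contour plus a small artifact; it is an illustrative sketch, not the paper's implementation:

```python
import numpy as np
from scipy import ndimage

def postprocess(mask, min_area=30):
    # 3x3 cross-shaped structuring element.
    se = ndimage.generate_binary_structure(2, 1)
    dilated = ndimage.binary_dilation(mask, structure=se)   # close contours
    filled = ndimage.binary_fill_holes(dilated)             # fill nuclei
    eroded = ndimage.binary_erosion(filled, structure=se)   # undo thickening
    # Discard connected components smaller than min_area pixels.
    labels, n = ndimage.label(eroded)
    areas = ndimage.sum(eroded, labels, range(1, n + 1))
    keep = np.isin(labels, 1 + np.flatnonzero(areas >= min_area))
    return keep.astype(np.uint8)

ring = np.zeros((12, 12), dtype=np.uint8)
ring[2:10, 2:10] = 1
ring[3:9, 3:9] = 0    # incomplete nucleus: a contour with a hole
ring[0, 0] = 1        # small artifact
out = postprocess(ring)
```

The dilation and hole fill turn the broken contour into a solid object, the erosion restores its original extent, and the area filter (together with the erosion itself) suppresses the isolated artifact.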
2.4 Evaluation Metrics
The evaluation of a segmentation method can be performed by calculating the overlap between the regions of the segmented image and the regions of a reference image demarcated by a specialist (gold standard) [4]. In this stage, 30 histological ROI of each class were randomly chosen and first segmented by specialists to form the gold standard. Then, the following metrics were considered: accuracy (\(A_{CC}\)), sensitivity (\(S_E\)), specificity (\(S_P\)), correspondence ratio (\(C_R\)) and Dice coefficient (\(D_C\)) [2, 13, 16].
The values of \(S_E\) and \(S_P\) were used to determine the proportion of pixels correctly marked as objects and as background, respectively. The metric \(A_{CC}\) was used to measure the proportion of true positives and true negatives in relation to all positives and negatives. The \(C_R\) metric evaluated the correspondence between the result obtained and the gold standard. Finally, the \(D_C\) measure was used to evaluate the similarity between the gold standard and the result.
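These pixel-wise metrics follow directly from the confusion counts between a binary result and the gold standard. A minimal sketch with toy masks (the \(C_R\) formula is left out here, following its definition in [2, 13, 16]):

```python
import numpy as np

def seg_metrics(pred, gold):
    pred, gold = pred.astype(bool), gold.astype(bool)
    tp = np.sum(pred & gold)        # object pixels correctly marked
    tn = np.sum(~pred & ~gold)      # background pixels correctly marked
    fp = np.sum(pred & ~gold)
    fn = np.sum(~pred & gold)
    acc = (tp + tn) / (tp + tn + fp + fn)
    se = tp / (tp + fn)             # sensitivity
    sp = tn / (tn + fp)             # specificity
    dice = 2 * tp / (2 * tp + fp + fn)
    return acc, se, sp, dice

pred = np.array([[1, 1, 0], [0, 1, 1]])   # toy segmentation result
gold = np.array([[1, 0, 0], [0, 1, 1]])   # toy gold standard
acc, se, sp, dice = seg_metrics(pred, gold)
```

Note that \(D_C\) ignores true negatives, so it is stricter than \(A_{CC}\) on images where background dominates, which is why both are reported.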
3 Experimental Results
The results of the proposed segmentation method for images of mild, moderate and severe dysplasia are shown in Fig. 3. In an evaluation of the images of the mild class (see Figs. 3a, d and g), it is noted that the method was able to detect and segment the regions with the presence of nuclei. In these figures, it is possible to observe, by the arrows marked in red, that there are objects detected as false positives in relation to the marking made by the specialist (Fig. 3d). It is also possible to note regions with the presence of false negatives, that is, some nuclei were eliminated by the segmentation process (see the arrows marked in green). However, the method was able to segment obscured and difficult-to-identify nuclei, obtaining close similarity to the gold standard, as seen in the region marked in blue in Fig. 3g. Similarly, the images of the moderate dysplasia and severe dysplasia classes (see Figs. 3h and i) also showed some regions with the presence of false positives and false negatives.
Aiming to investigate the method, algorithms based on semantic segmentation by SegNet [15], EM-GMM and K-means [11] were also applied to the image dataset. Some of these methods are extensively used for comparison of histological component segmentation methods [11]. The EM-GMM and K-means algorithms were performed using a cluster number of \(k=3\). The SegNet was trained using the same 10 ROI used for the proposed method and 10 ground-truth masks, each one containing all nuclei from the ROI.
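For reference, clustering-based segmentation with \(k = 3\) groups pixel intensities around three centers. The sketch below is a minimal 1-D K-means with a deterministic, evenly spaced initialization; the initialization strategy is an assumption here, since the compared implementation's is not specified:

```python
import numpy as np

def kmeans_1d(values, k=3, iters=20):
    # Evenly spaced initial centers over the intensity range.
    centers = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        # Assign each value to its nearest center, then recompute means.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        centers = np.array([values[labels == j].mean() if np.any(labels == j)
                            else centers[j] for j in range(k)])
    return labels, centers

# Toy intensities forming three well-separated groups
# (e.g. nuclei, cytoplasm and background levels).
pix = np.array([0., 1., 2., 10., 11., 12., 20., 21., 22.])
labels, centers = kmeans_1d(pix, k=3)
```

Such intensity clustering has no notion of object shape, which helps explain the structural degradation visible in the K-means and EM-GMM results discussed next.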
Figures 4a, b, c, d, e, f, g, h and i show the results of the EM-GMM, K-means and SegNet methods, respectively. In these results, it is noticed that there are also failures in relation to false positive regions (red arrows) and false negative regions (green arrows). In addition, there are results in which there was considerable degradation of the nuclei structures (see Figs. 4a, h and i), where part of these structures was eliminated.
The performance of the proposed method for tissue segmentation is shown in Table 1. In images with severe dysplasia, the \(S_E\) and \(A_{CC}\) were smaller than the results of the other classes of lesions. This shows that the method has greater difficulty in identifying the nuclei for this class. This may occur because some nuclei of this class have a high-intensity morphological alteration, which makes it difficult for the algorithm to identify it, classifying it as a background region [12].
The results of the algorithms present in the literature are presented in Table 2. The proposed method obtained relevant results in relation to these methods (\(A_{CC} = 89.52\pm 0.04\%\), \(C_R = 0.76\pm 0.10\) and \(D_C = 0.84\pm 0.06\)). The K-means method obtained a value of \(A_{CC} = 77.32\pm 0.05\%\), which represents a difference of 12% in relation to the proposed method. The SegNet algorithm obtained values of \(C_R = 0.40\pm 0.24\) and \(D_C = 0.60\pm 0.16\), being 37% and 23% lower than the results of the proposed method. This behavior with lower results can also be noticed in the results presented by the EM-GMM method (\(A_{CC} = 72.91\pm 0.15\%\) and \(C_R = 0.35\pm 0.30\)).
4 Conclusions
In this study, an approach was put forward for the automatic segmentation of nuclei in oral epithelial tissue images. In the literature, methods for the segmentation of oral dysplastic images are not yet widely explored, and this solution makes a contribution to specialists in the field. This work presented a segmentation algorithm for images of oral dysplasias based on a deep learning approach. The EM-GMM, K-means and SegNet methods were applied to images of the dataset for comparison purposes. Through qualitative and quantitative analyses, the combination of the algorithms used in this approach was able to reach more effective results than the compared techniques. In future studies, pre-processing algorithms such as color normalization will be investigated, aiming to improve the images in the initial processing stage.
References
Amin, M.M., Kermani, S., Talebi, A., Oghli, M.G.: Recognition of acute lymphoblastic leukemia cells in microscopic images using k-means clustering and support vector machine classifier. J. Med. Signals Sens. 5(1), 49 (2015)
Baratloo, A., Hosseini, M., Negida, A., El Ashal, G.: Part 1: simple definition and calculation of accuracy, sensitivity and specificity. Emergency 3(2), 48–49 (2015)
Belsare, A., Mushrif, M., Pangarkar, M., Meshram, N.: Breast histopathology image segmentation using spatio-colour-texture based graph partition method. J. Microsc. 262(3), 260–273 (2016)
Estrada, F.J., Jepson, A.D.: Benchmarking image segmentation algorithms. Int. J. Comput. Vis. 85(2), 167–181 (2009)
Fonseca-Silva, T., Diniz, M.G., Sousa, S.F., Gomez, R.S., Gomes, C.C.: Association between histopathological features of dysplasia in oral leukoplakia and loss of heterozygosity. Histopathology 68(3), 456–460 (2016)
Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448 (2015)
Gonzalez, R.C., Woods, R.E.: Digital Image Processing, 4th edn. Pearson, New York (2018)
He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Husham, A., Hazim Alkawaz, M., Saba, T., Rehman, A., Saleh Alghamdi, J.: Automated nuclei segmentation of malignant using level sets. Microsc. Res. Tech. 79(10), 993–997 (2016)
Irshad, H., Veillard, A., Roux, L., Racoceanu, D.: Methods for nuclei detection, segmentation, and classification in digital histopathology: a review–current status and future potential. IEEE Rev. Biomed. Eng. 7, 97–114 (2014)
Kumar, V., Aster, J.C., Abbas, A.: Robbins and Cotran Patologia-Bases Patológicas das Doenças. Elsevier, Brasil (2010)
Ma, Z., Wu, X., Song, Q., Luo, Y., Wang, Y., Zhou, J.: Automated nasopharyngeal carcinoma segmentation in magnetic resonance images by combination of convolutional neural networks and graph cut. Exp. Ther. Med. 16(3), 2511–2521 (2018)
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, pp. 91–99 (2015)
Tokime, R.B., Elassady, H., Akhloufi, M.A.: Identifying the cells’ nuclei using deep learning. In: 2018 IEEE Life Sciences Conference (LSC), pp. 61–64. IEEE (2018)
Tran, P.V.: A fully convolutional neural network for cardiac segmentation in short-axis MRI. arXiv preprint arXiv:1604.00494 (2016)
Warnakulasuriya, S., Reibel, J., Bouquot, J., Dabelsteen, E.: Oral epithelial dysplasia classification systems: predictive value, utility, weaknesses and scope for improvement. J. Oral Pathol. Med. 37(3), 127–133 (2008)
Acknowledgement
The authors gratefully acknowledge the financial support of National Council for Scientific and Technological Development CNPq (Grants 427114/2016-0, 304848/2018-2, 430965/2018-4 and 313365/2018-0), the State of Minas Gerais Research Foundation - FAPEMIG (Grant APQ-00578-18). This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001.
© 2019 Springer Nature Switzerland AG
Cite this paper
Silva, A.B., Martins, A.S., Neves, L.A., Faria, P.R., Tosta, T.A.A., do Nascimento, M.Z. (2019). Automated Nuclei Segmentation in Dysplastic Histopathological Oral Tissues Using Deep Neural Networks. In: Nyström, I., Hernández Heredia, Y., Milián Núñez, V. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2019. Lecture Notes in Computer Science(), vol 11896. Springer, Cham. https://doi.org/10.1007/978-3-030-33904-3_34
DOI: https://doi.org/10.1007/978-3-030-33904-3_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33903-6
Online ISBN: 978-3-030-33904-3