Abstract
In this paper, we propose a new collaborative process that aims to detect microcalcifications in mammographic images while minimizing false negative detections. This process is made up of three main phases: suspicious area detection, candidate object identification, and collaborative classification. The main concept is to operate on the entire image divided into homogeneous regions called superpixels, which are used to identify both suspicious areas and candidate objects. The collaborative classification phase makes the initial results of different microcalcification detectors collaborate in order to produce a new common decision and reduce their initial disagreements. The detectors share information about their detected objects and associated labels in order to refine their initial decisions based on those of the other collaborators. This refinement consists of iteratively updating the candidate object labels of each detector following local and contextual analyses based on prior knowledge about the links between superpixels and microcalcifications. This process iteratively reduces the disagreement between different detectors and estimates local reliability terms for each superpixel. The final result is obtained by a conjunctive combination of the new detector decisions reached by the collaborative process. The proposed approach is evaluated on the publicly available INBreast dataset. Experimental results show the benefits gained in terms of microcalcification detection performance compared to existing detectors as well as ordinary fusion operators.
Supplementary Information
The online version contains supplementary material available at 10.1007/s10278-022-00678-9.
Keywords: Mammography, Microcalcification detection, Collaborative classification, Graph knowledge propagation
Introduction
Mammography is a radiography technique used by radiologists to identify and interpret potential breast cancer lesions. The appearance of microcalcifications (MCs) is considered one of the main early indirect visible signs of breast cancer [1]. MCs are small deposits of calcium with diameters ranging between 0.1 and 1 mm [2]; they appear on mammographic images as small bright spots grouped together and occupying one or several areas of the breast. The automatic detection of MCs is a challenging task [3] for radiologists as well as researchers. Indeed, mammographic images result from the superimposition of 3D breast tissues, with their different types, structures, and scales, through a 2D projection process. Also, MCs acquire the structure of the mammary tissues on which they are superimposed [2]. Thus, the diversity of breast tissues leads to a diversity of MC characteristics, even within the same image. In addition, MCs appear superimposed on breast tissues with small sizes and low contrast. As a result, MCs are easily confused with the tissues on which they are superimposed as well as with other neighboring breast tissues.
In the literature, several studies have proposed approaches to detect MCs in mammographic images. These approaches can be classified according to two aspects: the analysis level (local or global) and the type of supervision (supervised or unsupervised). Global analysis approaches [4–6] operate on the entire image, while local analysis approaches [7–9] operate on regions of interest previously selected by radiologists. The anatomical diversity of MCs makes global analysis approaches highly sensitive to the choice of the parameters used to detect MCs. Limiting the search for MCs to the regions of interest provided by the radiologists, as local approaches do, can reduce the sensitivity to the model parameters. However, it omits the semantic knowledge exploited by the global approaches.
Unsupervised techniques solve the MC detection issue using standard segmentation techniques such as mathematical morphology [10, 11], active contours [9, 12, 13], or clustering [14–17]. Each of these techniques uses a specific approach related to some MC characteristics. For instance, morphology-based methods use structuring elements and morphological operations to enhance MCs’ appearance. Active contour-based methods delineate MCs starting from selected seed regions. Unsupervised clustering techniques detect targeted MC regions by dividing the input image into homogeneous clusters. Conversely, supervised techniques solve the MC detection issue using machine [7, 18, 19] and deep learning methodologies [20–23]. They are based on trained classifiers and a set of features to characterize MCs.
The major drawback of unsupervised techniques is their sensitivity to the MC anatomical diversities, even on the same mammographic image. The major drawback of supervised techniques is related to the reliability of the data employed to train classifiers. These drawbacks may cause false detections including false positives (FP) or negatives (FN). In the context of MC detection, a false negative is a set of adjacent pixels referring to MC but considered as normal breast tissue pixels. Conversely, a false positive is a set of normal breast tissue pixels considered as MC pixels.
In this study, we are interested in unsupervised techniques with both global and local analyses. Our main goal is to develop an approach that produces a low number of false negative detections. Indeed, the morphology and the number of MCs are the most important factors in the decision made by radiologists [24, 25]. Thus, keeping the number of false negatives low matters more than keeping the number of false positives low. A false positive can still be filtered out through postprocessing, whereas a missed MC can no longer be recovered and may reduce the reliability of the obtained results. In this context, we propose a new collaborative classification process which takes advantage of different unsupervised MC detection techniques. Indeed, a single unsupervised technique is unable to deal with the diversity of MC characteristics. It will therefore generate false detections that differ from those obtained by another technique, hence the interest in a new approach that makes the results of different techniques collaborate, retaining the relevant ones and generating a more reliable result. The proposed approach consists in a mutual and automatic refinement of the initial decisions made by a set of collaborators based on the information shared between them. The main idea is to exchange knowledge between collaborators to reduce their initial disagreements and move towards a new common and reliable decision. The refinement is conducted at two main levels of analysis: local and contextual. It allows an estimation of local reliability degrees and relies on prior knowledge about the searched objects (MCs) and the processed image.
The outline of this paper is as follows. "Section 2" describes the overall proposed methodology. Sections 3, 4, and 5 each detail a specific phase of the proposed approach. The "Experimental Results and Discussion" section presents experimental results and comparisons to show the strength and benefits of the proposed approach with respect to standard detectors as well as ordinary fusion operators.
Proposed Approach: Overall Description
Collaboration is a process in which at least two actors work together and share their knowledge to refine their initial results and achieve a common goal. The basic idea of the proposed approach is to operate through a collaborative process that allows different unsupervised MC detection techniques, called detectors, to work together and review their initial decisions based on information shared with the other detectors. It is made up of three main phases: suspicious areas detection, candidate objects identification, and collaborative classification (Fig. 1).
In this work, each of the M detectors processes the pixels of a mammogram and generates two pixel-wise maps, a thematic map and a suspicion map:
The “thematic map” (one per detector, i = 1 ... M) is a map that associates a binary value (0 or 1) with each pixel P. Pixels with the value 1 are suggested to belong to a true MC. A set of connected pixels with values equal to 1 is considered a region of interest (called a suspicious area) that refers to a potential MC. Pixels with the value 0, in contrast, are suggested to be normal breast tissue pixels and are treated as background pixels. Such a representation offers a thematic segmentation of the mammographic image that provides the potential MCs.
The suspicion map (one per detector, i = 1 ... M) is a map that associates a continuous value (ranging between 0 and 1) with each pixel P. Pixels with low suspicion degrees (close to 0) probably refer to normal breast tissue. Pixels with high suspicion degrees probably refer to suspicious tissue and should always merit the attention of the radiologists. Such a representation is a changeover from the standard gray-level space to a semantic representation space. It offers the opportunity to reduce the impact of uncertain gray-level information on the decision process. Moreover, it can be used together with the binary representation to classify the suspicious areas considered as MCs as benign or malignant, based on their morphologies as well as their distributions.
Suspicious Areas Detection
The detection phase aims to select from a mammographic image the connected pixels to be considered as suspicious areas which can refer to MCs. It is divided into two steps: superpixel generation and suspicious areas identification. Its key features deal with:
Providing global results relative to the entire image, based on local decisions from homogeneous regions (superpixels).
Operating with different detectors to reduce the sensitivity to MC characteristic diversities and, thus, to avoid false negative detections.
Superpixel Generation
The principle of the proposed approach is to process the entire mammographic image divided into homogeneous regions called superpixels. The main idea is to work with small regions that each present a single tissue type in order to deal with the diversity of MC and breast tissue characteristics. In this study, we consider two types of superpixels: SP1 and SP2 superpixels (Fig. 2).
SP1 superpixels refer to local homogeneous regions generated from the mammographic image and used to identify suspicious areas. These superpixels are constructed in such a way that they respect a specific gray-level homogeneity criterion and can contain more than a single MC. The set of MCs that belong to the same superpixel is called “group of MCs.” In this study, SP1 superpixels are generated from the SLIC over-segmentation algorithm [26] applied to the mammographic image.
SP2 superpixels refer to small regions that are very close in size and shape to those of MCs. These regions are used to convert the obtained suspicious areas into “candidate objects” and to decide whether or not they could correspond to potential MCs within the collaborative classification process. In this study, SP2 superpixels are generated by the watershed algorithm [27] applied to the complement of the mammographic image (Eq. (1)), in which local maxima become local minima. The advantage of such superpixels lies in their ability to comply with the granularity of MCs in terms of shape and size.
1
Figure 2 displays the SP1 and SP2 superpixel contours (blue contours) generated from a small region of a mammographic image with MCs. The enlarged zones show the relationship, described above, between MCs and the employed superpixels.
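The role of the image complement in SP2 generation can be illustrated on a toy patch. The sketch below is only an illustration: it assumes the complement is computed as the maximum gray level minus each pixel value, which is a common definition but not necessarily the exact form of Eq. (1), and the function name `complement` is hypothetical.

```python
def complement(image, max_level=None):
    """Return the gray-level complement of a 2D image given as a list of rows.

    Assumption: complement = max_level - pixel, where max_level defaults to
    the largest value found in the image.
    """
    if max_level is None:
        max_level = max(max(row) for row in image)
    return [[max_level - p for p in row] for row in image]


# Toy 3x3 patch with one bright spot (an MC-like local maximum) in the centre.
patch = [
    [10, 12, 11],
    [12, 90, 13],
    [11, 13, 10],
]
inverted = complement(patch)
# The bright centre pixel is now the darkest one, so a watershed run on the
# complement would place a catchment basin around it.
```

Since the watershed floods from local minima, working on the complement lets each bright spot seed its own small region, which is consistent with SP2 superpixels following the size and shape of MCs.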
Suspicious Areas Identification
The detection of suspicious areas consists in applying different MC detection techniques. These latter are separately applied to SP1 superpixels for the purpose of identifying regions that could refer, or belong, to MCs. A suspicious area is a set of adjacent pixels that responds to the MCs’ characterization adopted by an applied detector. In order to reduce the sensitivity to the diversity of MC characteristics, four different detectors are used in this study. Each detector follows a specific reasoning to model and search potential MC pixels from SP1 superpixels:
A morphological-based detector that uses structuring elements and morphological operations to search connected pixels that could correspond to MCs
A conditional region growing detector that iteratively delineates MCs starting from selected seed points. It integrates prior knowledge-based criteria to the growing process instead of the simple homogeneity-based criterion.
A possibilistic fuzzy c-means-based detector (PFCM) which models MCs as sets of adjacent “outlier” pixels compared to the superimposed breast tissue pixels.
A detector based on a Butterworth band-pass filter applied in the Fourier domain to enhance MC contours and correctly detect them.
The first two detectors are our own proposals, published in [11] and [28], respectively. The third and fourth detectors ([15] and [29]) are selected from the literature and adapted to be applied to the entire image with respect to SP1 superpixels. The next subsections present a general description of each of these detectors.
First Detector: Morphological-based Detector
The first detector is based on an unsupervised detection technique using morphological operations and the structural similarity index (SSIM) [30]. The use of mathematical morphology makes it easy to deal with the issue of low contrast between MCs and their surrounding pixels. The key features of this detector concern:
The use of various structuring elements to reduce the sensitivity to the low and various contrast between MCs and their surrounding pixels.
The generation of a suspicion map using structural similarity indices.
The automatic estimation of threshold values locally determined from the SP1 superpixels using a dispersion analysis of both grayscale and suspicion maps to generate a thematic map.
Second Detector: Conditional Region Growing Detector
The conditional region growing (CRG) technique is based on the standard region growing [31] algorithm. Its main idea is to integrate prior knowledge-based criteria to control the growing process and correctly delineate MCs starting from selected seed points. These seed points are selected based on the analysis of SP1 superpixels and a regional maxima detection. The criteria used to control the growing process are derived from the MC descriptions given by radiologists and can be divided into two categories. The first one analyzes the neighborhood searching size. The second one exploits the gradient information and the shape evolution of the segmented region within the growing process. The key feature of this detector is to analyze each MC individually in order to estimate adequate criteria for an accurate delineation, rather than using the same parameters for all of them. The SP2 superpixels are used by this detector to select the initial seed points for the growing process. The SP1 superpixels, in contrast, are used to estimate an intensity-based criterion as well as to select the set of candidate pixels for a possible evolution starting from a seed point.
Third Detector: PFCM-based Detector
The third detector is based on the possibilistic fuzzy C-means (PFCM) [32] algorithm. PFCM is an iterative unsupervised clustering algorithm that combines the advantages of both fuzzy C-means (FCM) [33] and possibilistic C-means (PCM) [34] algorithms. Indeed, it generates a fuzzy partition and associates to each pixel a membership and a typicality value. Membership values indicate the degrees to which pixels belong to each cluster. Typicality values indicate the degree of compatibility that a pixel has with respect to the cluster to which it belongs. The use of both membership and typicality values allows PFCM to solve the noise sensitivity issue of the FCM and to avoid the coincident cluster issue of the PCM [35].
MCs in mammographic images acquire the characteristics of the breast tissues on which they are superimposed. However, they appear with higher intensity values than these tissues. Thus, they present low typicality degrees with respect to the cluster that represents their superimposed breast tissues. In order to identify these atypical pixels, the baseline technique [15] relies on the PFCM algorithm and a static threshold value to segment a region of interest that is selected by the radiologists and contains all the MCs present. Our modification of this technique consists of applying the PFCM algorithm to each SP1 superpixel of the entire mammographic image and automatically estimating a threshold value per superpixel instead of using a single static value. Indeed, a mammographic image presents various breast tissues, and MCs are not restricted to a single breast tissue type. Thus, the use of a static threshold value makes the reliability of the results sensitive to the homogeneity of the superimposition tissues. Given that each SP1 superpixel refers to a homogeneous region and, thus, to a unique breast tissue type, the segmentation result has a semantic meaning and a local analysis can estimate an accurate threshold value. The threshold values are estimated from an analysis of the intensity distribution inside each superpixel using John Wilder Tukey’s criterion [36].
Fourth Detector: Band-pass Fourier Filtering Detector
The last detector is an edge finder technique based on a Butterworth Band-Pass (BBP) filter in the Fourier domain. The BBP filter has the ability to properly control the frequencies by assigning the accurate low and high cutoff frequencies as well as the slope rate [29]. Thus, by applying the BBP filter to a mammographic image, fibroglandular tissue pixels are considered as background pixels and will be then removed. Also, MC edges will be considered foreground pixels and will be enhanced. To further improve the contrast of detected edges and reduce background noise, the resulting edge image is enhanced by applying a median filter and a gamma correction. The final image segmentation results from applying morphological operations.
Analyzing the descriptions of the four detectors, we notice an implicit relationship between the thematic and suspicion maps generated by each of them: each map can be derived from the other. The conversion we propose from one map to the other is based on prior knowledge (Fig. 3). It consists of analyzing the gray-level distribution of the connected pixels in the thematic map and the suspicion degree distributions in the SP1 superpixels of the suspicion map.
From the thematic map, the obtained suspicious areas are selected and the gray-level distribution of the pixels in each of them is extracted. These distributions are shown to correspond to symmetrical Gaussian distributions with respect to regional maxima pixels. The suspicion degree values are estimated by transforming these distributions to probabilities based on the z-score table.
To transform a suspicion map into a thematic map, we start by projecting the SP1 superpixel boundaries onto it. With this projection, the suspicion map is divided into local regions with homogeneous grayscale values. The adjacent pixels to be considered as potential MC pixels are the outlier pixels of a given superpixel. Therefore, we propose to estimate an adaptive threshold value for each superpixel by modelling its outlier pixels. In our study, the threshold value (for which a pixel P is considered a potential MC pixel in a given superpixel) is based on John Wilder Tukey’s criterion [36], which relies on constructing a boxplot of a given dataset.
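The two conversions described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the suspicion degree is taken as the standard normal CDF of the z-score (the quantity a z-score table tabulates), the outlier threshold is taken as the upper boxplot fence Q3 + 1.5 × IQR, and all function names are hypothetical.

```python
import math
import statistics


def suspicion_from_gray(values):
    """Map the gray levels of one suspicious area to degrees in [0, 1].

    Assumption: each value is standardised (z-score) and passed through the
    standard normal CDF.
    """
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.5 for _ in values]
    return [0.5 * (1.0 + math.erf((v - mean) / (std * math.sqrt(2)))) for v in values]


def tukey_upper_fence(values, k=1.5):
    """Upper boxplot fence Q3 + k * IQR for the values of one superpixel."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 + k * (q3 - q1)


def thematic_from_suspicion(values, k=1.5):
    """Label as 1 the pixels whose suspicion degree exceeds the upper fence."""
    fence = tukey_upper_fence(values, k)
    return [1 if v > fence else 0 for v in values]


# Suspicion degrees of the pixels of one SP1 superpixel (flattened).
superpixel_suspicion = [0.10, 0.12, 0.11, 0.13, 0.12, 0.11, 0.95]
labels = thematic_from_suspicion(superpixel_suspicion)

# Gray levels of one suspicious area taken from a thematic map.
degrees = suspicion_from_gray([10, 20, 30])
```

Because the fence is recomputed from the values of each superpixel, the threshold adapts to the local tissue instead of being a single static value for the whole image.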
The results of each of the used detectors are sensitive to the MC characteristic diversities. They usually generate false detections (FP, FN). Hence, combining these detectors is required to exchange their own knowledge and to reduce the overall FN rate.
Candidate Object Identification
Identification Issues
In the previous phase, pixel-based processes without constraints about the geometrical features of the obtained regions were applied. Thus, we have no information about the shapes and sizes of the generated suspicious areas compared to those of MCs. However, these latter are the starting point to decide if their pixels correspond or not to MC pixels. Therefore, it is in our interest to identify from them the set(s) of adjacent pixels that follow the geometric features of MCs in terms of sizes and shapes.
The identification phase converts the thematic maps produced by the detectors into a set of candidate objects with associated features. A “candidate object” is defined as a set of adjacent pixels which belong to a suspicious area in a thematic map and which can refer to a potential MC. Thus, it can cover the whole or only a part of a suspicious area in a thematic map. The major constraints at this level are to:
Ensure that the sizes of the identified objects conform to those of MCs while we have no knowledge about the accuracy of contours of the generated suspicious areas from each thematic map.
Take into account the heterogeneity of the thematic maps generated by the detectors in terms of characteristics of the generated objects.
Figure 4 provides an illustration of the different possible scenarios that may occur when identifying the objects from the thematic maps. It represents a thematic map with black and white pixels. Black pixels refer to normal breast tissue pixels (background) whereas white pixels correspond to potential MC pixels. Each set of connected white pixels corresponds to a suspicious area. In this thematic map, we represent the suspicious areas resulting from two detectors, delineated by red and green contours, respectively.
As previously mentioned, the detectors used in this work exploit different strategies and parameters to identify potential MC pixels. Thus, two regions that refer to suspicious areas in the thematic maps of two different detectors can be:
Overlapped: The region obtained by one detector shares pixels with the region obtained by the other detector (cases 1 and 2 in Fig. 4).
Distinct: The pixels which represent the region in the thematic map of one detector are assigned to the background class in the thematic map of the other detector (case 3 in Fig. 4).
At this level, several questions are raised:
Do obtained regions fulfill the MC granularities?
What are the contours for the object(s) to be considered if two detectors present some overlapping regions, knowing that we cannot justify the reliability of any of the used detectors? Thus, can we prefer the use of the contours given by one detector to those given by another one?
Which regions to consider as candidate objects if the detectors generate different regions (the set of all the regions or a selection and on what basis these decisions are made)?
Proposed Identification Scheme
In order to deal with the abovementioned issues, we propose an identification scheme that uses three different types of information: SP2 superpixels, and global thematic and suspicion maps.
The global thematic map is the union of the thematic maps obtained by the different detectors. By using such a map, we assume that all the detector results are of interest, since we are not able to decide which detector performs best. Moreover, it lets us use all the generated objects, which helps to reduce FN detections. Indeed, the generated objects which seem to be potential MCs are collected from all detectors. Since the detectors differ in their FN and FP detections, an MC missed by one detector can still be recovered from another.
The SP2 superpixel map is used to define the object contours since SP2 superpixels respect the MC granularities (sizes and shapes). Using this map will enable us to solve the problems associated with overlapped or large suspicious areas.
The suspicion maps generated by the detectors offer a semantic description of their initial results. Such maps give us the opportunity to explore this new knowledge representation space and reduce the impact of the uncertainty of the grayscale representation.
The identification phase proceeds as follows. First, the initial detector results are modelled as thematic maps where the suspicious areas appear. Each of these maps is converted into a list of candidate objects. The global list is then generated from the union of all the previously generated ones. Using this latter, the candidate objects from each detector are refined and labelled based on the frequency of their detection.
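The global thematic map used in this scheme amounts to a pixel-wise logical OR over the detectors' binary maps. A minimal sketch follows, assuming the maps are equally sized nested lists of 0/1 values; the function name `global_thematic_map` is hypothetical.

```python
def global_thematic_map(thematic_maps):
    """Pixel-wise union (logical OR) of equally sized binary maps."""
    rows, cols = len(thematic_maps[0]), len(thematic_maps[0][0])
    return [
        [1 if any(m[r][c] for m in thematic_maps) else 0 for c in range(cols)]
        for r in range(rows)
    ]


# Toy thematic maps from three detectors.
detector_a = [[0, 1, 0], [0, 0, 0]]
detector_b = [[0, 1, 1], [0, 0, 0]]
detector_c = [[0, 0, 0], [1, 0, 0]]
union = global_thematic_map([detector_a, detector_b, detector_c])
```

A pixel flagged by a single detector survives in the union, which is what keeps the false negative rate low at this stage, at the cost of carrying more false positives into the later phases.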
Step 1: Generation of Initial Candidate Object Lists
The transformation of the thematic maps into lists of candidate objects is based on the SP2 map that associates a label to each superpixel. The number of different labels in this map is equal to that of the superpixels it contains. The proposed transformation is to project the SP2 map onto the different generated thematic maps. To accurately outline the identified objects starting from the suspicious areas, we propose to analyze the regions they occupy compared to those of the SP2 superpixels to which they belong. Figure 5 represents three samples of suspicious areas (white regions with red contour) in a thematic map (left image in each line). It illustrates the two possible situations when comparing a suspicious area to the SP2 superpixels (dashed blue contours):
The suspicious area intersects only one superpixel (first line).
The suspicious area intersects two (second line) or more (third line) superpixels.
For the first situation, we consider the suspicious area as a single candidate object. Its contours are those of this area, since they do not extend beyond those of the SP2 superpixel to which it belongs. For the second situation, we divide the suspicious area into different objects (two objects for the second line and nine objects for the third line). The intersections of the SP2 superpixel contours with the suspicious area define the contours of the identified objects. For instance, in the second line of Fig. 5, the suspicious area (white region with red contour) intersects two different superpixels. The contour dividing these two superpixels (blue dashed contour) is the one that defines the outlines of the two identified objects (regions with green and yellow contours in the right image).
This configuration addresses the issue of selecting the best contours of an object starting from a suspicious area. It also normalizes the object identification process for all the detectors and ensures that all the considered objects match, or at least come very close to, the shapes and sizes of MCs. Thus, the largest object we accept corresponds to an entire SP2 superpixel (see the third line in Fig. 5).
An object is identified by its gravity center and a label inherited from that of the SP2 superpixel to which it belongs. Once these objects are identified, some characteristics are computed from the suspicion degree map as well as from the grayscale map (mammographic image) and will be used in the collaborative process.
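The projection of the SP2 map onto a thematic map can be sketched as follows. This is a simplified illustration: it groups all suspicious pixels sharing an SP2 label into one object (it does not separate two distinct suspicious areas that fall inside the same superpixel), and the function name and the dictionary layout are hypothetical.

```python
from collections import defaultdict


def candidate_objects(thematic_map, sp2_labels):
    """Split the suspicious pixels of a binary map along SP2 superpixel labels.

    Returns {sp2_label: {"pixels": [(row, col), ...], "center": (row, col)}}.
    Simplification: all suspicious pixels sharing an SP2 label form one object.
    """
    pixels_by_label = defaultdict(list)
    for r, row in enumerate(thematic_map):
        for c, value in enumerate(row):
            if value == 1:
                pixels_by_label[sp2_labels[r][c]].append((r, c))

    objects = {}
    for label, pixels in pixels_by_label.items():
        center_r = sum(p[0] for p in pixels) / len(pixels)
        center_c = sum(p[1] for p in pixels) / len(pixels)
        objects[label] = {"pixels": pixels, "center": (center_r, center_c)}
    return objects


# One 2x2 suspicious area straddling SP2 superpixels 1 and 2.
thematic = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
sp2 = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 3, 3],
]
objs = candidate_objects(thematic, sp2)
```

In this toy case the single suspicious area is divided into two candidate objects, one per intersected superpixel, each carrying the SP2 label and its own gravity center.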
Step 2: Unification
The global list of candidate objects is the union of the lists identified from the different thematic maps arising from each detector. The objects with the same SP2 label are considered a unique object. Their characterization results from projecting the SP2 superpixel map onto the union thematic map. With this configuration, the same object can be characterized differently if it is not uniformly detected by the different detectors. Also, an object that was not detected by a given detector will inherit its properties from those associated with it in the global list. To summarize, this list is a unifying result representing all the other ones.
Step 3: Refinement and Initial Classification
Once the global list of candidate objects is generated, we look back at the ones initially generated by the first step in order to:
Display the identified objects from a same suspicious area relative to each list.
Add the objects that do not appear in a list (not detected by a given detector).
Associate a class label to each object.
In this work, we define three different class labels (Fig. 6), for the candidate objects, based on their occurrence among the detector results:
The “absolute certainty” class label (AC) which is assigned to the candidate objects detected by at least  detectors. M refers to the number of used detectors and m (integer with ) refers to the quality parameter based on which the decision about a candidate object can be considered certain. In this work, M is equal to 4 and m is equal to 1.
The “partial certainty” class label (PC) which is assigned to the candidate objects detected by [, ..., ] detectors.
The “uncertainty” class label (CU) which is assigned to the candidate objects detected by [1, ..., m] detectors.
Such labelling follows ordinary reasoning: a decision on which the detectors agree is considered a reliable decision (AC).
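The occurrence-based labelling can be sketched as follows. The CU range (1 to m detections) and the values M = 4 and m = 1 come from the text; the AC bound used here (at least M − m detections, with PC covering the counts in between) is an assumption made for illustration, and the function names are hypothetical.

```python
def detection_counts(object_lists):
    """Count, for each SP2 label, how many detectors reported the object.

    object_lists: one collection of SP2 labels per detector.
    """
    counts = {}
    for labels in object_lists:
        for label in set(labels):
            counts[label] = counts.get(label, 0) + 1
    return counts


def class_label(count, n_detectors=4, m=1):
    """Return "AC", "PC" or "CU" from the number of detections.

    Assumed thresholds: CU for 1..m detections, AC for at least
    n_detectors - m detections, PC for the counts in between.
    """
    if not 1 <= count <= n_detectors:
        raise ValueError("count must lie between 1 and the number of detectors")
    if count <= m:
        return "CU"
    if count >= n_detectors - m:
        return "AC"
    return "PC"


# SP2 labels of the objects reported by each of the four detectors.
lists = [{1, 2, 4}, {2, 3, 4}, {2}, {2, 3, 4}]
counts = detection_counts(lists)
labels = {obj: class_label(n) for obj, n in counts.items()}
```

Under these assumed thresholds, an object seen by a single detector starts in the CU class and can only change class through the collaborative refinement.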
Collaborative Classification Process
Motivations
The entire collaborative classification process we propose is presented in Fig. 7. It is decomposed into three main steps that will be detailed in the next subsections.
During this collaborative process, we propose to find a single decision from all the ones initially obtained by the detectors. For that, the different lists of candidate objects (obtained from the previous step) are compared with each other. Such comparisons allow evaluating the pairwise similarities and dissimilarities (hereafter called conflicts) between the decisions taken by the detectors on each SP1 superpixel. These similarity studies constitute the first step to analyze and relabel the candidate objects. This relabelling task consists in updating the class labels associated with the objects on which the detectors disagree. It starts from the fact that the candidate objects on which the detectors agree describe the first kernel of potential MCs in a given superpixel, and that those in disagreement should present some similarities to them in order to change their labels. Indeed, analyzing the characteristic similarities of the objects in agreement gives us an idea about a starting kernel supposed to characterize the potential MCs present in a given SP1 superpixel (group of MCs). In this work, we propose to represent this kernel as a geometric graph G with nodes N and edges E (Fig. 8). G is initialized as an unweighted graph. N (Eq. (2)) is defined as the set of nodes that correspond to the candidate objects, from a detector's list of objects, belonging to the AC or PC classes. The position of each node corresponds to the gravity center position of the object to which it refers. E (Eq. (3)) is defined as the set of edges between the pairs of nodes in N.
2 |
3 |
This representation, although it can increase the complexity of the processing, is well suited to our problem. Indeed, the geometrical representation of the nodes in the graph preserves the knowledge resulting from the pixel level and allows us to take advantage of this knowledge in the process to be followed.
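The initial kernel graph of one detector can be sketched as follows. This is an illustration under assumptions: every pair of AC/PC nodes is joined by an edge (a fully connected initial graph), the object dictionary layout is hypothetical, and the edge length helper is only a convenience for later geometric tests, since the graph itself is initialized as unweighted.

```python
import math
from itertools import combinations


def build_kernel_graph(objects):
    """Build the initial graph from the AC and PC objects of one detector.

    objects: {obj_id: {"center": (row, col), "label": "AC" | "PC" | "CU"}}.
    Returns (nodes, edges): nodes maps obj_id to its gravity center, and
    edges is the set of unordered node pairs (assumption: all pairs).
    """
    nodes = {
        obj_id: obj["center"]
        for obj_id, obj in objects.items()
        if obj["label"] in ("AC", "PC")
    }
    edges = set(combinations(sorted(nodes), 2))
    return nodes, edges


def edge_length(nodes, edge):
    """Euclidean distance between the two gravity centers of an edge."""
    a, b = edge
    return math.dist(nodes[a], nodes[b])


objects = {
    1: {"center": (0.0, 0.0), "label": "AC"},
    2: {"center": (3.0, 4.0), "label": "PC"},
    3: {"center": (10.0, 10.0), "label": "CU"},
}
nodes, edges = build_kernel_graph(objects)
```

Keeping the gravity centers as node positions is what allows distance-based criteria to be evaluated directly on the graph when CU objects are later considered for connection.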
To update the initial graph generated from each detector, we apply two types of refinements. They consist in connecting the objects that belong to the CU class to the current graph if they satisfy the similarity criteria with respect to the connected nodes. Once connected, their previously associated class labels are updated. The similarity criteria we define are based on prior knowledge and model the groups of MCs. The two refinements we propose are:
A local refinement that concerns the candidate objects detected within the same SP1 superpixel. It uses their geometrical similarities to decide whether to add new objects to the current graph. The reasoning is built on the observation that the MCs of a group are usually distributed at regular distances from each other [37]. It is also motivated by the sensitivity of the detectors to small gray-level variations within an SP1 superpixel, which can affect the similarities (such as the contrast) of the MCs present in that region.
A contextual refinement that concerns the candidate objects within the immediate neighbors of an SP1 superpixel. It opts for a compromise between geometric and grayscale similarities to decide whether or not to add an object to the current graph. This defined similarity measure extends the reasoning made by the local refinement, namely the geometric distance regularity of objects, into the contextual level of superpixels. However, it reduces the importance of this latter if it is not associated with a numerical one which refers to the grayscale similarities of the objects in the neighboring superpixels to others in the analyzed one. This similarity, if it exists, reflects the homogeneity between the superpixels themselves.
The contextual refinement, as opposed to the local one, addresses the fact that a group of MCs can be spread over a large region and thus occupy more than a single superpixel. It characterizes a group of MCs on the basis of the spatial knowledge derived from the SP1 neighborhood. Indeed, it requires that MCs within the same group have similar numerical characteristics in addition to the geometrical ones, and that they appear in homogeneous superpixels. Such a characterization reduces the probability of selecting equidistant objects regardless of their numerical characteristics and, therefore, reduces the number of FP detections.
The proposed refinements can be considered as a revision of the detector decisions based on the information shared by the other collaborators. They are repeated until the graph converges, that is, until it remains stable or the change in the confidence degrees (Eq. (4)) estimated for the SP1 superpixels between two consecutive iterations is smaller than a given threshold.
4 |
5 |
6 |
where:
The index k refers to the SP1 superpixel.
The index i refers to the detector.
refers to the cardinality of set of edges E of the graph.
refers to the weight of the new edges added to the previous graph.
refers to the maximum conflict degree associated to the superpixel and the detector. Its mathematical expressions will be presented in the next subsection.
At convergence, the set of nodes in each graph forms the new list of candidate objects of the corresponding detector. In principle, these sets are more similar to one another than the initial ones. The final combination of these new results can therefore be reduced to a conjunctive operator, since the agreement between them has increased.
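The stopping rule described above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the function name `has_converged`, the dictionary layout (confidence degree per superpixel index), and the default threshold value are assumptions introduced here.

```python
def has_converged(prev_edges, new_edges, prev_conf, new_conf, epsilon=1e-3):
    """Check the two stopping conditions of the iterative refinement.

    prev_edges / new_edges: sets of edges of the graph at two consecutive iterations.
    prev_conf / new_conf: dicts mapping a superpixel index to its confidence degree
    (assumed to share the same keys).
    epsilon: hypothetical threshold on the change in confidence degrees.
    """
    # Condition 1: the graph is stable (no edge was added).
    if prev_edges == new_edges:
        return True
    # Condition 2: the largest change in confidence degree is below the threshold.
    max_change = max(
        (abs(new_conf[k] - prev_conf[k]) for k in new_conf), default=0.0
    )
    return max_change < epsilon
```

Taking the largest change over all superpixels is one possible reading of the criterion; a per-superpixel test would be an equally plausible variant.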
Detailed Mathematical Description
Estimation of Confidence Degrees
This step consists in observing the labels of the objects in each SP1 superpixel and comparing their characteristics. The main idea is to estimate a confidence degree per detector and SP1 superpixel, and to update the labels associated with the objects in order to reduce the disagreements between the detectors. Estimating the confidence degrees per SP1 superpixel exploits the semantic information these regions carry for better decision-making. Indeed, these regions present a grayscale homogeneity that reflects a semantic homogeneity in the image (pixels of the same breast tissue type appear with similar gray-level values on the mammographic image). Therefore, MCs that appear in the same superpixel usually present similar characteristics, at least those detected by the same detector (Fig. 9).
The starting point for the confidence degree estimation is to compare, for each SP1 superpixel, the candidate object lists of each pair of detectors. The similarity matrix of the superpixel (Eq. (7)) evaluates the agreement between each pair of detectors and is expressed as the proportion of their commonly detected objects (Os).
7 |
where
8 |
Once the similarity is calculated, we define the conflict degree, which evaluates the disagreement between two detectors (Eq. (9)), as well as the maximum conflict (Eq. (10)) for the SP1 superpixel, which is used to initialize the corresponding confidence degree (Eq. (11)) of each detector. For a given detector, the maximum conflict degree corresponds to the detector with which it shares the fewest commonly detected objects; it is used in the re-labeling process.
9 |
10 |
11 |
The estimated confidence degree plays an important role in selecting the superpixels to process as well as in fusing the final results.
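To make the chain from similarity to confidence degree concrete, the sketch below computes these quantities for one superpixel. It rests on several assumptions that are not confirmed by the text: the proportion of common objects is taken as a Jaccard ratio, the conflict is taken as one minus the similarity, the initial confidence is taken as one minus the maximum conflict, and objects detected by different detectors are assumed to be already matched through shared identifiers. All function names are hypothetical.

```python
def similarity(objs_i, objs_j):
    """Proportion of common objects between two detectors (assumed Jaccard ratio)."""
    union = objs_i | objs_j
    if not union:
        return 1.0  # two empty lists are treated as full agreement
    return len(objs_i & objs_j) / len(union)


def confidence_degrees(detections):
    """Initial confidence degree of each detector for one superpixel.

    detections: dict mapping a detector name to the set of object ids it detected.
    Assumptions: conflict = 1 - similarity, confidence = 1 - maximum conflict.
    """
    confidences = {}
    for i, objs_i in detections.items():
        conflicts = [
            1.0 - similarity(objs_i, objs_j)
            for j, objs_j in detections.items()
            if j != i
        ]
        max_conflict = max(conflicts, default=0.0)
        confidences[i] = 1.0 - max_conflict
    return confidences
```

With three detectors reporting `{1, 2, 3}`, `{2, 3}` and `{3}`, the second detector obtains the highest confidence under these assumptions, because its worst disagreement with another detector is the smallest.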
Re-labelling/Classification of Candidate Objects
The iterative re-labeling step is applied to each SP1 superpixel and each detector. It aims to improve the quality of the results of each detector in order to reduce the conflict it presents with the others and, thus, to obtain objects with identical labels. It starts by selecting the SP1 superpixels containing objects that belong to the AC or PC classes (Eq. (12)).
12 |
This selection is based on the fact that superpixels containing objects on which the detectors are in absolute or partial agreement provide relevant information for better decision-making. The step then consists in updating the initial graphs generated on the selected superpixels by applying the intra- and inter-superpixel analyses. These analyses calculate the similarities between the candidate objects to be linked and the nodes of the graph to be refined. These similarities are the weights of the new links added between the selected candidates and the existing graph nodes.
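The selection of superpixels that starts the re-labeling step can be sketched as below. The function name `select_superpixels` and the input layout (a dictionary from superpixel index to the list of object labels it contains) are assumptions made for illustration only.

```python
def select_superpixels(labels_by_superpixel):
    """Return the indices of superpixels holding at least one AC or PC object.

    labels_by_superpixel: dict mapping a superpixel index to the list of class
    labels ("AC", "PC" or "CU") of the candidate objects it contains.
    """
    return [
        k
        for k, labels in labels_by_superpixel.items()
        if any(label in ("AC", "PC") for label in labels)
    ]
```

Superpixels containing only CU objects are skipped, since they offer no agreed kernel to which other objects could be compared.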
(a) Intra-superpixel Analysis
Figure 10 illustrates an example of the intra-superpixel refinement.
A candidate object in an SP1 superpixel that is assigned to the CU class is added to the initial graph if and only if it satisfies the following condition:
13 |
14 |
where:
refers to the local similarity between the objects and
refers to the normalized Euclidean distance between the and objects from the list of nodes of the graph Eq. (14).
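A possible reading of the intra-superpixel test is sketched below. Since the exact condition is not recoverable here, the sketch assumes that the local similarity is one minus the normalized Euclidean distance between gravity centers, that the normalization constant `max_dist` is supplied by the caller (for instance the largest extent of the superpixel), and that an object is accepted when its best similarity to an existing node reaches a hypothetical `threshold`. None of these choices is confirmed by the text.

```python
import math


def normalized_distance(p, q, max_dist):
    """Euclidean distance between two points, scaled to [0, 1] by max_dist."""
    return min(math.dist(p, q) / max_dist, 1.0)


def local_refinement(nodes, cu_objects, max_dist, threshold=0.5):
    """Link CU objects of a superpixel to the graph when they are similar enough.

    nodes: dict node id -> gravity center of the objects already in the graph.
    cu_objects: dict object id -> gravity center of the CU candidates.
    Returns a dict {(object id, node id): similarity} of the new weighted edges.
    """
    new_edges = {}
    for obj_id, centroid in cu_objects.items():
        best_node, best_sim = None, 0.0
        for node_id, node_pos in nodes.items():
            sim = 1.0 - normalized_distance(centroid, node_pos, max_dist)
            if sim > best_sim:
                best_node, best_sim = node_id, sim
        if best_node is not None and best_sim >= threshold:
            new_edges[(obj_id, best_node)] = best_sim
    return new_edges
```

The returned similarities play the role of the weights of the newly added edges mentioned above.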
(b) Inter-superpixel Analysis
The inter-superpixel refinement (Fig. 11) is applied to the graph resulting from the local refinement. A new node (candidate object) that belongs to one of the immediate neighbors of the superpixel and is assigned to the CU class is added to the graph if and only if it fulfills the following similarity criterion, defined as the average of the normalized geometrical and feature similarities (Eq. (15)):
15 |
16 |
where “Feat(OI)” is the feature value of the candidate object OI. In this work, the standard deviation of the gray levels of the pixels constituting an object is used as the feature.
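The contextual criterion, an average of a geometrical and a feature similarity, can be illustrated as follows. The averaging and the choice of the gray-level standard deviation as feature come from the text; the specific forms of the two similarities (one minus the normalized distance, and one minus the relative feature difference), the use of the population standard deviation, and all function names are assumptions.

```python
import math
import statistics


def gray_level_std(pixels):
    """Feature of an object: standard deviation of its pixel gray levels.

    Assumption: the population standard deviation is used.
    """
    return statistics.pstdev(pixels)


def feature_similarity(feat_a, feat_b):
    """Assumed feature similarity: 1 minus the relative feature difference."""
    largest = max(feat_a, feat_b)
    if largest == 0:
        return 1.0  # two perfectly flat objects are considered identical
    return 1.0 - abs(feat_a - feat_b) / largest


def contextual_similarity(pos_a, pos_b, feat_a, feat_b, max_dist):
    """Average of the normalized geometrical and feature similarities."""
    geometric = 1.0 - min(math.dist(pos_a, pos_b) / max_dist, 1.0)
    feature = feature_similarity(feat_a, feat_b)
    return 0.5 * (geometric + feature)
```

Because both terms are averaged, an object that is well placed geometrically but whose gray-level statistics differ strongly from those of the graph nodes receives a reduced score, which matches the stated goal of limiting FP detections.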
At the t-th iteration (t > 1), the local and contextual refinements are applied to the graph retained at the previous iteration (t−1), not to the graphs of the first iteration (t = 1).
Overall Algorithm Description
The approach proposed in this study operates through a collaborative process that allows different detectors to iteratively review their initial decisions based on the information shared with the other collaborators.
The supplementary section presents the detailed algorithms describing the overall steps of this classification process (Algorithm 3) as well as the steps of the local (Algorithm 1) and contextual (Algorithm 2) refinements.
Experimental Results and Discussion
In this study, 50 images extracted from the publicly available INBreast database [38] are used to evaluate the proposed approach. The selected mammograms cover different breast densities and present various MC types. The validation is based on comparisons with the ground truth (GT) masks defined by an expert in radiology from the University Hospital of Brest, France. A GT mask marking all the MCs found is built for each mammographic image, for a total of 50 GT masks. It is important to note that the average number of MCs is around 125 per image, for a total of 7000 MCs over all the images.
The evaluation is based on a reliability analysis and is divided into two main parts. The first concerns an overall assessment of the obtained results. The second concerns the comparison of these results with those of standard combination operators (union, intersection, etc.).
Evaluation Situations
The evaluation considers the three situations illustrated in Fig. 12: the true positive (TP), false positive (FP), and false negative (FN) situations:
The true positive situation refers to the MCs correctly identified as candidate objects by the proposed approach
The false positive situation refers to normal breast tissues considered as candidate MCs by the proposed approach
The false negative situation refers to the true MCs that are missed by the proposed approach
The main goal of the proposed approach is to decrease the number of false negative detections while increasing the number of true positives.
Performance Evaluation
Overall Assessment
The true positive rates (Eq. (17)) resulting from applying the proposed collaborative process are presented in Fig. 13. The obtained measures indicate the reliability of the proposed approach in identifying the MCs. Indeed, it identifies more than 90% (respectively 70%) of the MCs for 58% (respectively 80%) of the analyzed images. Recall that the average number of MCs per image is around 125. On the other hand, the MC detection rate is lower than 50% for less than 4% of the images.
17 |
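For reference, the per-image true positive rate and the share of images exceeding a given rate can be computed as in the sketch below. It assumes the usual definition of the true positive rate, TP / (TP + FN), which may differ in detail from the expression used in this study; the function names are hypothetical.

```python
def true_positive_rate(tp, fn):
    """Usual true positive rate TP / (TP + FN); returns 0.0 when there is no MC."""
    total = tp + fn
    return tp / total if total else 0.0


def share_of_images_above(rates, threshold):
    """Fraction of images whose true positive rate is strictly above threshold."""
    return sum(rate > threshold for rate in rates) / len(rates)
```

For example, with per-image rates `[0.95, 0.75, 0.4, 0.92]`, half of the images exceed a rate of 0.9.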
Performance Evaluation Compared to the Used Detectors
In order to evaluate the robustness of the proposed approach, it is important to assess its reliability compared to the detectors used. Table 1 displays the average TP detection rates obtained for each of the four detectors. It shows that the highest value (i.e., 85%) is obtained by the proposed approach.
Table 1.
Table 2 presents the percentage analysis of the improved TP rates by the proposed approach compared to each of the used detectors.
Table 2.
The results in Table 2 indicate that more than half of the initial results of each detector are improved by the proposed collaborative process. They also show that the obtained TP rates are improved in 44% of the cases compared to those obtained by the best detector, that is, the one returning the most relevant results for a given image. Such results also highlight the importance of adopting a collaborative fusion process instead of a single detector to detect MCs. Indeed, even though the third detector shows an average TP rate comparable to that of the proposed approach, its TP rates are improved in 52% of the cases.
Performance Evaluation Compared to Other Fusion Operators
In this study, the standard union (U) and intersection (I) operators are used for comparison. This choice is justified in particular by the features of the detectors as well as by the diversity of mammographic images. Indeed, no prior information is available about the global accuracy or reliability of the detector decisions, which depend on many factors such as the breast density and the diversity of MC features. Moreover, the reliability of the detectors used depends on how their reasoning relates to the MC appearances. For instance, the third detector describes the MCs as a set of adjacent “outlier” pixels in an SP1 superpixel. Although it presents a competitive average TP rate, it shows some weaknesses with regard to MCs with various contrasts [29].
Figure 14 presents the boxplots of the TP rates resulting from the intersection operator applied to combine the initial decisions of the detectors. It is clear that the proposed approach has successfully reduced the initial disagreements between the detectors and has improved the reliability of the intersection operator. However, a first reading of the same figure together with Table 3 suggests that the union operator presents higher TP rates than the proposed approach. Its average value is about 92% compared to 85% for the proposed approach.
Table 3.
| | Union operator | Intersection operator | Proposed approach |
|---|---|---|---|
| Average of the TP rate | 92.05% | 42.2% | 85% |
In contrast, according to the quantitative analysis displayed in Table 4, the conjunctive operator we applied is a real competitor to, and in 48% of the cases even more reliable than, the disjunctive operator applied to the initial detector results.
Table 4.
| | Union operator | Intersection operator | Best detector |
|---|---|---|---|
| Improvement of the TP rate | 48% | 96% | 44% |
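The two standard fusion operators and the kind of per-image comparison summarized in Table 4 can be sketched as follows. This is an illustrative sketch only: detections are assumed to be sets of matched object identifiers, and counting an image as improved when the TP rate of the proposed approach is strictly higher than that of the baseline is an assumption about how the percentages are obtained. All function names are hypothetical.

```python
def union_fusion(detections):
    """Disjunctive fusion: keep every object detected by at least one detector."""
    fused = set()
    for objs in detections:
        fused |= set(objs)
    return fused


def intersection_fusion(detections):
    """Conjunctive fusion: keep only the objects detected by all detectors."""
    sets = [set(objs) for objs in detections]
    return set.intersection(*sets) if sets else set()


def improvement_share(proposed_rates, baseline_rates):
    """Fraction of images where the proposed TP rate strictly exceeds the baseline."""
    pairs = list(zip(proposed_rates, baseline_rates))
    return sum(p > b for p, b in pairs) / len(pairs)
```

The union keeps every candidate and therefore tends to maximize the TP rate at the cost of more false positives, whereas the intersection is only reliable when the detectors largely agree, which is the situation the collaborative process aims to create.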
In addition to being competitive with the union operator, the proposed approach reduces the false positive rates by an average of 32% over all the images. Such a reduction helps discard false positive objects with characteristics similar to those of MCs and, thus, improves the reliability of a possible classification step that would discriminate the real MCs from the false positive ones. These results also indicate that the proposed approach achieved its main objective, namely reducing the FN detections and reaching a TP rate competitive with that of the union operator.
Conclusion
In this paper, we have addressed the problem of a reliable detection of MCs from mammographic images while minimizing the false negative detections. In this context, we have proposed a collaborative process that makes different MC detection methods collaborate to refine their initial decisions and produce a new common decision able to reduce the disagreements they present and therefore the false negative detections. These refinements are applied for each superpixel and follow local and contextual analyses based on some prior knowledge about the links between superpixels and MCs. These analyses also iteratively estimate local reliability terms per superpixel. The final result is simplified to a conjunctive combination of the new detector decisions.
The results obtained on the INBreast database demonstrate the ability of the proposed collaborative process to improve the MC detection rates compared to the individual detectors as well as to some standard fusion operators.
As future work, we plan to increase the size of the evaluation database in order to compare the proposed approach with a supervised MC detection approach. Moreover, the main goal of this study was to reduce the number of FN detections compared to a single detector. Nevertheless, the execution time has to be considered alongside the important advantage of reducing FNs. Reducing the FP detections is also of interest, since they affect the reliability of the obtained results and hence the experts’ diagnosis. The next steps are, then, to reduce the execution time so that it becomes acceptable for radiologists, and to propose a classification scheme to distinguish the true MCs from the FP detections retained as potential MCs.
Contributor Information
Asma Touil, Email: asmaa.touil@gmail.com.
Karim Kalti, Email: karim.kalti@gmail.com.
Pierre-Henri Conze, Email: pierre-henri.conze@imt-atlantique.fr.
Basel Solaiman, Email: basel.solaiman@imt-atlantique.fr.
References
- 1. Hu K, Yang W, Gao X (2017) Microcalcification diagnosis in digital mammography using extreme learning machine based on hidden markov tree model of dual-tree complex wavelet transform. Expert Systems with Applications
- 2. Albiol A, Corbi A, Albiol F. Automatic intensity windowing of mammographic images based on a perceptual metric. Medical physics. 2017;44(4):1369–1378. doi: 10.1002/mp.12144.
- 3. Wang J, Yang X, Cai H, Tan W, Jin C, Li L. Discrimination of breast cancer with microcalcifications on mammography by deep learning. Scientific reports. 2016;6:27327. doi: 10.1038/srep27327.
- 4. BVignesh W, Sundaram M (2015) Effect of contourlet transform in detect of microcalcification in noisy environement. IEEE Sponsored 9th International Conference on Intelligent Systems and Control (ISCO)2015, At COIMBATORE
- 5. Guo Y, Dong M, Yang Z, Gao X, Wang K, Luo C, Ma Y, Zhang J. A new method of detecting micro-calcification clusters in mammograms using contourlet transform and non-linking simplified pcnn. Computer methods and programs in biomedicine. 2016;130:31–45. doi: 10.1016/j.cmpb.2016.02.019.
- 6. Mordang JJ, Gubern-Mérida A, den Heeten G, Karssemeijer N. Reducing false positives of microcalcification detection systems by removal of breast arterial calcifications. Medical physics. 2016;43(4):1676–1687. doi: 10.1118/1.4943376.
- 7. Bria A, Marrocco C, Galdran A, Campilho A, Marchesi A, Mordang JJ, Karssemeijer N, Molinara M, Tortorella F (2017) Spatial enhancement by dehazing for detection of microcalcifications with convolutional nets. In: International Conference on Image Analysis and Processing, Springer, pp 288–298
- 8. Diaz-Huerta CC, Felipe-Riveron EM, Montaño-Zetina LM. Quantitative analysis of morphological techniques for automatic classification of micro-calcifications in digitized mammograms. Expert Systems with Applications. 2014;41(16):7361–7369. doi: 10.1016/j.eswa.2014.05.051.
- 9. Malek AA, Rahman WEZWA, Ibrahim A, Mahmud R, Yasiran SS, Jumaat AK. Region and boundary segmentation of microcalcifications using seed-based region growing and mathematical morphology. Procedia-Social and Behavioral Sciences. 2010;8:634–639. doi: 10.1016/j.sbspro.2010.12.088.
- 10. Ciecholewski M (2016) Microcalcification segmentation from mammograms: A morphological approach. Journal of Digital Imaging, pp 1–13
- 11. Touil A, Kalti K, Conze PH, Solaiman B, Mahjoub MA (2020) Automatic detection of microcalcification based on morphological operations and structural similarity indices. Biocybernetics and Biomedical Engineering
- 12. Duarte MA, Alvarenga AV, Azevedo CM, Calas MJG, Infantosi AF, Pereira WC. Evaluating geodesic active contours in microcalcifications segmentation on mammograms. Computer Methods and Programs in Biomedicine. 2015;122(3):304–315. doi: 10.1016/j.cmpb.2015.08.016.
- 13. Touil A, Kalti K, Solaiman B, Mahjoub MA (2018) Microcalcifications detection from mammographie images based on region growing and variational energy convergence. In: 4th International Conference on Advanced Technologies for Signal and Image Processing, ATSIP 2018, Sousse, Tunisia, March 21-24, 2018, pp 1–6
- 14. Kalra PK, Kumar N, et al. A novel automatic microcalcification detection technique using tsallis entropy & a type ii fuzzy index. Computers & Mathematics with Applications. 2010;60(8):2426–2432. doi: 10.1016/j.camwa.2010.08.038.
- 15. Quintanilla-Domínguez J, Ojeda-Magaña B, Marcano-Cedeño A, Barrón-Adame J, Vega-Corona A, Andina D. Automatic detection of microcalcifications in roi images based on pfcm and ann. International Journal of Intelligent Computing in Medical Sciences & Image Processing. 2013;5(2):161–174. doi: 10.1080/1931308X.2013.838070.
- 16. Suhail Z, Sarwar M, Murtaza K. Automatic detection of abnormalities in mammograms. BMC medical imaging. 2015;15(1):53. doi: 10.1186/s12880-015-0094-8.
- 17. Veni G, Regentova E, Zhang L (2008) Detection of clustered microcalcifications with susan edge detector, adaptive contrast thresholding and spatial filters. In: Image Analysis and Recognition, Springer, pp 837–843
- 18. Fanizzi A, Basile TM, Losurdo L, Bellotti R, Bottigli U, Dentamaro R, Didonna V, Fausto A, Massafra R, Moschetta M, et al. A machine learning approach on multiscale texture analysis for breast microcalcification diagnosis. BMC bioinformatics. 2020;21(2):1–11. doi: 10.1186/s12859-020-3358-4.
- 19. Wei L, Yang Y, Nishikawa RM, Jiang Y. A study on several machine-learning methods for classification of malignant and benign clustered microcalcifications. IEEE transactions on medical imaging. 2005;24(3):371–380. doi: 10.1109/TMI.2004.842457.
- 20. Cai H, Huang Q, Rong W, Song Y, Li J, Wang J, Chen J, Li L (2019) Breast microcalcification diagnosis using deep convolutional neural network from digital mammograms. Computational and mathematical methods in medicine 2019
- 21. Mordang JJ, Gubern-Mérida A, Bria A, Tortorella F, Heeten G, Karssemeijer N. Improving computer-aided detection assistance in breast cancer screening by removal of obviously false-positive findings. Medical Physics. 2017;44(4):1390–1401. doi: 10.1002/mp.12152.
- 22. Valvano G, Della Latta D, Martini N, Santini G, Gori A, Iacconi C, Ripoli A, Landini L, Chiappino D (2017) Evaluation of a deep convolutional neural network method for the segmentation of breast microcalcifications in mammography imaging. In: EMBEC & NBC 2017, Springer, pp 438–441
- 23. Wahab N, Khan A, Lee YS. Two-phase deep convolutional neural network for reducing class skewness in histopathological images based breast cancer detection. Computers in biology and medicine. 2017;85:86–97. doi: 10.1016/j.compbiomed.2017.04.012.
- 24. Hernández PLA, Estrada TT, Pizarro AL, Cisternas MLD (2016) Breast calcifications: description and classification according to bi-rads 5th edition. Revista Chilena de Radiología 22(2):80–91
- 25. Wilkinson L, Thomas V, Sharma N. Microcalcification on mammography: approaches to interpretation and biopsy. The British journal of radiology. 2017;90(1069):20160594. doi: 10.1259/bjr.20160594.
- 26. Achanta R, Shaji A, Smith K, Lucchi A, Fua P, Süsstrunk S (2010) Slic superpixels. Tech Rep
- 27. Digabel H, Lantuéjoul C (1978) Iterative algorithms. In: Proc. 2nd European Symp. Quantitative Analysis of Microstructures in Material Science, Biology and Medicine, Stuttgart, West Germany: Riederer Verlag, vol 19, p 8
- 28. Touil A, Kalti K, Conze PH, Solaiman B, Mahjoub MA (2020) A new conditional region growing approach for an accurate detection of microcalcifications from mammographic images
- 29. Meléndez EL, Urcid G (2016) Mammograms calcifications segmentation based on band-pass fourier filtering and adaptive statistical thresholding. European International Journal of Science and Technology
- 30. Wang Z, Bovik AC, Sheikh HR, Simoncelli EP, et al. Image quality assessment: from error visibility to structural similarity. IEEE transactions on image processing. 2004;13(4):600–612. doi: 10.1109/TIP.2003.819861.
- 31. Adams R, Bischof L. Seeded region growing. IEEE Transactions on pattern analysis and machine intelligence. 1994;16(6):641–647. doi: 10.1109/34.295913.
- 32. Pal NR, Pal K, Keller JM, Bezdek JC. A possibilistic fuzzy c-means clustering algorithm. IEEE Transactions on Fuzzy Systems. 2005;13(4):517–530. doi: 10.1109/TFUZZ.2004.840099.
- 33. Bezdek JC, Ehrlich R, Full W. Fcm: The fuzzy c-means clustering algorithm. Computers & Geosciences. 1984;10(2–3):191–203. doi: 10.1016/0098-3004(84)90020-7.
- 34. Krishnapuram R, Keller JM. A possibilistic approach to clustering. IEEE transactions on fuzzy systems. 1993;1(2):98–110. doi: 10.1109/91.227387.
- 35. Quintanilla-Domínguez J, Ojeda-Magaña B, Marcano-Cedeño A, Cortina-Januchs MG, Vega-Corona A, Andina D. Improvement for detection of microcalcifications through clustering algorithms and artificial neural networks. EURASIP J Adv Sig Proc. 2011;2011:91. doi: 10.1186/1687-6180-2011-91.
- 36. Seo S (2006) A review and comparison of methods for detecting outliers in univariate data sets. PhD thesis, University of Pittsburgh
- 37. Alsheh Ali M, Eriksson M, Czene K, Hall P, Humphreys K. Detection of potential microcalcification clusters using multivendor for-presentation digital mammograms for short-term breast cancer risk estimation. Medical physics. 2019;46(4):1938–1946. doi: 10.1002/mp.13450.
- 38. Moreira IC, Amaral I, Domingues I, Cardoso A, Cardoso MJ, Cardoso JS. Inbreast: toward a full-field digital mammographic database. Academic radiology. 2012;19(2):236–248. doi: 10.1016/j.acra.2011.09.014.