Article

Cubical Homology-Based Machine Learning: An Application in Image Classification

Department of Applied Computer Science, University of Winnipeg, Winnipeg, MB R3B 2E9, Canada
*
Author to whom correspondence should be addressed.
This study is part of Seungho Choe’s MSc Thesis of Cubical homology-based Image Classification—A Comparative Study, defended at the University of Winnipeg in 2021.
Axioms 2022, 11(3), 112; https://doi.org/10.3390/axioms11030112
Submission received: 10 January 2022 / Revised: 15 February 2022 / Accepted: 24 February 2022 / Published: 3 March 2022
(This article belongs to the Special Issue Various Deep Learning Algorithms in Computational Intelligence)

Abstract

Persistent homology is a powerful tool in topological data analysis (TDA) to efficiently compute, study, and encode multi-scale topological features, and it is increasingly used in digital image classification. The topological features represent the numbers of connected components, cycles, and voids that describe the shape of data. Persistent homology extracts the birth and death of these topological features through a filtration process. The lifespan of these features can be represented using persistence diagrams (topological signatures). Cubical homology is a more efficient method for extracting topological features from a 2D image: it uses a collection of cubes to compute the homology, which fits the grid structure of digital images. In this research, we propose a cubical homology-based algorithm for extracting topological features from 2D images to generate their topological signatures. Additionally, we propose a novel score measure, which measures the significance of each of the sub-simplices in terms of persistence. In addition, gray-level co-occurrence matrix (GLCM) and contrast limited adaptive histogram equalization (CLAHE) are used as supplementary methods for extracting features. Supervised machine learning models are trained on selected image datasets to study the efficacy of the extracted topological features. Among the eight tested models with six published image datasets of varying pixel sizes, classes, and distributions, our experiments demonstrate that cubical homology-based machine learning with the deep residual network (ResNet 1D) and Light Gradient Boosting Machine (lightGBM) shows promise with the extracted topological features.

1. Introduction

The origin of topological data analysis (TDA) and persistent homology can be traced back to H. Edelsbrunner, D. Letscher, and A. Zomorodian [1]. More recently, TDA has emerged as a growing field in applied algebraic topology to infer relevant features for complex data [2]. One of the fundamental methods in computational topology is persistent homology [3,4], which is a powerful tool to efficiently compute, study, and encode multi-scale topological features of nested families of simplicial complexes and topological spaces [5]. Simplices are building blocks used to study the shape of data and a simplicial complex is its higher-level counterpart. The process of shape construction is commonly referred to as filtration [6]. There are many forms of filtrations and a good survey is presented in [7]. Persistent homology extracts the birth and death of topological features throughout a filtration built from a dataset [8]. In other words, persistent homology is a concise summary representation of topological features in data and is represented in a persistence diagram or barcode. This representation is important because it tracks changes and makes it possible to analyze data at multiple scales. However, the data structure associated with topological features is a multi-set, which makes learning harder. Persistence diagrams are therefore mapped into metric spaces with additional structure useful for machine learning tasks [9]. The application of TDA in machine learning (also known as the TDA pipeline) in several fields is well documented [2]. The TDA pipeline takes data (e.g., images, signals) as input and applies filtration operations to obtain persistence diagrams. Subsequently, ML classifiers such as support vector machines, tree classifiers, and neural networks are applied to the persistence diagrams.
The TDA pipeline is an emerging research area to discover descriptors useful for image and graph classification learning and in science, for example, quantifying differences between force networks [10] and analyzing polyatomic structures [11]. In [12], microvascular patterns in endoscopy images are categorized as regular or irregular, with three types of regular microvasculature surface: oval, tubular, and villous. In that work, topological features were derived from persistence diagrams, and the q-th norm of the p-th diagram is computed as
$$N_q = \Big[ \sum_{A \in \mathrm{Dgm}_p(f)} \mathrm{pers}(A)^q \Big]^{1/q},$$
where $\mathrm{Dgm}_p(f)$ denotes the p-th diagram of f and $\mathrm{pers}(A)$ is the persistence of a point A in $\mathrm{Dgm}_p(f)$. Since $N_q$ is a norm of the p-th Betti number with restriction (or threshold) s, it will obtain the p-th Betti number of $M_s$, where M is the rectangle covered by pixels. Then, M is mapped to $\mathbb{R}$ by a signed distance function. A naive Bayesian learning method that combines the results of several Adaboost classifiers is then used to classify the images. The authors in [13] introduce a multi-scale kernel for persistence diagrams that is based on scale space theory [14]. The focus is on the stability of persistent homology, since small changes in the input affect both the 1-Wasserstein distance and the persistence diagrams. Experiments on two benchmark datasets for 3D shape classification/retrieval and texture recognition are discussed.
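As a small illustration of the q-th norm described above, the following sketch computes it for a diagram given as a list of (birth, death) pairs. The function name and the sample diagram are hypothetical and are not taken from [12].

```python
def persistence_norm(diagram, q):
    """q-th norm of a persistence diagram given as (birth, death) pairs."""
    if q <= 0:
        raise ValueError("q must be positive")
    # pers(A) = death - birth for each point A of the diagram
    total = sum((death - birth) ** q for birth, death in diagram)
    return total ** (1.0 / q)


# Hypothetical diagram with three points; the last one has zero persistence.
diagram = [(0, 3), (1, 5), (2, 2)]
print(persistence_norm(diagram, 1))  # total persistence
print(persistence_norm(diagram, 2))
```

For q = 1 the norm reduces to the total persistence of the diagram; larger q gives more weight to long-lived features.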
Vector summaries of persistence diagrams is a technique that transforms a persistence diagram into vectors and summarizes a function by its minimum through a pooling technique. The authors in [15] present a novel pooling within the bag-of-words approach that shows a significant improvement in shape classification and recognition problems with the Non-Rigid 3D Human Models SHREC 2014 dataset.
The topological and geometric structures underlying data are often represented as point clouds. In [16], the RGB intensity values of each pixel of an image are mapped to the point cloud $P \subset \mathbb{R}^5$ and then a feature vector is derived. Computing and arranging the persistence of point cloud data by descending order makes it possible to understand the persistence of features. The extracted topological features and the traditional image processing features are used in both vector-based supervised classification and deep network-based classification experiments on the CIFAR-10 image dataset. More recently, multi-class classification of point cloud datasets was discussed in [17]. In [8], a random forest classifier was used to classify the well-known MNIST image dataset using the voxel structure to obtain topological features.
In [18], persistence diagrams were used with neural network classifiers in graph classification problems. In TDA, Betti numbers count the topological features of each dimension, such as connected components, cycles, and so on; formally, each Betti number is the rank of the corresponding homology group. In [19], the similarity of the brain networks of twins is measured using Betti numbers. In [20,21], persistence barcodes were used to visualize brain activation patterns in resting-state functional magnetic resonance imaging (rs-fMRI) video frames. The authors used a geometric Betti number that counts the total number of connected cycles forming a vortex (nested, usually non-concentric, connected cycles) derived from the triangulation of brain activation regions.
The success of deep learning [22] in computer vision problems has led to its use in deep networks that can handle barcodes [23]. Hofer et al. used a persistence diagram as a topological signature, computed a parametrized projection from the persistence diagram, and then leveraged it during the training of the network. The output of this process is stable with respect to the 1-Wasserstein distance. Classification of 2D object shapes and social network graphs was successfully demonstrated by the authors. In [24], the authors apply topological data analysis to the classification of time series data. A 1D convolutional neural network is used, where the input data are a Betti sequence. Since machine learning models rely on accurate feature representations, multi-scale representations of features are becoming increasingly important in applications involving computer vision and image analysis. Persistent homology is able to bridge the gap between geometry and topology, and persistent homology-based machine learning models have been used in various areas, including image classification and analysis [25].
However, it has been shown that implementations of persistent homology (of simplicial complexes) are inefficient for computer vision, since they require excessive computational resources [26] due to formulations based on triangulations. To mitigate this complexity problem, cubical homology was introduced, which can be applied directly to the grid structure of the data [27,28]. Simply put, cubical homology uses a collection of cubes to compute the homology, which fits the grid structure of digital images. Since there is neither skeletonization nor triangulation in the computation of cubical homology, it has advantages in the fast segmentation of images for extracting features. This property of cubical homology is the motivation for this work in exploring the extraction of topological features from 2D images using this method.
The focus in this work is twofold: (i) the extraction of topological features from 2D images with varying pixel sizes, classes, and distributions using cubical homology; and (ii) the study of the effect of the extracted 1D topological features in a supervised learning context using well-known machine learning models trained on selected image datasets. The work presented in this paper is based on the thesis by Choe [29]. Figure 1 illustrates our proposed approach. Steps 2.1 and 2.2 form the core of this paper, namely the generation of 1D topological signatures using a novel score that is proposed by this study. This score allows us to filter out low-persistence features (or noise). Our contributions are as follows: (i) we propose a cubical homology-based algorithm for extracting topological features from 2D images to generate their topological signatures; (ii) we propose a score, which is used as a measure of the significance of the subcomplex calculated from the persistence diagram, and we additionally use the gray-level co-occurrence matrix (GLCM) and contrast limited adaptive histogram equalization (CLAHE) to obtain additional image features, in an effort to improve the classification performance; and (iii) we discuss the results of our supervised learning experiments with eight well-known machine learning models trained on six different published image datasets using the extracted topological features.
The paper is organized as follows. Section 2 gives basic definitions for simplicial, cubical, and persistent homology used in this work. Section 3.1 introduces the benchmark image datasets used in this work. Section 3.2 illustrates the feature engineering process, including the extraction of topological and other subsidiary features. Section 4 gives the results and analysis of using eight machine learning models trained on each dataset. Finally, Section 5 gives the conclusions of this study.

2. Basic Definitions

We recall some basic definitions of the concepts used in this work. A simplicial complex is a space or an object that is built from a union of points, edges, triangles, tetrahedra, and higher-dimensional polytopes. Homology theory is in the domain of algebraic topology related to the connectivity in multi-dimensional shapes [26].

2.1. Simplicial Homology

Graphs are mathematical structures used to study pairwise relationships between objects and entities.
Definition 1.
A graph is a pair of sets, G = ( V , E ) , where V is the set of vertices (or nodes) and E is a set of edges.
Let S be a subset of a group G. Then, the subgroup generated by S, denoted $\langle S \rangle$, is the subgroup of all elements of G that can be expressed as a finite combination of elements in S and their inverses under the group operation. For example, every integer in $\mathbb{Z}$ can be obtained from the element 1 and its inverse under addition, so $\mathbb{Z}$ is the subgroup generated by $\{1\}$.
Definition 2.
The rank of a group G is the size of the smallest subset that generates G.
For instance, since $\mathbb{Z}$ is the subgroup generated by $\{1\}$, rank($\mathbb{Z}$) = 1.
Definition 3.
A simplicial complex on a set V is a family of subsets of V of arbitrary cardinality that is closed under the subset operation, which means that, if a set S is in the family, all subsets of S are also in the family. An element of the family is called a simplex or face.
Definition 4.
Moreover, a p-simplex can be defined as the convex hull of $p + 1$ affinely independent points $x_0, x_1, \ldots, x_p \in \mathbb{R}^d$.
For example, in a graph, 0-simplex is a point, 1-simplex is an edge, 2-simplex is a triangle, 3-simplex is a tetrahedron, and so on (see Figure 2).

Chain, Boundary, and Cycle

To extend simplicial homology to persistent homology, the notion of chain, boundary, and cycle is necessary [31].
Definition 5.
A p-chain is a subset of p-simplices in a simplicial complex K. Assume that K is a triangle. Then, a 1-chain is a subset of 1-simplices—in other words, a subset of the three edges.
Definition 6.
A boundary, generally denoted ∂, of a p-simplex is the set of $(p-1)$-simplices that are its faces.
For example, a triangle is a 2-simplex, so the boundary of a triangle is a set of 1-simplices which are the edges. Therefore, the boundary of the triangle is the three edges.
Definition 7.
A cycle can be defined using the definitions of chain and boundary. A p-cycle c is a p-chain with an empty boundary. More simply, it is a path where the starting point and destination point are the same.
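The definitions of chain, boundary, and cycle can be checked on the triangle example with a few lines of code. This is only a sketch: simplices are represented as sorted vertex tuples and chains as collections of simplices, which corresponds to mod-2 coefficients (an assumption made here for simplicity, in line with Definition 5). The function names are hypothetical.

```python
from itertools import combinations


def boundary(simplex):
    """(p-1)-dimensional faces of a p-simplex given as a tuple of vertices."""
    verts = tuple(sorted(simplex))
    return list(combinations(verts, len(verts) - 1))


def chain_boundary(chain):
    """Boundary of a chain with mod-2 coefficients.

    A face survives only if it appears an odd number of times.
    """
    result = set()
    for simplex in chain:
        for face in boundary(simplex):
            result ^= {face}  # symmetric difference = addition mod 2
    return result


triangle = (0, 1, 2)
edges = boundary(triangle)                   # the three edges of the triangle
print(edges)
print(chain_boundary(edges))                 # empty: the three edges form a 1-cycle
print(chain_boundary([(0, 1), (1, 2)]))      # two endpoints: an open path is not a cycle
```

The closed path around the triangle has an empty boundary, whereas the open path from vertex 0 to vertex 2 has its two endpoints as boundary.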

2.2. Cubical Homology

Cubical homology [27] is efficient since it allows the direct use of the cubical structure of the image, whereas simplicial theory requires increasing the complexity of data. While the simplicial homology is built with the triangle and its higher-dimensional structure, such as a tetrahedron, cubical homology consists of cubes. In cubical homology, each cube has a unit size and the n-cube represents its dimension. For example, 0-cubes are points, 1-cubes are lines with unit length, 2-cubes are unit squares, and so on [27,32,33].
Definition 8.
Here, 0-cubes can be defined as degenerate intervals,
$$[m] = [m, m], \quad m \in \mathbb{Z},$$
which generate subsets $I \subset \mathbb{R}$, such that
$$I = [m, m + 1], \quad m \in \mathbb{Z}.$$
Therefore, I is called a 1-cube, or elementary interval.
Definition 9.
An n-cube can be expressed as a product of elementary intervals as
$$Q = I_1 \times I_2 \times \cdots \times I_n \subset \mathbb{R}^n,$$
where Q denotes the n-cube and each $I_i$ ($i = 1, 2, \ldots, n$) is an elementary interval.
A d-dimensional image is a map $\mathcal{I} : I \subseteq \mathbb{Z}^d \to \mathbb{R}$.
Definition 10.
A pixel can be defined as an element $v \in I$, where $d = 2$. If $d > 2$, v is called a voxel.
Definition 11.
Let $\mathcal{I}(v)$ be the intensity or grayscale value. Moreover, in the case of binary images, we consider a map $\mathcal{B} : I \subseteq \mathbb{Z}^d \to \{0, 1\}$.
A voxel is represented by a d-cube and, with all of its faces added, we have
$$\mathcal{I}(\sigma) := \min_{\sigma \text{ face of } \tau} \mathcal{I}(\tau).$$
Let K be the cubical complex built from the image $\mathcal{I}$, and let
$$K_i := \{ \sigma \in K \mid \mathcal{I}(\sigma) \le i \},$$
be the i-th sublevel set of K. Then, the set $\{K_i\}_{i \in \mathrm{Im}(\mathcal{I})}$ defines a filtration of the cubical complexes. Thus, the pipeline to filtration from an image with a cubical complex is as follows:
Image → Cubical complex → Sublevel sets → Filtration
Moreover, chain, boundary, and cycle in cubical homology can be defined in the same manner as in Section 2.1.
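The pipeline from image to filtration can be sketched on a tiny grayscale image. The following is a minimal illustration, not the implementation used in this work: each sublevel set is stored as the set of pixels whose intensity does not exceed the level, and connected components are counted with 8-connectivity. The latter is an assumption that follows from treating pixels as 2-cubes whose faces inherit the minimum value, so that pixels touching at a corner share a vertex. The image values and function names are hypothetical.

```python
def sublevel_set(image, level):
    """Pixels (row, col) whose intensity is <= level."""
    return {(r, c)
            for r, row in enumerate(image)
            for c, value in enumerate(row)
            if value <= level}


def count_components(pixels):
    """Number of 8-connected components in a set of pixels (flood fill)."""
    remaining = set(pixels)
    count = 0
    while remaining:
        stack = [remaining.pop()]
        count += 1
        while stack:
            r, c = stack.pop()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nb = (r + dr, c + dc)
                    if nb in remaining:
                        remaining.remove(nb)
                        stack.append(nb)
    return count


# Hypothetical 3 x 3 image with three dark pixels separated by bright ones.
image = [
    [0, 9, 1],
    [9, 9, 9],
    [2, 9, 9],
]
levels = sorted({value for row in image for value in row})
filtration = [sublevel_set(image, i) for i in levels]
betti0 = [count_components(k) for k in filtration]
print(levels)   # intensity levels that index the filtration
print(betti0)   # number of connected components at each level
```

The sublevel sets are nested, and the number of connected components first grows as isolated dark pixels appear and then drops to one when the bright pixels join them.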

2.3. Persistent Homology

In topology, a complex K has subcomplexes, and cubes are created (birth) and destroyed (death) by filtration. Assume that $K^i$ ($0 \le i$, $i \in \mathbb{Z}$) is a subcomplex of the filtered complex K such that
$$K^0 \subseteq K^1 \subseteq \cdots \subseteq K^n = K,$$
and $Z_k^i$, $B_k^i$ are its corresponding cycle group and boundary group.
Definition 12.
The kth homology group [1] can be defined as
$$H_k = Z_k / B_k.$$
Definition 13.
The p-persistent kth homology group of $K^i$ [1] can be defined as
$$H_k^{i,p} = Z_k^i / (B_k^{i+p} \cap Z_k^i).$$
Definition 14.
Persistence is the lifetime of these attributes based on the filtration method used [1].
One can plot the birth and death times of the topological features as a barcode, also known as a persistence barcode, shown in Figure 3. This diagram graphically represents the topological signature of the data. Illustration of persistence is useful when detecting a change in terms of topology and geometry, which plays a crucial role in supervised machine learning [34].
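To make birth and death concrete, the sketch below computes the 0-dimensional persistence pairs of a small image under the sublevel-set filtration using a union-find structure. It is an illustration under stated assumptions, not the Gudhi-based implementation used in this work: pixels are added in increasing order of intensity, neighbors are taken with 8-connectivity, and when two components merge the younger one dies (the usual elder rule). Pairs with zero persistence are omitted, and the function name and image are hypothetical.

```python
def zero_dim_persistence(image):
    """(birth, death) pairs of connected components in a sublevel-set filtration."""
    rows, cols = len(image), len(image[0])
    order = sorted((image[r][c], r, c) for r in range(rows) for c in range(cols))
    parent, birth = {}, {}

    def find(x):
        # Root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = []
    for value, r, c in order:
        p = (r, c)
        parent[p] = p
        birth[p] = value
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nb = (r + dr, c + dc)
                if nb == p or nb not in parent:
                    continue
                a, b = find(p), find(nb)
                if a == b:
                    continue
                # Elder rule: the component with the later birth dies now.
                old, young = (a, b) if birth[a] <= birth[b] else (b, a)
                if birth[young] < value:
                    pairs.append((birth[young], value))
                parent[young] = old
    # Components that never merge live forever.
    roots = {find(p) for p in parent}
    pairs.extend((birth[root], float("inf")) for root in roots)
    return sorted(pairs)


image = [
    [0, 9, 1],
    [9, 9, 9],
    [2, 9, 9],
]
print(zero_dim_persistence(image))
```

Each finite pair corresponds to one bar of the barcode in dimension 0; the bar with infinite death is the component that survives the whole filtration.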

3. Materials and Methods

3.1. Image Datasets

In this section, we give a brief description of the six published image datasets used in this work. Datasets used for benchmarking were collected from various sources that include Mendeley Data (https://data.mendeley.com/ accessed on 7 January 2022), Tensorflow dataset (https://www.tensorflow.org/datasets accessed on 7 January 2022), and Kaggle competition (https://www.kaggle.com/competitions accessed on 7 January 2022). The concrete crack images dataset [35] contains a total of 40,000 images, where each image consists of 227 × 227 pixels. These images were collected from the METU campus building and consist of two classes: 20,000 images where there are no cracks in the concrete (positive) and 20,000 images of concrete that is cracked (negative). A crack on an outer wall occurs with the passage of time or due to natural aging. It is important to detect these cracks in terms of evaluating and predicting the structural deterioration and reliability of buildings. Samples of the two types of images are shown in Figure 4.
The APTOS blindness detection dataset [36] is a set of retina images taken by fundus photography for detecting and preventing diabetic retinopathy from causing blindness (https://www.kaggle.com/c/aptos2019-blindness-detection/overview accessed on 7 January 2022). This dataset has 3662 images and consists of 1805 images with no diabetic retinopathy (labeled as 0) and 1857 images diagnosed with diabetic retinopathy, as shown in Figure 5. Figure 6 shows the distribution of examples in the four diabetic retinopathy classes using a severity range from 1 to 4 with the following interpretation: 1: Mild, 2: Moderate, 3: Severe, 4: Proliferative DR.
The pest classification in mango farms dataset [37] is a collection of 46,500 images of mango leaves affected by 15 different types of pests and one normal (unaffected) mango leaf, as shown in Figure 7. Some of these pests can be detected visually. Figure 8 shows the data distribution of examples in the 15 classes of pests and one normal class.
The Indian fruits dataset [38] contains 23,848 images that cover five popular fruits in India: apple, orange, mango, pomegranate, and tomato. This dataset includes variations of each fruit, resulting in 40 classes. This dataset was already separated into training and testing sets by the original publishers of the dataset, as shown in Figure 9. Note that this dataset has an imbalanced class distribution.
The colorectal histology dataset [39] contains 5000 histological images of different tissue types of colorectal cancer. It consists of 8 classes of tissue types with 625 images for each class, as shown in Figure 10.
The Fashion MNIST dataset [40] is a collection of 60,000 training images of fashion products, as shown in Figure 11. It consists of 28 × 28 grayscale images of products from 10 classes. Since the dataset contains an equal number of images for each class, there are 6000 training images in each class, resulting in a balanced dataset.
Table 1 gives the dataset characteristics in terms of the various image datasets used in this work. Moreover, we provide the preprocessing time per image. For example, the feature extraction time for the concrete dataset was 5 h 12 min.

3.2. Methods—Feature Engineering

In this section, we describe the feature engineering process. The main purpose of this process is to obtain a 1-dimensional array from each image in the dataset. Each point from the persistence diagram plays a significant role in the extraction of the topological features. Moreover, the gray-level co-occurrence matrix (GLCM) supports these topological features as additional signatures. Because the datasets differ in image size and some images have very high resolution, every image is resized to 200 × 200 and converted to grayscale, which guarantees a relatively constant extraction time (approximately 4 s per image) regardless of the original size.
Algorithm 1 gives the method for extracting topological features from a dataset. In this algorithm, $\beta_0$ and $\beta_1$ are Betti numbers derived from Equation (6), where the dimension of the $i$-th homology group is called the $i$-th Betti number of K. $\beta_0$ gives the number of connected components and $\beta_1$ gives the number of holes. Betti numbers thus count the topological features of each dimension.
Algorithm 1: Extraction of Topological Features.
Input: N ← number of images in the dataset
    for i = 1, 2, …, N do
        img ← load i-th image from dataset
        img ← resize img to (200, 200) and convert to grayscale
        PD0 ← set of points of β0 in persistence diagram of img with cubical complex
        PD1 ← set of points of β1 in persistence diagram of img with cubical complex
        PD0 ← sort PD0 in descending order of persistence
        PD1 ← sort PD1 in descending order of persistence
        d_i ← project each point in PD0 to [0, 1]
        d_i ← d_i + project each point in PD1 to [1, 2]
        fimg ← apply CLAHE filter to img
        fPD0 ← set of points of β0 in persistence diagram of fimg with cubical complex
        fPD1 ← set of points of β1 in persistence diagram of fimg with cubical complex
        fPD0 ← sort fPD0 in descending order of persistence
        fPD1 ← sort fPD1 in descending order of persistence
        d_i ← d_i + project each point in fPD0 to [0, 1]
        d_i ← d_i + project each point in fPD1 to [1, 2]
        d_i ← d_i + convert img to GLCM with distances (1, 2, 3), directions (0°, 45°, 90°, 135°), and properties (energy, homogeneity)
Output: D ← (d_1, d_2, …, d_N)
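The vector-assembly steps of Algorithm 1 (sort by persistence, project each point to a scalar, concatenate) can be sketched as follows. This is a hedged illustration rather than the implementation used in this work: the persistence diagrams are assumed to be already computed and passed in as lists of (birth, death) pairs, the fixed length k and the zero padding for diagrams with fewer than k points are assumptions, and the two projection functions are simple stand-ins, not the score proposed in this study. Only one image variant is shown; the CLAHE-filtered image would be handled in the same way.

```python
def top_k_by_persistence(points, k, project):
    """Sort (birth, death) points by persistence, project them, fix the length to k."""
    ranked = sorted(points, key=lambda bd: bd[1] - bd[0], reverse=True)
    values = [project(birth, death) for birth, death in ranked[:k]]
    return values + [0.0] * (k - len(values))  # zero padding (assumption)


def build_feature_vector(pd0, pd1, glcm_features, k, project0, project1):
    """Concatenate dimension-0 features, dimension-1 features, and GLCM features."""
    return (top_k_by_persistence(pd0, k, project0)
            + top_k_by_persistence(pd1, k, project1)
            + list(glcm_features))


# Stand-in projections into [0, 1] and [1, 2] (hypothetical, for illustration only).
def project_dim0(birth, death):
    return (death - birth) / 255


def project_dim1(birth, death):
    return 1 + (death - birth) / 255


vec = build_feature_vector(
    pd0=[(10, 20), (0, 255)],   # hypothetical dimension-0 points
    pd1=[(5, 100)],             # hypothetical dimension-1 points
    glcm_features=[0.5, 0.9],   # hypothetical texture features
    k=3,
    project0=project_dim0,
    project1=project_dim1,
)
print(vec)
```

Fixing the length of each block gives every image a feature vector of the same size, which is what tabular classifiers and a 1D network require.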

3.3. Projection of Persistence Diagrams

The construction of a persistence diagram is possible once the filtration (using cubical complex) is completed. The dth persistence diagram, $D_d$, contains all of the d-dimensional topological information. It is a series of points, each a pair $(\mathit{birth}, \mathit{death})$, where $\mathit{birth}$ indicates the time at which a topological feature is created and $\mathit{death}$ gives the time at which the feature is destroyed. From here, persistence is defined using $\mathit{birth}$ and $\mathit{death}$ as
$$\mathrm{pers}(\mathit{birth}, \mathit{death}) := \mathit{death} - \mathit{birth}, \quad \text{where } (\mathit{birth}, \mathit{death}) \in D_d.$$
Low-persistence features are treated as having low importance, or ‘noise’, whereas high-persistence features are regarded as true features [1]. However, using persistence alone to project a topological feature to a 1-dimensional value is not helpful, because it is impossible to distinguish features that have the same persistence but different values of $\mathit{birth}$. Therefore, we propose a measure (score) to compensate for this limitation of persistence, shown in Equation (9).
$$\mathit{score}_d(\mathit{birth}, \mathit{death}) := \begin{cases} 0 & \text{if } \mathit{persistence} < \mathit{threshold} \\ d + \left( \dfrac{e^{\sin\left(\frac{\mathit{death}}{255} \cdot \frac{\pi}{2}\right)} - 1}{e - 1} \right)^{3} - \left( \dfrac{e^{\sin\left(\frac{\mathit{birth}}{255} \cdot \frac{\pi}{2}\right)} - 1}{e - 1} \right)^{3} & \text{if } \mathit{persistence} \geq \mathit{threshold} \end{cases}$$
Since the sinusoidal term is increasing and takes values in [0, 1] when its input is in [0, $\frac{\pi}{2}$], $\mathit{score}_d$ has a value range from d to d + 1. Hence, it is easy to distinguish the dimension and persistence of each feature. Moreover, a higher exponent emphasizes a feature that has longer persistence and largely ignores a feature that has shorter persistence. This is in keeping with ideas underlying homology groups, where the longer the persistence, the higher the significance of the homology. Conversely, a homology group that has short persistence is considered noise, which degrades the quality of the digital image and, as such, is less significant as a topological feature [10]. Using a threshold (as a parameter) allows us to separate useful features from such noise. The optimal threshold (value = 10) was determined experimentally by comparing the performance of machine learning models. In summary, the score takes into account not only the persistence, but also other aspects such as the dimension, birth, and death of topological features.
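A direct transcription of the score into code may help in checking its range. The sketch below assumes 8-bit birth and death values in [0, 255], an exponent of 3, and the experimentally chosen threshold of 10; it reads Equation (9) as the dimension plus the difference between the death term and the birth term. The helper name `_ramp` is hypothetical.

```python
import math


def _ramp(value, exponent=3):
    """((e^{sin(value/255 * pi/2)} - 1) / (e - 1)) ** exponent; maps [0, 255] to [0, 1]."""
    s = math.sin(value / 255.0 * math.pi / 2.0)
    return ((math.exp(s) - 1.0) / (math.e - 1.0)) ** exponent


def score(birth, death, dim, threshold=10):
    """Score of a persistence point of homology dimension `dim`."""
    if death - birth < threshold:
        return 0.0  # low-persistence features are treated as noise
    return dim + _ramp(death) - _ramp(birth)


print(score(0, 255, dim=0))    # longest possible bar in dimension 0
print(score(0, 255, dim=1))    # longest possible bar in dimension 1
print(score(100, 105, dim=1))  # below the threshold
print(score(0, 50, dim=0), score(100, 150, dim=0))  # same persistence, different birth
```

The last line illustrates the motivation given above: two features with the same persistence but different birth values receive different scores.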

3.4. Contrast Limited Adaptive Histogram Equalization (CLAHE)

When pixel values are concentrated in a narrow range, it is hard to perceive features visually. Histogram equalization makes the distribution of pixel values in the image balanced, thereby enhancing the image. However, this method often degrades the content of the image and amplifies the noise, and therefore produces undesirable results. Contrast limited adaptive histogram equalization (CLAHE) is a well-known method for compensating for the weakness of histogram equalization by dividing an image into small-sized blocks and performing histogram equalization for each block [41]. After completing histogram equalization in all blocks, bilinear interpolation makes the boundary of the tiles (blocks) smooth. In this paper, we used the following hyperparameters: clipLimit = 7 and tileGridSize = (8, 8). An illustration of the CLAHE method on the APTOS data is given in Figure 12.
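The contrast-limiting idea can be illustrated on the histogram of a single tile. The sketch below is a simplification and not the library routine used in this work: it clips each bin at an absolute count (an assumption; libraries may interpret the clip limit relative to the tile size), spreads the excess evenly over all bins, and builds the gray-level mapping from the cumulative histogram. Tiling and bilinear interpolation between tiles are omitted, and the function names are hypothetical.

```python
def clip_histogram(hist, clip_limit):
    """Clip each bin at clip_limit and spread the excess evenly over all bins."""
    excess = sum(max(h - clip_limit, 0) for h in hist)
    clipped = [min(h, clip_limit) for h in hist]
    share, remainder = divmod(excess, len(hist))
    return [h + share + (1 if i < remainder else 0) for i, h in enumerate(clipped)]


def equalization_map(hist):
    """Map each gray level to a new level using the cumulative histogram."""
    total = sum(hist)
    levels = len(hist)
    mapping, running = [], 0
    for h in hist:
        running += h
        mapping.append((levels - 1) * running // total)
    return mapping


# Hypothetical tile with 4 gray levels where all 12 pixels share level 0.
hist = [12, 0, 0, 0]
clipped = clip_histogram(hist, clip_limit=4)
print(equalization_map(hist))     # plain equalization
print(clipped)                    # histogram after clipping and redistribution
print(equalization_map(clipped))  # contrast-limited mapping
```

Plain equalization sends every gray level of this tile to the maximum, while the clipped histogram yields a gentler mapping, which is the mechanism that limits noise amplification.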
The texture of an image can be described by its statistical properties and this information is useful to classify images [42]. For extracting texture features, we used the well-known gray-level co-occurrence matrix (GLCM) [43]. GLCM extracts texture information regarding the structural arrangement of surfaces by using a displacement vector defined by its radius and orientation. We used three distances (1, 2, 3) and four directions (0°, 45°, 90°, 135°) to obtain the GLCM features. From each of the co-occurrence matrices, two global statistics were extracted, energy and homogeneity, resulting in 3 × 4 × 2 = 24 texture features for each image.
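The count of 24 texture features can be reproduced with a small pure-Python sketch. It is an illustration under assumptions, not the implementation used in this work: the co-occurrence matrix is non-symmetric and normalized to sum to one, the pixel offsets for the four directions follow a common (row, column) convention, energy is taken as the square root of the sum of squared entries, and homogeneity as the sum of entries weighted by 1 / (1 + (i − j)²), as in widely used image-processing libraries. The sample image is hypothetical.

```python
import math

# Unit (row, col) steps for each direction in degrees (assumed convention).
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(image, distance, angle, levels):
    """Normalized gray-level co-occurrence matrix for one distance and direction."""
    dr, dc = (distance * step for step in OFFSETS[angle])
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    if total == 0:
        return counts  # no valid pixel pair at this offset
    return [[v / total for v in row] for row in counts]


def energy(p):
    return math.sqrt(sum(v * v for row in p for v in row))


def homogeneity(p):
    return sum(v / (1 + (i - j) ** 2)
               for i, row in enumerate(p) for j, v in enumerate(row))


def glcm_features(image, levels, distances=(1, 2, 3), angles=(0, 45, 90, 135)):
    """Energy and homogeneity for every (distance, direction) pair."""
    features = []
    for d in distances:
        for a in angles:
            p = glcm(image, d, a, levels)
            features += [energy(p), homogeneity(p)]
    return features


# Hypothetical 4 x 4 image with 4 gray levels.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
print(len(glcm_features(image, levels=4)))  # 3 distances x 4 directions x 2 statistics
```

An image of uniform horizontal stripes would give a homogeneity of 1 in the 0° direction, since every horizontally adjacent pair has equal gray levels.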
Table 2 gives a list of extracted features from the APTOS dataset during the filtration process. In total, 144 features were extracted for each dimension ($fdim_0$ and $fdim_1$) from the CLAHE-filtered image in descending order of persistence. Similarly, 100 topological features were extracted for each dimension ($dim_0$ and $dim_1$). Note that $dim_0$ and $dim_1$ represent the $\beta_0$ and $\beta_1$ Betti numbers, respectively. A total of 24 GLCM features were extracted from the original gray-level image.
In this paper, feature engineering and learning algorithms were implemented with the following Python libraries: Gudhi [44,45] for calculating persistent homology, PyTorch [46] for modeling and execution of ResNet 1D, and scikit-learn [47] for implementation of other machine learning algorithms. Moreover, libraries such as NumPy [48] and pandas [49] were used for computing matrices and analyzing the data structure. All tests were conducted using a desktop workstation with Intel i7-9700K at 3.6 GHz, 8 CPU cores, 16 GB RAM, and Gigabyte GeForce RTX 2080 GPU. The following algorithms were used.
The deep residual network, suggested by [50], is an ensemble of VGG-19 [51], plain network, and residual network as a solution to the network depth-accuracy degradation problem. This is done by a residual learning framework, which is a feedforward network with a shortcut. Multi-scale 1D ResNet is used in this work, where multi-scale refers to flexible convolutional kernels rather than flexible strides [52]. The authors use different sizes of kernels so that the network can learn features from the original signals at multiple scales. The structure of the model is described in Figure 13. The 1D ResNet model consists of a number of subblocks of the basic CNN blocks. A basic CNN block computes batch normalization after convolution as follows: $y = W \otimes x + b$, $s = \mathrm{BN}(y)$, and $h = \mathrm{ReLU}(s)$, where ⊗ denotes the convolution operator and BN is a batch normalization operator. Moreover, stacking two basic CNN blocks forms a subblock of the basic CNN blocks as follows: $h_1 = \mathrm{Basic}(x)$, $h_2 = \mathrm{Basic}(h_1)$, $y = h_2 + x$, and $\hat{h} = \mathrm{ReLU}(y)$, where the Basic operator denotes the basic block described above. Using the above method, it is possible to construct multiple sub-blocks of the CNN with different kernel sizes. For training the network, we used an early stopping option that halts training if there is no improvement in the validation loss after 20 epochs. Using the early stopping option and a learning rate of 0.01, the network was trained for at most 100 epochs, since training typically stopped after around 50 epochs.
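The early stopping rule described above can be sketched independently of any deep learning library. The function below is a hypothetical helper that consumes a sequence of per-epoch validation losses instead of training a real network; the patience of 20 epochs and the cap of 100 epochs follow the text, while everything else is an assumption made for illustration.

```python
def train_with_early_stopping(val_losses, max_epochs=100, patience=20):
    """Return (epochs_run, best_epoch) for a sequence of per-epoch validation losses."""
    best_loss, best_epoch = float("inf"), 0
    epochs_run = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if epoch > max_epochs:
            break
        epochs_run = epoch
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` consecutive epochs
    return epochs_run, best_epoch


# Hypothetical loss curve: improves for 30 epochs, then plateaus.
losses = [1.0 / epoch for epoch in range(1, 31)] + [1.0] * 100
print(train_with_early_stopping(losses))
```

With this loss curve, training stops 20 epochs after the last improvement, well before the 100-epoch cap.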
For the other machine learning models, the random forest algorithm with 200 trees, gini as a criterion, and unlimited depth was used. For the K-nearest neighbors (kNN), the following parameters were used: k = 5 and Minkowski as the distance metric. While the random forest algorithm is an ensemble method based on the concept of bagging, GBM [53] uses the concept of boosting, iteratively training the model by adding new weak models consecutively with the negative gradient from the loss function. Both extreme gradient boosting (XGBoost) [54,55] and lightGBM [56] are advanced models of the gradient boosting machines. LightGBM combines two techniques: Gradient-based One-Side Sampling and Exclusive Feature Bundling [57]. Since our dataset is tabular, XGBoost, which is a more efficient implementation of GBM, was used. For the XGBoost implementation, the following training parameters were used: 1000 n_estimators for creating weak learners, learning rate = 0.3 (Eta), and max_depth = 6. For the LightGBM implementation, the following training parameters were used: 1000 n_estimators for creating weak learners, learning rate = 0.1 (Eta), and num_leaves = 31.
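For reference, the hyperparameters listed above can be collected in a single configuration dictionary. This is a sketch: the dictionary name and its keys for the models are hypothetical, and the keyword names are assumed to match the scikit-learn style constructors of the respective libraries (for example, `RandomForestClassifier(**MODEL_PARAMS["random_forest"])`); settings not mentioned in the text are left at their library defaults.

```python
# Hyperparameters as stated in the text; all other settings are library defaults.
MODEL_PARAMS = {
    # 200 trees, gini criterion, unlimited depth
    "random_forest": {"n_estimators": 200, "criterion": "gini", "max_depth": None},
    # k = 5 with the Minkowski distance
    "knn": {"n_neighbors": 5, "metric": "minkowski"},
    # 1000 weak learners, eta = 0.3, maximum tree depth 6
    "xgboost": {"n_estimators": 1000, "learning_rate": 0.3, "max_depth": 6},
    # 1000 weak learners, eta = 0.1, 31 leaves per tree
    "lightgbm": {"n_estimators": 1000, "learning_rate": 0.1, "num_leaves": 31},
}

for name, params in MODEL_PARAMS.items():
    print(name, params)
```

Keeping the settings in one place makes it easier to rerun the comparison with a different threshold or feature set while holding the classifiers fixed.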
For all the datasets (except the Indian fruits dataset), 80% of the dataset was used for training and 20% was used for testing. The Indian fruits dataset was already separated into 90% for training and 10% for testing. This ratio was used in our experiments.

4. Results and Discussion

Table 3 gives the accuracy, weighted F1 score, and run time information for each of the datasets. The accuracy score reported with the benchmark datasets is given in the last column. The best result is indicated in blue. Overall, ResNet 1D outperforms other ML models, while different types of gradient boosting machines show fairly good accuracy and weighted F1 scores. In terms of binary classification image datasets, as in the concrete dataset, most of the algorithms achieve 0.99 accuracy and F1 score. However, for the multi-class image datasets, the SVM and kNN perform poorly, mainly due to the inherent difficulty of finding the best parameters. All machine learning models perform significantly worse than the benchmark results with the Fashion MNIST and APTOS datasets with the extracted topological features. This is because it is hard to obtain good trainable topological signatures from images that have low resolution, even though Fashion MNIST was resized (please see Table 1).
In the case of the APTOS dataset, imbalanced training data were the main cause of the poor results. For example, Label 0 indicates the absence of diabetic retinopathy and has the highest number of images (see Figure 6). However, the presence of diabetic retinopathy is spread over four classes, of which Label 2 (severity level 2.0) has the largest number of cases. As a result, more than half of the examples were classified as Label 2 (see Figure 14c).
Imbalanced datasets such as Mangopest and Indian fruits were classified well because there were sufficient training examples. In summary, the best classification performance using cubical homology with the ResNet 1D classifier was obtained for 2 out of 6 datasets using our proposed feature extraction method and score measure. However, these topological signatures were not helpful in the classification of the Fashion MNIST and APTOS images. For the Indian fruits dataset, the model classifies with an accuracy of 1.00, which is comparable to the benchmark, since the improvement is only 0.001. Similarly, with the concrete dataset, the result is comparable, differing only slightly (≤0.005) from the benchmark result.
In Table 4, we give an illustration of the performance of classifiers using features derived from cubical homology (TDA), GLCM, and combined TDA-GLCM for two datasets. It is clear from the results that the combined feature set using topological features and GLCM features results in better classifier accuracy.
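One plausible way to form the combined TDA-GLCM feature set is to concatenate the per-image feature blocks into a single flat vector. The sketch below assumes this, with block sizes inferred from the first and last column names shown in Table 2 (`glcm1` to `glcm24`, `dim0_0` to `dim0_99`, `dim1_0` to `dim1_99`, `fdim0_0` to `fdim0_143`, `fdim1_0` to `fdim1_143`). Both the sizes and the helper `combine_features` are assumptions for illustration, not a description of the authors' pipeline.

```python
# Block sizes inferred from the column names in Table 2 (an assumption).
BLOCK_SIZES = {"glcm": 24, "dim0": 100, "dim1": 100, "fdim0": 144, "fdim1": 144}


def combine_features(blocks, sizes=BLOCK_SIZES):
    """Concatenate per-image feature blocks into one flat vector, in a fixed order."""
    vector = []
    for name, size in sizes.items():
        values = list(blocks[name])
        if len(values) != size:
            raise ValueError(f"{name}: expected {size} values, got {len(values)}")
        vector.extend(values)
    return vector


# Placeholder zero vectors stand in for the real features of a single image.
example = {name: [0.0] * size for name, size in BLOCK_SIZES.items()}
combined = combine_features(example)

# The TDA-only variant simply drops the GLCM block.
tda_sizes = {k: s for k, s in BLOCK_SIZES.items() if k != "glcm"}
tda_only = combine_features(example, sizes=tda_sizes)
print(len(combined), len(tda_only))
```

Checking block lengths before concatenation guards against silently misaligned columns when feature sets are mixed.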
Confusion matrices for experiments with the 1D ResNet model are given in Figure 14, Figure 15 and Figure 16. It is noteworthy that, for these datasets, the application of cubical homology has led to meaningful results in 4 out of 6 datasets.

5. Conclusions

The focus of this paper was on feature extraction from 2D image datasets using a specific topological method (cubical homology) and a novel score measure. These features were then used as input to well-known classification algorithms to study the efficacy of the proposed feature extraction process. We proposed a novel scoring method to transform the 2D input images into a one-dimensional array to vectorize the topological features. In this study, six published datasets were used as benchmarks. ResNet 1D, LightGBM, XGBoost, and five well-known machine learning models were trained on these datasets. Our experiments demonstrated that, in three out of six datasets, our proposed topological feature method is comparable to (or better than) the benchmark results in terms of accuracy. However, with two datasets, the performance of our proposed topological feature method is poor, due to either low resolution or an imbalanced dataset. We also demonstrate that topological features combined with GLCM features result in better classification accuracy in two of the datasets. This study reveals that the application of cubical homology to image classification shows promise. Since the conversion of input images to 2D data is very time-consuming, future work will involve (i) seeking more efficient ways to reduce the time for pre-processing and (ii) experimentation with more varied datasets. The problem of poor accuracy with imbalanced datasets needs further exploration.

Author Contributions

S.C.: Conceptualization, Investigation, Formal Analysis, Supervision, Resources, Original Draft. S.R.: Methodology, Validation, Software, Editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by NSERC Discovery Grant #194376 and University of Winnipeg Major Research Grant #14977.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Edelsbrunner, H.; Letscher, D.; Zomorodian, A. Topological persistence and simplification. In Proceedings of the 41st Annual Symposium on Foundations of Computer Science, Redondo Beach, CA, USA, 12–14 November 2000; IEEE Computer Society Press: Los Alamitos, CA, USA, 2000; pp. 454–463.
  2. Chazal, F.; Michel, B. An Introduction to Topological Data Analysis: Fundamental and Practical aspects for Data Scientists. arXiv 2017, arXiv:1710.04019.
  3. Zomorodian, A.; Carlsson, G. Computing persistent homology. Discr. Comput. Geom. 2005, 33, 249–274.
  4. Carlsson, G. Topology and data. Bull. Am. Math. Soc. 2009, 46, 255–308.
  5. Edelsbrunner, H.; Harer, J. Persistent homology. A survey. Contemp. Math. 2008, 453, 257–282.
  6. Zomorodian, A.F. Computing and Comprehending Topology: Persistence and Hierarchical Morse Complexes. Ph.D. Thesis, University of Illinois at Urbana-Champaign, Urbana, IL, USA, 2001.
  7. Aktas, M.E.; Akbas, E.; Fatmaoui, A.E. Persistence Homology of Networks: Methods and Applications. arXiv 2019, arXiv:math.AT/1907.08708.
  8. Garin, A.; Tauzin, G. A topological “reading” lesson: Classification of MNIST using TDA. In Proceedings of the 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA), Boca Raton, FL, USA, 16–19 December 2019; pp. 1551–1556.
  9. Adams, H.; Chepushtanova, S.; Emerson, T.; Hanson, E.; Kirby, M.; Motta, F.; Neville, R.; Peterson, C.; Shipman, P.; Ziegelmeier, L. Persistence Images: A Stable Vector Representation of Persistent Homology. arXiv 2016, arXiv:cs.CG/1507.06217.
  10. Kramár, M.; Goullet, A.; Kondic, L.; Mischaikow, K. Persistence of force networks in compressed granular media. Phys. Rev. E 2013, 87, 042207.
  11. Nakamura, T.; Hiraoka, Y.; Hirata, A.; Escolar, E.G.; Nishiura, Y. Persistent homology and many-body atomic structure for medium-range order in the glass. Nanotechnology 2015, 26, 304001.
  12. Dunaeva, O.; Edelsbrunner, H.; Lukyanov, A.; Machin, M.; Malkova, D.; Kuvaev, R.; Kashin, S. The classification of endoscopy images with persistent homology. Pattern Recognit. Lett. 2016, 83, 13–22.
  13. Reininghaus, J.; Huber, S.; Bauer, U.; Kwitt, R. A stable multi-scale kernel for topological machine learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 4741–4748.
  14. Iijima, T. Basic theory on the normalization of pattern (in case of typical one-dimensional pattern). Bull. Electro-Tech. Lab. 1962, 26, 368–388.
  15. Bonis, T.; Ovsjanikov, M.; Oudot, S.; Chazal, F. Persistence-based pooling for shape pose recognition. In Proceedings of the International Workshop on Computational Topology in Image Context, Marseille, France, 15–17 June 2016; pp. 19–29.
  16. Dey, T.; Mandal, S.; Varcho, W. Improved image classification using topological persistence. In Proceedings of the Conference on Vision, Modeling and Visualization, Bonn, Germany, 25–27 September 2017; pp. 161–168.
  17. Kindelan, R.; Frías, J.; Cerda, M.; Hitschfeld, N. Classification based on Topological Data Analysis. arXiv 2021, arXiv:cs.LG/2102.03709.
  18. Carrière, M.; Chazal, F.; Ike, Y.; Lacombe, T.; Royer, M.; Umeda, Y. Perslay: A neural network layer for persistence diagrams and new graph topological signatures. In Proceedings of the International Conference on Artificial Intelligence and Statistics (PMLR), Online, 26–28 August 2020; pp. 2786–2796.
  19. Chung, M.K.; Lee, H.; DiChristofano, A.; Ombao, H.; Solo, V. Exact topological inference of the resting-state brain networks in twins. Netw. Neurosci. 2019, 3, 674–694.
  20. Don, A.P.H.; Peters, J.F.; Ramanna, S.; Tozzi, A. Topological View of Flows Inside the BOLD Spontaneous Activity of the Human Brain. Front. Comput. Neurosci. 2020, 14, 34.
  21. Don, A.P.; Peters, J.F.; Ramanna, S.; Tozzi, A. Quaternionic views of rs-fMRI hierarchical brain activation regions. Discovery of multilevel brain activation region intensities in rs-fMRI video frames. Chaos Solitons Fractals 2021, 152, 111351.
  22. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. Imagenet classification with deep convolutional neural networks. Adv. Neural Inf. Process. Syst. 2012, 25, 1097–1105.
  23. Hofer, C.D.; Kwitt, R.; Niethammer, M. Learning Representations of Persistence Barcodes. J. Mach. Learn. Res. 2019, 20, 1–45.
  24. Umeda, Y. Time series classification via topological data analysis. Inf. Media Technol. 2017, 12, 228–239.
  25. Pun, C.S.; Xia, K.; Lee, S.X. Persistent-Homology-Based Machine Learning and its Applications—A Survey. arXiv 2018, arXiv:math.AT/1811.00252.
  26. Allili, M.; Mischaikow, K.; Tannenbaum, A. Cubical homology and the topological classification of 2D and 3D imagery. In Proceedings of the 2001 International Conference on Image Processing (Cat. No. 01CH37205), Thessaloniki, Greece, 7–10 October 2001; Volume 2, pp. 173–176.
  27. Kot, P. Homology calculation of cubical complexes in Rn. Comput. Methods Sci. Technol. 2006, 12, 115–121.
  28. Strömbom, D. Persistent Homology in the Cubical Setting: Theory, Implementations and Applications. Master’s Thesis, Lulea University of Technology, Lulea, Sweden, 2007.
  29. Choe, S. Cubical homology-based Image Classification-A Comparative Study. Master’s Thesis, University of Winnipeg, Winnipeg, MB, Canada, 2021.
  30. Fisher, M.; Springborn, B.; Schröder, P.; Bobenko, A.I. An algorithm for the construction of intrinsic Delaunay triangulations with applications to digital geometry processing. Computing 2007, 81, 199–213.
  31. Otter, N.; Porter, M.A.; Tillmann, U.; Grindrod, P.; Harrington, H.A. A roadmap for the computation of persistent homology. EPJ Data Sci. 2017, 6, 1–38.
  32. Kaczynski, T.; Mischaikow, K.M.; Mrozek, M. Computational Homology; Springer: Berlin/Heidelberg, Germany, 2004.
  33. Kalies, W.D.; Mischaikow, K.; Watson, G. Cubical approximation and computation of homology. Banach Cent. Publ. 1999, 47, 115–131.
  34. Marchese, A. Data Analysis Methods Using Persistence Diagrams. Ph.D. Thesis, University of Tennessee, Knoxville, TN, USA, 2017.
  35. Özgenel, Ç.F.; Sorguç, A.G. Performance comparison of pretrained convolutional neural networks on crack detection in buildings. In Proceedings of the International Symposium on Automation and Robotics in Construction (ISARC), Berlin, Germany, 20–25 July 2018; Volume 35, pp. 1–8.
  36. Avilés-Rodríguez, G.J.; Nieto-Hipólito, J.I.; Cosío-León, M.d.l.Á.; Romo-Cárdenas, G.S.; Sánchez-López, J.d.D.; Radilla-Chávez, P.; Vázquez-Briseño, M. Topological Data Analysis for Eye Fundus Image Quality Assessment. Diagnostics 2021, 11, 1322.
  37. Kusrini, K.; Suputa, S.; Setyanto, A.; Agastya, I.M.A.; Priantoro, H.; Chandramouli, K.; Izquierdo, E. Data augmentation for automated pest classification in Mango farms. Comput. Electron. Agric. 2020, 179, 105842.
  38. Behera, S.K.; Rath, A.K.; Sethy, P.K. Fruit Recognition using Support Vector Machine based on Deep Features. Karbala Int. J. Mod. Sci. 2020, 6, 16.
  39. Kather, J.N.; Weis, C.A.; Bianconi, F.; Melchers, S.M.; Schad, L.R.; Gaiser, T.; Marx, A.; Zöllner, F.G. Multi-class texture analysis in colorectal cancer histology. Sci. Rep. 2016, 6, 27988.
  40. Xiao, H.; Rasul, K.; Vollgraf, R. Fashion-MNIST: A Novel Image Dataset for Benchmarking Machine Learning Algorithms. arXiv 2017, arXiv:1708.07747.
  41. Pizer, S.M.; Amburn, E.P.; Austin, J.D.; Cromartie, R.; Geselowitz, A.; Greer, T.; ter Haar Romeny, B.; Zimmerman, J.B.; Zuiderveld, K. Adaptive histogram equalization and its variations. Comput. Vision Graph. Image Process. 1987, 39, 355–368.
  42. Gadkari, D. Image Quality Analysis Using GLCM. Master’s Thesis, University of Central Florida, Orlando, FL, USA, 2004.
  43. Mohanaiah, P.; Sathyanarayana, P.; GuruKumar, L. Image texture feature extraction using GLCM approach. Int. J. Sci. Res. Publ. 2013, 3, 1–5.
  44. The GUDHI Project. GUDHI User and Reference Manual, 3.4.1 ed.; GUDHI Editorial Board; GUDHI: Nice, France, 2021.
  45. Dlotko, P. Cubical complex. In GUDHI User and Reference Manual, 3.4.1 ed.; GUDHI Editorial Board; GUDHI: Nice, France, 2021.
  46. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32; Wallach, H., Larochelle, H., Beygelzimer, A., d’Alché-Buc, F., Fox, E., Garnett, R., Eds.; Curran Associates, Inc.: North Adams, MA, USA, 2019; pp. 8024–8035.
  47. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
  48. Harris, C.R.; Millman, K.J.; van der Walt, S.J.; Gommers, R.; Virtanen, P.; Cournapeau, D.; Wieser, E.; Taylor, J.; Berg, S.; Smith, N.J.; et al. Array programming with NumPy. Nature 2020, 585, 357–362.
  49. McKinney, W. Data structures for statistical computing in python. In Proceedings of the 9th Python in Science Conference, Austin, TX, USA, 28 June–3 July 2010; Volume 445, pp. 51–56.
  50. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
  51. Mateen, M.; Wen, J.; Song, S.; Huang, Z. Fundus image classification using VGG-19 architecture with PCA and SVD. Symmetry 2019, 11, 1.
  52. Liu, R.; Wang, F.; Yang, B.; Qin, S.J. Multiscale kernel based residual convolutional neural network for motor fault diagnosis under nonstationary conditions. IEEE Trans. Ind. Inform. 2019, 16, 3797–3806.
  53. Friedman, J.H. Greedy function approximation: A gradient boosting machine. Ann. Stat. 2001, 29, 1189–1232.
  54. Chen, T.; He, T.; Benesty, M.; Khotilovich, V.; Tang, Y.; Cho, H. Xgboost: Extreme gradient boosting. R Package Version 0.4-2 2015, 1, 1–4.
  55. Chen, T.; Guestrin, C. Xgboost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
  56. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.Y. Lightgbm: A highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. 2017, 30, 3146–3154.
  57. Patel, V.; Choe, S.; Halabi, T. Predicting Future Malware Attacks on Cloud Systems using Machine Learning. In Proceedings of the 2020 IEEE 6th International Conference on Big Data Security on Cloud (BigDataSecurity), High Performance and Smart Computing, (HPSC) and Intelligent Data and Security (IDS), Baltimore, MD, USA, 25–27 May 2020; pp. 151–156.
  58. Kayed, M.; Anter, A.; Mohamed, H. Classification of garments from fashion MNIST dataset using CNN LeNet-5 architecture. In Proceedings of the 2020 International Conference on Innovative Trends in Communication and Computer Engineering (ITCE), Aswan, Egypt, 8–9 February 2020; pp. 238–243.
  59. Tymchenko, B.; Marchenko, P.; Spodarets, D. Deep learning approach to diabetic retinopathy detection. arXiv 2020, arXiv:2003.02261.
Figure 1. Classification pipeline.
Figure 2. Examples of p-simplices for p = 0, 1, 2, 3 in a tetrahedron. A 0-simplex is a point, a 1-simplex is an edge with a convex hull of two points, a 2-simplex is a triangle with a convex hull of three distinct points, and a 3-simplex is a tetrahedron with a convex hull of four points [30].
Figure 3. An example of persistent homology for a grayscale image. (a) A given image, (b) the matrix of gray levels of the given image, (c) the filtered cubical complex of the image, (d) the persistence barcode according to (c).
Figure 4. Sample images of the concrete crack dataset.
Figure 5. Sample images of the APTOS dataset. (a) Non-diabetic retinopathy, which is the ordinary case; (b) diabetic retinopathy, which can cause blindness.
Figure 6. Data distribution for APTOS 2019 blindness detection dataset. This dataset can be classified as non-diabetic and diabetic. Around 50% of images are categorized as non-diabetic retinopathy (label 0) and diabetic retinopathy is subdivided according to the severity range from 1 to 4.
Figure 7. Sample images of pest classification in mango farms.
Figure 8. Data distribution for pest classification in mango farms dataset. Labels are assigned in alphabetical order.
Figure 9. Data distribution for the Indian fruits dataset. These data are already split by the original publishers of the dataset by the ratio of 9 to 1. Labels are assigned in alphabetical order.
Figure 10. Example of colorectal cancer histology. (a) Tumor epithelium, (b) simple stroma, (c) complex stroma, (d) immune cell conglomerates, (e) debris and mucus, (f) mucosal glands, (g) adipose tissue, (h) background.
Figure 11. Example of the Fashion MNIST dataset.
Figure 12. Comparison of the original image and the CLAHE-filtered image. (a) Original image, (b) persistence diagram of the original image (a), (c) CLAHE-filtered image, (d) persistence diagram of the filtered image (c).
Figure 13. Structure of the multi-scale 1D ResNet.
Figure 14. Confusion matrices of implementation with ResNet 1D. (a) Concrete, (b) Fashion MNIST, (c) APTOS, (d) Colorectal Histology.
Figure 15. Confusion matrix for the Indian fruits dataset with ResNet 1D implementation.
Figure 15. Confusion matrix for the Indian fruits dataset with ResNet 1D implementation.
Axioms 11 00112 g015
Figure 16. Confusion matrix for the Mangopest dataset with ResNet 1D implementation.
Table 1. Dataset details with preprocessing times.
| Dataset | Size | Num of Classes | Pixel Dimension | Balanced | Time in Sec/Image |
|---|---|---|---|---|---|
| Concrete ¹ | 40,000 | 2 | 227 × 227 | Yes | 0.4713 |
| Mangopest ² | 46,000 | 16 | from 500 × 333 to 1280 × 853 | No | 0.5394 |
| Indian fruits ³ | 23,848 | 40 | 100 × 100 | No | 0.4422 |
| Fashion MNIST ⁴ | 60,000 | 10 | 28 × 28 | Yes | 0.4297 |
| APTOS ⁵ | 3662 | 5 | 227 × 227 | No | 0.5393 |
| Colorectal histology ⁶ | 5000 | 8 | 150 × 150 | Yes | 0.3218 |

¹ Çağlar Fırat Özgenel, 23 July 2019, https://data.mendeley.com/datasets/5y9wdsg2zt/2
² Kusrini Kusrini et al., accessed on 27 February 2020, https://data.mendeley.com/datasets/94jf97jzc8/1
³ Prabira Kumar Sethy, accessed on 12 June 2020, https://data.mendeley.com/datasets/bg3js4z2xt/1
⁴ Han Xiao et al., accessed on 28 August 2017, https://github.com/zalandoresearch/fashion-mnist
⁵ Asia Pacific Tele-Ophthalmology Society, accessed on 27 June 2019, https://www.kaggle.com/c/aptos2019-blindness-detection/overview
⁶ Kather, Jakob Nikolas et al., 26 May 2016, https://zenodo.org/record/53169#.XGZemKwzbmG
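The per-image times in Table 1 can be turned into a rough estimate of total preprocessing time per dataset. The sketch below assumes the reported figure is an average per image and that images are processed one after another, so the total is simply size × time; the names `DATASETS` and `total_hours` are illustrative.

```python
# (number of images, preprocessing seconds per image) copied from Table 1.
DATASETS = {
    "Concrete": (40_000, 0.4713),
    "Mangopest": (46_000, 0.5394),
    "Indian fruits": (23_848, 0.4422),
    "Fashion MNIST": (60_000, 0.4297),
    "APTOS": (3_662, 0.5393),
    "Colorectal histology": (5_000, 0.3218),
}


def total_hours(n_images, seconds_per_image):
    """Estimated total preprocessing time in hours (sequential processing)."""
    return n_images * seconds_per_image / 3600


for name, (n_images, sec) in DATASETS.items():
    print(f"{name}: {total_hours(n_images, sec):.2f} h")
```

Under these assumptions the concrete dataset alone would need a little over five hours of preprocessing, which illustrates why the cost of feature extraction matters for larger datasets.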
Table 2. Results of feature engineering process applied to the APTOS dataset.
| img | label | glcm1 | glcm2 | glcm24 | dim0_0 | dim0_1 | dim0_99 | dim1_0 | dim1_1 | dim1_99 | fdim0_0 | fdim0_1 | fdim0_143 | fdim1_0 | fdim1_1 | fdim1_143 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2 | 0.1603 | 0.1571 | 0.4639 | 1 | 0.0366 | 0 | 1.2054 | 1.1815 | 0 | 0.9999 | 0.2060 | 0.0319 | 1.7698 | 1.6339 | 1.1067 |
| 3661 | 2 | 0.1196 | 0.1160 | 0.5387 | 0.9999 | 0.0020 | 0 | 1.4787 | 1.0636 | 0 | 0.9999 | 0.1295 | 0.0042 | 1.9493 | 1.3658 | 1.10478 |
Table 3. Accuracy and weighted F1 score for each dataset.
| Dataset | Metric | ResNet 1D | CART | GBM | LightGBM | Random Forest | SVM | XGBoost | kNN | Related Works (Benchmark) |
|---|---|---|---|---|---|---|---|---|---|---|
| Concrete | Accuracy | 0.994 | 0.989 | 0.991 | 0.9945 | 0.993 | 0.956 | 0.9935 | 0.890 | 0.999 with CNN [35] |
| | Weighted F1 | 0.994 | 0.988 | 0.989 | 0.994 | 0.992 | 0.955 | 0.993 | 0.884 | |
| | Run time | 465.87 | 9.08 | 252.15 | 11.25 | 7.63 | 214.05 | 59.15 | 1.93 | |
| Mangopest | Accuracy | 0.931 | 0.764 | 0.681 | 0.898 | 0.869 | 0.474 | 0.889 | 0.666 | 0.76 with CNN [37] |
| | Weighted F1 | 0.931 | 0.764 | 0.676 | 0.898 | 0.869 | 0.439 | 0.889 | 0.663 | |
| | Run time | 760.94 | 17.17 | 5562.09 | 260.62 | 13.94 | 662.45 | 2041.22 | 2.33 | |
| Indian fruits | Accuracy | 1.000 | 0.9608 | 0.9608 | 0.9608 | 0.9608 | 0.7313 | 0.9608 | 0.676 | 0.999 SVM with deep features [38] |
| | Weighted F1 | 1.000 | 0.9608 | 0.9608 | 0.9608 | 0.9608 | 0.7236 | 0.9608 | 0.656 | |
| | Run time | 271.21 | 4.44 | 4265.09 | 82.55 | 4.13 | 72.73 | 451.65 | 1.18 | |
| Fashion MNIST | Accuracy | 0.7427 | 0.567 | 0.696 | 0.749 | 0.693 | 0.535 | 0.746 | 0.397 | 0.99 with CNN [58] |
| | Weighted F1 | 0.7414 | 0.569 | 0.694 | 0.749 | 0.692 | 0.529 | 0.746 | 0.390 | |
| | Run time | 467.12 | 8.36 | 1808.07 | 89.37 | 7.66 | 935.21 | 1108.20 | 3.38 | |
| APTOS | Accuracy | 0.7326 | 0.698 | 0.760 | 0.787 | 0.782 | 0.674 | 0.775 | 0.655 | 0.971 with CNN [59] |
| | Weighted F1 | 0.667 | 0.695 | 0.737 | 0.771 | 0.757 | 0.591 | 0.764 | 0.637 | |
| | Run time | 61.81 | 0.63 | 86.02 | 13.16 | 0.70 | 3.49 | 42.34 | 0.08 | |
| Colorectal histology | Accuracy | 0.892 | 0.75 | 0.842 | 0.869 | 0.855 | 0.679 | 0.874 | 0.759 | 0.874 with SVM [39] |
| | Weighted F1 | 0.89 | 0.727 | 0.832 | 0.850 | 0.834 | 0.686 | 0.843 | 0.743 | |
| | Run time | 86.23 | 1.18 | 255.08 | 12.52 | 1.10 | 4.06 | 44.63 | 0.14 | |
| | Accuracy | 0.882 ± 0.109 | 0.789 ± 0.147 | 0.822 ± 0.121 | 0.876 ± 0.087 | 0.856 ± 0.102 | 0.675 ± 0.154 | 0.873 ± 0.090 | 0.674 ± 0.148 | |
| | Weighted F1 | 0.871 ± 0.125 | 0.784 ± 0.148 | 0.815 ± 0.124 | 0.870 ± 0.091 | 0.851 ± 0.105 | 0.654 ± 0.164 | 0.866 ± 0.092 | 0.662 ± 0.147 | |
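The last two rows of Table 3 carry no dataset label; the values are consistent with a mean ± standard deviation taken over the six datasets. The short check below tests that reading for the ResNet 1D accuracy column. Treating the spread as a population standard deviation (dividing by n rather than n − 1) is an assumption that happens to match the reported figure.

```python
from statistics import mean, pstdev

# ResNet 1D accuracy per dataset, copied from Table 3.
resnet_accuracy = {
    "Concrete": 0.994,
    "Mangopest": 0.931,
    "Indian fruits": 1.000,
    "Fashion MNIST": 0.7427,
    "APTOS": 0.7326,
    "Colorectal histology": 0.892,
}

values = list(resnet_accuracy.values())
avg = mean(values)       # arithmetic mean over the six datasets
spread = pstdev(values)  # population standard deviation (divides by n)
print(f"{avg:.3f} ± {spread:.3f}")
```

The same calculation can be repeated for the other columns to compare models by both average accuracy and variability across datasets.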
Table 4. Comparison of performance of GLCM+TDA, TDA feature-only, and GLCM-only implemented by 1D ResNet model on two datasets.
| Feature set | Colorectal Histology Dataset | APTOS Dataset |
|---|---|---|
| GLCM+TDA | 0.892 | 0.7326 |
| TDA | 0.7697 | 0.6739 |
| GLCM | 0.694 | 0.7252 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Choe, S.; Ramanna, S. Cubical Homology-Based Machine Learning: An Application in Image Classification. Axioms 2022, 11, 112. https://doi.org/10.3390/axioms11030112
