Osteoporosis Detection Using Machine and Deep Learning Techniques
Osteoporosis Detection Using Machine and Deep Learning Techniques
Osteoporosis Detection Using Machine and Deep Learning Techniques
This paper describes the process of using a deep learning framework to train and test two convolutional neural network
models (VGG-16, Resnet…. GoogleNet) to classify ostheoporosis…. To improve the results, we created an ensemble
and also averaged over K nearest neighbors and …..
Introduction
Osteoporosis is a condition that influences the bones. Its name comes from
Latin for "porous bones." Osteoporosis can happen in individuals of all ages,
yet it's more normal in more seasoned grown-ups, particularly ladies. In
excess of 53 million individuals in the United States either have osteoporosis
or are at high risk be affected by it.
These constraints pave a need to find some alternate diagnostic tool for
osteoporosis.
Recently, the list of models that aim to capture the risk of fragility fracture
has been widely increased by the application of Artificial Intelligence tools,
in the hope of making the predictions ever more useful. The AI tools have the
potential to capture underlying trends and patterns, otherwise impossible
with previous modeling tools; they learn from data [10-12].
So, The input to our models are images of patients subject to osteoporosis.
Each image belongs to one of the two classes described in Section. We then
use different types of convolutional neural networks (CNN): VGG-16, Resnet
and GoogleNet to predict to which class the given images belong. The output
is a list of predicted class labels and the corresponding probabilities
(confidence) for all images. our aim/goal in this project is to detect
osteoporosis cases, by using various Machine Learning Models to classify the
provided images into different categories of osteoporosis.
………………………………………………………………………………………………
2. Related Work
Many solutions already exist because this problem was a public challenge.
Many top solutions used pre-trained CNN models and the most popular are
VGG-16 and ResNet, which were state-of-the-art one or two years ago and
had been improving . Besides these models, two of the top performing
solutions we consulted for our project also used several good ideas:
ensemble, K nearest neighbors (KNN) and data augmentation [4][5]. First,
because the test set size is about four times as large as the training set, it is
easy to overfit. Creating an ensemble of models can reduce variance and
alleviate this problem. Secondly, because the images are taken from a ………,
many images can be very similar or almost identical and they should be
classified to the same class. Applying KNN can yield more stable results. We
have seen from previous solutions that KNN was used for either test or the
training test (data augmentation) with different ideology and the results
were good in both cases.
……………………………………………………………………………………………………………
…………………………………………………………………………………………………………….
……………………………………………………………………………………………………………
……………………………………………………………………………………………………….
Dataset Visualization
Thus, the final training set has ……. images; the final validation set has …..
images and the final testing set has 4485 images.
3.2 Workflow
Detecting osteoporosis — Flow Chart
4. Feature Extraction
Following feature extraction techniques were used for extracting features
from the images:
4.3 SURF
The SURF method (Speeded Up Robust Features) is a fast and robust
algorithm for local, similarity invariant representation and comparison of
images. The main interest of the SURF approach lies in its fast computation
of operators using box filters, thus enabling real-time applications such as
tracking and object recognition.
4.4 Color Histogram
A color histogram is a representation of the distribution of colors in an
image. For digital images, a color histogram represents the number of pixels
that have colors in each of a fixed list of color ranges, that span the image’s
color space.
4.5 KAZE
KAZE is a 2D feature detection and description method that operates
completely in nonlinear scale space.
4.6 Normalization:
In this technique of data normalization, a linear transformation is performed
on the original data. Minimum and maximum value from data is fetched and
each value is replaced according to the following formula:
where x is the original value and x’ is the normalized value.
4.7.1 PCA
Principal component analysis (PCA) is a technique for reducing the
dimensionality of datasets, increasing interpretability but at the same time
minimizing information loss.
4.7.2 LDA
Linear discriminant analysis (LDA) is a generalization of Fisher’s linear
discriminant, a method used in statistics and other fields, to find a linear
combination of features that characterizes or separates two or more classes
of objects or events.
5. Models
Following Models were used for classification of our dataset:
5.1.1Decision Tree
Decision Tree is an important non-parametric method for image mining,
This method uses several simple decision rules of the image to provide a path
for image classification.
5.1.3 KNN
The KNN algorithm assumes that similar things exist in close proximity. In
other words, similar things are near to each other KNN captures the idea of
similarity (sometimes called distance, proximity, or closeness)
5.2.1 XGBoost
XGBoost is a decision-tree-based ensemble Machine Learning algorithm that
uses a gradient boosting framework. In prediction problems involving
unstructured data (images, text, etc.) artificial neural networks tend to
outperform all other algorithms or frameworks.
5.2.2 Bagging
Bootstrap aggregating also called bagging (from bootstrap aggregating), is a
machine learning ensemble meta-algorithm designed to improve the
stability and accuracy of machine learning algorithms used in statistical
classification and regression. It also reduces variance and helps to avoid
overfitting.
5.2.3 ADABoost
AdaBoost algorithm, short for Adaptive Boosting, is a Boosting technique
that is used as an Ensemble Method in Machine Learning. It is called
Adaptive Boosting as the weights are re-assigned to each instance, with
higher weights to incorrectly classified instances.
5.3.1 CNN
Convolutional Neural Network (CNN) has proven to be very effective in areas
such as image recognition and classification. The advantage of using CNN is
that it can automatically learn a complex feature utilizing massive simple
neurons and backpropagation.
5.3.2.1 Strategy-1
Retrain only the last classifier layer of the pre-trained model and freeze all
other parameters.
5.3.2.2 Strategy-2
Retrain the last few layers of the model including the classifier layer.
CNN
6.6 CAM
Class activation maps (CAMs) is a way to highlight the regions within an
image that a CNN uses to make a classification decision for that particular
image.
Class Activation Map for ResNet-101 — S2
7. Conclusion
PCA gave better results than LDA and LDA over PCA. Combination of
features from various feature extractions gave better accuracy than taking
individual features. Adaboost performed poorly for classification on this
dataset as boosting techniques usually perform better on high variance
datasets which is contrary to our data set which has low variance. After
plotting the ROC curves for various traditional models, the best AUC-ROC
score was found for SVM. ResNet-101 S2 performed much better than
ResNet-101 S1 which is also higher than the accuracy obtained using CNN.
Data Augmentation and Data Extraction techniques helped the classifier to
perform better.