QUALITY ISSUES IN
NANOMANUFACTURING
THADDEUS BERGER1 , MOSTAFA GILANIFAR2 , TANMOY DAS2 , GRANT KLEINER2 , DR. ABHISHEK
SHRIVASTAVA2 *
1 FLORIDA INSTITUTE OF TECHNOLOGY
2 FLORIDA STATE UNIVERSITY
HIGH-PERFORMANCE MATERIALS INSTITUTE
FLORIDA A&M UNIVERSITY – FLORIDA STATE UNIVERSITY, COLLEGE OF ENGINEERING
2525 POTTSDAMER STREET, TALLAHASSEE, FLORIDA 32310
ABSTRACT
Nanoparticles have potential in a variety of applications, including cancer diagnosis and treatment,
structural health monitoring and β€œsmart” buildings, and improved solar cells. Nanoparticle
fabrication, however, is currently not standardized and not viable on an industrial scale. The scale-
up of nanomanufacturing requires application of quality engineering tools for optimizing process
yield, variance reduction, and process monitoring and control. This requires methods for estimating
nanoparticle dimensions and spatial arrangement, as these significantly influence nanomaterial
thermal, physical, optical, and electromagnetic properties. The objective of this research is to use
supervised learning algorithms and machine learning to develop a system to automatically detect
nanoparticles and estimate size and spatial distribution.
This paper will first go into some detail on the applications of nanotechnology, the necessity of an
industrial-scale manufacturing procedure, and how supervised learning ties into achieving the goal
of commercialization. Then, background will be given on supervised learning before discussing
classification. After outlining the flow of the project, detail will be given on our chosen classification
and feature extraction techniques as well as clustering methods used for multi-class classification.
1: INTRODUCTION
Improving the scalability of nanomaterial
production has many commercial applications.
Composites with nanomaterials are becoming
increasingly common in research, and
applications for nanomaterials exist in many
commercial venues including cancer cell
targeting and treatment [1] [2], structural
health monitoring [3] [4], and more effective
solar panels [5]. The high surface area-to-
volume ratios and ideal thermal, physical,
optical, and electromagnetic properties of
nanoparticles [6] make nanoparticles critical to
the commercialization of modern technology.
Currently, labs around the country are using
nanomaterials for research and design of
lighter, stronger materials. Numerous labs use
scanning electron microscopes (SEM), but largely
for qualitative purposes. No all-in-one
process or software package currently exists to
provide a quick, affordable way to make
nanomaterial production viable at an
industrial scale. Designing such a process
would allow the development of quality
engineering tools to optimize process yield and
reduce variance. Without such tools,
nanotechnology will be prohibitively expensive to
produce on an industrial scale. The system to
enable the development of the necessary
engineering tools would use data extracted from
the SEM images to learn patterns from the data
and allow predictions to be made. Machine
learning can be applied to complete this task and
provide the tools needed to scale up
nanomanufacturing. Therefore, the development
of automated systems for estimating size and
spatial distribution, which heavily influence
material properties at the nanoscale, is necessary
for the scale-up of nanomanufacturing processes.
Dimensional estimation and control would allow
industries to quickly, accurately, and affordably
determine the best nanomanufacturing processes.
This capability would also allow for the
standardization of nanomanufacturing processes,
further improving scalability.
FIGURE 1: APPLICATION OF NANOMATERIALS IN CANCER TREATMENT [2].

FIGURE 2: APPLICATION OF NANOMATERIALS (IN THIS CASE, SILVER NANOWIRES) IN IMPROVING SOLAR CELL TECHNOLOGY [5].

FIGURE 3: APPLICATION OF NANOMATERIALS IN STRUCTURAL HEALTH MONITORING [4].
The need for a way to scale up nanomanufacturing was addressed by using supervised machine learning
methods to correctly predict the locations of nanorods in SEM images with the goal of extracting dimension
estimates based on the projection lengths. Machine learning is an active research topic for both
statisticians and computer scientists, and as a result, new literature on this subject continues to
emerge. Pattern recognition, a branch of machine learning, has numerous applications including
facial recognition, medical imaging and diagnostics, and speech recognition, to name a few. As
computing power has grown exponentially [7], statistical computing has become widely used in a
variety of fields. We applied all of these principles to extract features from and classify our
micrographs in order to make predictions and estimate nanorod dimensions.
The following section will explore our experimental methodology, starting with our data collection
procedure before giving an overview of the methods used for selecting, building, and testing models,
extracting image features, and the use of clustering to infer additional features and for testing with multiple
classes. Then, the results and analysis will be presented, followed by discussion and our conclusions and
recommendations for future research.
2: EXPERIMENTAL METHODOLOGY
2.1: SUPERVISED LEARNING
Statistical learning is the field that deals with making predictions or inferences based on given data.
Supervised learning is a subset of statistical learning which focuses on making predictions. In
supervised learning, we assume that there is a relationship between the response (dependent
variable, output) and one or more predictors (independent variables, inputs, features). We use a set
of training data to learn a model, a function of the predictors. The model can be used to make
predictions, and is validated using test data. In contrast, unsupervised learning is done without
training – you simply analyze the data to establish trends, relationships, or distinctions between
groups of observations [8, pp. 26]. Both are very useful, and both were used in this research, but the
main focus was supervised learning.
As mentioned before, the goal of supervised learning is to make predictions. To ensure that the
predictions are accurate, we want to minimize the error in our predictions. The prediction error can
usually be decomposed into its components of bias, model variance, and observation noise
(variance). Equation 1 shows this breakdown for mean squared error (MSE), a popular metric for
continuous response. Bias refers to the error due to approximation (e.g., modeling a slightly curved
data set with a straight line), and variance refers to how much the fitted model would change if it were
trained on a different data set (e.g., an overly flexible model would change significantly with new
data) [8, pp. 33]. In general there is an inverse relationship between bias and variance, which leads
to the bias-variance tradeoff, so the goal of supervised learning becomes to minimize the following:
$$\text{Expected test MSE} = \text{Model Variance} + (\text{Model Bias})^2 + \text{Error Variance} \qquad (1)$$
A model must be selected by minimizing an error function like that in (1) before using it to make
predictions. However, estimating the error terms above is limited by available data, and more
importantly, on whether the available data is representative of all the patterns and variations in
future samples. The estimation, thus, requires special techniques for making valid error estimates.
Sampling methods validate a model by repeatedly drawing different samples from the training data
to draw more information about the model. Cross validation is one of the most popular sampling
methods. Cross validation splits the training set into groups, fits the model on all but one group,
assesses it on the held-out group, and averages the errors across all the groups. K-fold cross
validation is very common in statistical learning; the training data, with n observations, is separated
into K groups of almost equal size. A special case is K = n, where each group contains a single
observation and each model is trained on the remaining n-1 observations; this is called leave-one-out
cross validation (LOOCV). LOOCV gives the least biased error estimate, but becomes extremely
computationally expensive when dealing with big data.
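As a concrete illustration, the following minimal R sketch carries out K-fold cross validation by hand to estimate the expected test MSE of Equation (1) for a simple linear model; the simulated data and the model are placeholders, not this project's micrograph features.

```r
# Minimal K-fold cross-validation sketch in R, estimating the expected test MSE
# of Equation (1) for a simple linear model. The data below are simulated
# placeholders, not this project's micrograph features.
set.seed(1)
dat <- data.frame(x = runif(200, 0, 10))
dat$y <- 2 + 0.5 * dat$x + rnorm(200)               # response with observation noise

K     <- 5
folds <- sample(rep(1:K, length.out = nrow(dat)))   # random fold label for each row

fold_mse <- sapply(1:K, function(k) {
  fit  <- lm(y ~ x, data = dat[folds != k, ])       # fit on the other K-1 folds
  pred <- predict(fit, newdata = dat[folds == k, ]) # predict the held-out fold
  mean((dat$y[folds == k] - pred)^2)                # test MSE for this fold
})

cv_mse <- mean(fold_mse)   # CV estimate of expected test MSE; K = nrow(dat) gives LOOCV
cv_mse
```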
One of the most important aspects of our research was feature extraction. We needed to find the
features which distinguish the pixels of a nanorod from the background in order to build our model.
The effect is very similar to human memory – for example, you may recognize the people, places, and
objects you know based on their sounds, smells, physical features, or mannerisms (or a combination
of them). There are several feature detection algorithms available, and nearly limitless combinations
of features can be extracted; humans, by comparison, can recognize roughly 10,000-30,000 different
object categories [9]. We needed features that could not only segment out the nanorods in the images
correctly, but do so under varying levels of brightness, sharpness, or intensity across
multiple images. These differences can be significant, as highlighted in Figure 4.
2.2: CLASSIFICATION
All supervised learning models are built with the goal of predicting a response to one or more
predictors. Models can be built to serve a variety of purposes and to fit a variety of trends. For this
reason, selecting the proper method is of vital importance. Since the first goal of our research was to
pick out nanorods from images, it was clear that we would be using classification algorithms. Humans
perform classification instinctively many times per day. Classification is the simple association of
items to their descriptions (or other people to their names). While linear regression is used when
predicting a quantitative (numerical) response, classification is used when predicting a qualitative
(categorical) response. For our data, we needed to use classification to determine which pixels were
parts of nanorods and which were not.
There are many types of classifiers, or models used to classify data. One popular classifier is logistic
regression. This classifier assumes that the logarithm of the odds that an observation will be in a
certain class, or log-odds, is linear. The coefficients of the linear portion can be estimated using the
maximum likelihood method, which chooses the coefficients so that the predicted class probabilities
correspond as closely as possible to the observed classes [8, pp. 132]. This can be formalized
mathematically using the likelihood function, where Ξ²0 and Ξ²1 are the regression coefficients:

$$\ell(\beta_0, \beta_1) = \prod_{i:\, y_i = 1} p(x_i) \prod_{i':\, y_{i'} = 0} \bigl(1 - p(x_{i'})\bigr) \qquad (2)$$

FIGURE 4: NANOROD MICROGRAPHS, A) [10], B) [11]. SEM IMAGES OFTEN VARY SIGNIFICANTLY IN BRIGHTNESS, SHARPNESS, AND INTENSITY.
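As an illustrative sketch (not part of the original analysis), the R snippet below fits a logistic regression by maximum likelihood with glm(); the simulated x and y are placeholders standing in for a single predictor and a binary foreground/background response.

```r
# Minimal logistic regression sketch in R: glm() with family = binomial fits
# beta0 and beta1 by maximum likelihood, i.e. by maximizing Equation (2).
# The simulated x and y are placeholders, not this project's image features.
set.seed(2)
x <- rnorm(300)
p <- 1 / (1 + exp(-(-0.5 + 2 * x)))     # true P(y = 1 | x) under a linear log-odds model
y <- rbinom(300, size = 1, prob = p)    # binary response: 1 = foreground, 0 = background

fit <- glm(y ~ x, family = binomial)    # maximum likelihood estimates of beta0, beta1
coef(fit)                               # estimated intercept (beta0) and slope (beta1)
predict(fit, newdata = data.frame(x = 1), type = "response")  # predicted P(y = 1 | x = 1)
```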
Logistic regression is mostly used for binary classification. When dealing with more than two classes,
linear discriminant analysis (LDA) is commonly used. LDA attempts to approximate the Bayes
classifier, which is the ideal classifier [8, pp. 37]. Quadratic discriminant analysis (QDA) is similar to
LDA, but uses quadratic discriminant functions instead of linear. K-nearest neighbors (KNN) is a
classification method which attempts to classify observations based on the K nearest observations.
Some popular, more computer-intensive methods include decision trees, random forests, boosting,
and support vector machines (SVM) [8, pp. 127]. Our group used SVM for classification, which is
covered in section 2.4.
2.3: OVERALL FLOW
Our experiments followed the structure of the flow chart in Figure 5.
FIGURE 5: GENERAL FLOW FOR AN EXPERIMENT.
For an experiment, the features of a training image were extracted as a data matrix and then used to
train a model. Then, the model was validated using a new image. This process was repeated for
various methods and feature extraction techniques. The success or failure of the model was
determined by classification error or 0/1 loss. Classification error has two components: misdetection
(false negative) error and false alarm (false positive) error. The general goal was to solve the
following optimization problem:

$$\text{Maximize} \left( \frac{\text{Number of correctly classified observations}}{\text{Number of observations}} \right) \qquad (3)$$
2.4: SUPPORT VECTOR MACHINES (SVM)
SVM is a vector space-based classifier which
separates training data based on their class labels
with a hyperplane such that the hyperplane is the
farthest possible from points in either class [13].
SVMs are one of the most popular machine learning
techniques available today [14]. SVM can be used to
construct linear or nonlinear classifiers (Figure 6)
using the kernel trick. A kernel is a function which
quantifies the similarity of two observations and
implicitly maps the data to a higher dimensional
feature space. The SVM then learns a linear
classifier (a hyperplane) in this high-dimensional
feature space, resulting in nonlinear classification
boundaries in the original space [8, pp. 350]. Using
kernels is computationally less expensive than
creating new features (data transformations)
explicitly. For our research, we used mostly radial
and some linear kernels. SVM can have a hard or soft
margin. A hard margin classifier does not allow any misclassified training observations. A soft margin
yields a smoother classifier by allowing some misclassifications [15]. The soft margin’s ability to
ignore some observations usually results in a better overall fit. The parameter which controls the
margin of an SVM is C (cost). Another parameter, Ξ³ (gamma), parametrizes the kernel function [16].
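A minimal sketch of such a soft-margin, radial-kernel SVM using R's e1071 package is shown below; the simulated two-feature data set is a placeholder, not the project's image features, and the cost and gamma values are arbitrary.

```r
# Minimal sketch of a soft-margin SVM with a radial kernel using the e1071
# package; cost corresponds to C and gamma to the kernel parameter discussed
# above. The two-feature data set is simulated, not the project's image data.
library(e1071)

set.seed(3)
train <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
train$class <- factor(ifelse(train$x1^2 + train$x2^2 > 1.5, "background", "foreground"))

svm_fit <- svm(class ~ x1 + x2, data = train,
               kernel = "radial",   # nonlinear boundary via the kernel trick
               cost   = 10,         # C: penalty for margin violations (soft margin)
               gamma  = 0.5)        # width of the radial kernel

plot(svm_fit, train)                # SVM classification plot, as in Figure 6
table(predicted = predict(svm_fit, train), truth = train$class)
```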
2.5: SCALE-INVARIANT FEATURE TRANSFORM (SIFT)
SIFT is a robust image descriptor developed by
David Lowe in 1999 which is commonly used in the
field of computer vision for object detection and
point matching between different views of 3D
objects. SIFT detects key points (points of interest)
for features. Histograms are used to generate a
vector of 128 features at each key point. These
vectors (descriptors) are used to classify the image
[9]. Object detection was critical for this research,
because we needed to be able to detect the features
which distinguished nanorods from the
background and because these features had to hold
up to the differences shown in the β€œSupervised
Learning” section. We take this ability for granted as
humans because it is so easy for us. Humans are
capable of distinguishing thousands of types of
objects [9] with almost no difficulty through a wide range of illuminations, orientations, distances,
and distractions. For example, you may be able to recognize your car very quickly from relatively far
away. But how do you know the car is yours? Surely you do not own the only car of that type in the
world. You simply recognize the car intuitively, immediately recognizing all the features of the car
that make it yours, and this is the goal of object detection.

FIGURE 6: SAMPLE SVM CLASSIFICATION PLOT.

FIGURE 7: FLOW OF IMAGE CLASSIFICATION USING SIFT FEATURE EXTRACTION ALGORITHM [9].

When
using SIFT, key points are first detected as the scale-space
extrema of the Difference-of-Gaussian (DoG) values, and SIFT
extracts a 128-dimensional descriptor vector for each key point
[16]. Figure 8a shows a plot of SIFT key points overlaid on an
image. Plotting the SIFT descriptors was unnecessary for our
research, as the descriptors were simply extracted into tables
in CSV files. However, the results of plotting the SIFT
descriptors for an image can be seen in Figure 8b.
The DoG operator normalizes SIFT features and makes them
scale invariant; the features also do not vary with rotation,
translation, or scaling. SIFT features can
be detected through wide differences in intensity, illumination,
and sharpness of an image. This made SIFT a top option, giving
us 128 low-variance predictors with which to train our models.
However, one issue with SIFT is that it may eliminate
critical variations which could help our SVM correctly classify
images. This issue was studied further by using K-means
clustering, an unsupervised learning method, to consider a
multi-level classification problem.
2.6: K-MEANS CLUSTERING & MULTI-LEVEL SVM CLASSIFICATION
Clustering is an unsupervised learning method which separates a data set into several groups of
similar observations. We used clustering to allow us to use more than two response classes.
Previously, we classified pixels or SIFT descriptors as belonging to either the foreground (nanorods)
or background (not nanorods). However, foreground observations may themselves have a variety of
patterns. Separating these patterns into separate groups, using clustering, can improve the accuracy
of the learned classifiers. K-means clustering was used to group the foreground and background data
so that we could use a multi-class SVM, which is an SVM with more than two classes to separate. K-
means clustering simply divides n observations into K clusters, where K is a selected value. In K-
means clustering each observation is placed into the cluster with the nearest mean. This is very
similar to K-medoids clustering, where each observation is placed into the cluster with the nearest
median. In R, there is a function to partition around medoids (PAM), part of a package called Flexible
Procedures for Clustering (FPC), which estimates the best value of K [18]. While PAM is meant for K-
medoids clustering, the similarities between K-means and K-medoids mean that PAM also gives a
good estimate of K for K-means. After using K-means to cluster a data set, cluster number can be used
as the response in a multi-class SVM. Foreground and background data were clustered separately,
with K = 10 and K = 2, respectively. Then, the data was combined to run a 12-class SVM on the entire
image.
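The sketch below illustrates this workflow in R, using fpc's pamk() to suggest K and kmeans() to form the clusters whose labels become the multi-class response; the foreground and background matrices are random placeholders rather than the project's SIFT descriptors, and the krange is an illustrative choice.

```r
# Sketch of the clustering step in R: fpc's pamk() suggests a value of K, and
# kmeans() assigns cluster labels that can serve as a multi-class response.
# The foreground/background matrices are random placeholders, not the
# project's SIFT descriptors, and krange is an illustrative choice.
library(fpc)

set.seed(4)
fg <- matrix(rnorm(500 * 4), ncol = 4)             # stand-in foreground descriptors
bg <- matrix(rnorm(200 * 4, mean = 3), ncol = 4)   # stand-in background descriptors

k_fg <- pamk(fg, krange = 2:10)$nc                 # estimated K for the foreground
k_bg <- pamk(bg, krange = 2:10)$nc                 # estimated K for the background

fg_clusters <- kmeans(fg, centers = k_fg)$cluster
bg_clusters <- kmeans(bg, centers = k_bg)$cluster

# Offset the background labels so they do not collide with foreground labels,
# giving a single factor suitable as the response of a multi-class SVM.
response <- factor(c(fg_clusters, bg_clusters + k_fg))
table(response)
```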
FIGURE 8: A) SIFT KEYPOINTS ON A TEST IMAGE [17]; B) SIFT DESCRIPTORS.
2.7: EXPERIMENT SETUP
When building a supervised learning model, the first step is to acquire training data to build a model
to predict test data. We began by creating β€œGround Truth” (GT) from micrographs which we acquired
from publications and from other labs. GT was used to generate training data which was used to train
our models. GT was created by using Microsoft Paint to color over distinguishable nanorods in the
selected micrographs. Each nanorod was assigned a set of RGB (red, green, blue) color values.
The first predictors tested were the (x, y) pixel coordinates. The (x, y) coordinate system in image
processing is different from typical Cartesian coordinates: in a Cartesian system the positive y axis
points vertically upward, whereas in image processing it points vertically downward. Next, SIFT features
were used as predictors. For an image with N descriptors, the data was an Nx128 matrix. The next
set of features used was based on a pixel’s neighbors. For every pixel in a training image, we used 25
features (a 5x5 neighborhood descriptor per pixel). Thus, for a training image with P pixels, the data was a
Px25 matrix. These features were extracted in CSV format in MATLAB to be used in R to train an SVM.
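For illustration only, the short R sketch below builds such a Px25 matrix of 5x5 neighborhood intensities for a random placeholder image; in this project the actual extraction was done in MATLAB and exported to CSV.

```r
# Conceptual R sketch of the 25-feature (5x5 neighborhood) extraction: each
# interior pixel contributes one row containing the 25 surrounding intensities.
# In the project this extraction was done in MATLAB and exported to CSV; the
# random image below is only a placeholder.
set.seed(5)
img <- matrix(runif(100 * 100), nrow = 100)        # stand-in grayscale image

half    <- 2                                       # 5x5 window: 2 pixels on each side
centers <- expand.grid(r = (1 + half):(nrow(img) - half),
                       c = (1 + half):(ncol(img) - half))

features <- t(mapply(function(r, c) {
  as.vector(img[(r - half):(r + half), (c - half):(c + half)])  # 25 intensities
}, centers$r, centers$c))

dim(features)                                      # P x 25 matrix of pixel features
write.csv(features, "neighborhood_features.csv", row.names = FALSE)
```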
When implementing the SVM in R, we used tuning to determine the best combination of parameters
for our model. Since we were dealing with large, high-dimensional data sets, tuning became highly
computationally expensive. SVM tuning could take hours or even days to execute. The only solutions
were to use a faster computer or write faster code. R, by default, does not utilize parallel computing,
so using a multi-core computer (almost all of today’s computers have multiple cores) initially
provides no advantage. To address the need for faster computing, it was necessary to develop an SVM
implementation which worked in parallel on a high-performance multi-core machine. Tuning
optimizes an SVM over a range of parameters. R tunes the SVM by testing one model at a time by
default. The doParallel library in R provides a parallel backend, or a parallel network of workers, for
the foreach loop. DoParallel must be combined with the parallel library, which is included in recent
versions of R. The foreach library provides the foreach loop and, along with the iterators library, is
required for the doParallel library to be installed. The e1071 library is needed to run and
tune SVM. The foreach loop can be easily nested to run an SVM for each combination of C and Ξ³ [19].
The foreach loop did not tune the SVM directly; the loop ran an SVM for each combination of C and Ξ³, all at
once. For a given training data set, the model with the best accuracy (lowest error) was selected and
a single SVM (a much faster computation) was run with that set of parameters. This implementation
provided us with a major speed increase when dealing with big data and would be advantageous for
industrial nanomanufacturers with access to high-performance servers.
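A condensed sketch of this parallel grid search is given below, using foreach's nesting operator with a doParallel backend and e1071's built-in cross validation as the scoring step; the worker count, parameter grid, and toy training set are placeholders rather than the exact project script.

```r
# Condensed sketch of the parallel (C, gamma) grid search using foreach's
# nesting operator with a doParallel backend; the worker count, parameter grid,
# and toy training set are placeholders rather than the project's exact script.
library(e1071)
library(foreach)
library(doParallel)

cl <- makeCluster(4)                 # e.g. 4 workers; the project used up to 25 of 40 cores
registerDoParallel(cl)

costs  <- c(0.1, 10, 100)
gammas <- c(0.5, 1, 5)

set.seed(6)
train <- data.frame(x1 = rnorm(300), x2 = rnorm(300))
train$class <- factor(ifelse(train$x1 * train$x2 > 0, 1, 0))

# Nested foreach: one SVM per (C, gamma) combination, all evaluated in parallel.
grid <- foreach(C = costs, .combine = rbind) %:%
  foreach(g = gammas, .combine = rbind, .packages = "e1071") %dopar% {
    fit <- svm(class ~ ., data = train, kernel = "radial",
               cost = C, gamma = g, cross = 5)          # 5-fold CV inside e1071
    data.frame(C = C, gamma = g, accuracy = fit$tot.accuracy)
  }
stopCluster(cl)

best <- grid[which.max(grid$accuracy), ]                 # best accuracy = lowest error
final_svm <- svm(class ~ ., data = train, kernel = "radial",
                 cost = best$C, gamma = best$gamma)      # single refit with best parameters
best
```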
3: RESULTS AND DATA ANALYSIS
3.1: DATA COLLECTION
GT was created for 16 SEM images. Numerical data matrices were extracted using MATLAB in the
form of CSV files for both training (GT) images and test images. Models were trained using R on a 40-
core Linux (Ubuntu) platform. Using GT images to train our models allowed us to predict the locations
of nanorods in other images.
3.2: FEATURE EXTRACTION
Our first tests used the (x, y) coordinate values
as the only variables to predict the response of
foreground or background. Next, we used SIFT
features as predictors, first with a binary
response, and then using K-means clustering to
conduct 12-class classifications. Using PAM, we
found the ideal value of K for the foreground
and background data to be 10 and 2,
respectively. A section of the data matrix for
the image in Figure 8a is shown in Figure 9 to
visualize our data. A sample training image and
its GT are shown in Figure 10, with SIFT
keypoints plotted.

FIGURE 9: EXAMPLE SIFT IMAGE DATA MATRIX. THE IMAGE HAD 2,480 DESCRIPTORS AND A BINARY RESPONSE (0 FOR BACKGROUND AND 1 FOR FOREGROUND).

FIGURE 10: EXAMPLE ORIGINAL IMAGE (A) AND GT (B) [20].
3.3: CLASSIFICATION
SVM tuning originally lasted from as little as 30 minutes for smaller data sets to days for larger data
sets. To address this issue, I developed a parallel computing method as discussed in the Experiment
Setup section. Using nested foreach loops, I was able to run an
SVM for all combinations of C and Ξ³ very quickly. When testing
the effectiveness of my code, running on 25 of our 40 cores, I was
able to increase the speed of the tuning process by a factor of 30
(the image took ~30 minutes to tune originally but only 57
seconds with my code) when running nine combinations of C and
Ξ³. This implementation became very useful for tuning large data
sets like those generated by the 25 feature descriptors for every
pixel. I tuned one of these images in parallel over three hours,
meaning that the default tuning process would have lasted
almost four days. After tuning in parallel, a new SVM had to be
made using the parameters from the loop which resulted in the
lowest error. However, the parallel implementation was still an
order of magnitude faster than the default tuning.
When we applied a model from our SVM to a test image, we could
generate a table called a confusion matrix, which is simply a table of predicted values vs. real values
where each value corresponds to a class. A binary SVM classification was trained on the GT image in
Figure 10 and tested with the image in Figure 8a, resulting in a confusion matrix (Figure 12) and an
SVM classification plot (Figure 11). Using the confusion matrix and Equation 3, accuracy (Ξ·) can be
computed for each test of a classifier. Table 1 shows Ξ· for 17 experiments which used six images
(designated Image 1 – Image 6; see Figures 13 and 14). Table 2 shows the average and standard
deviation of Ξ· for each set of features and each value of C and Ξ³.
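As an illustration of this computation, the short R sketch below builds a confusion matrix with table() and computes Ξ· as in Equation (3); the simulated training and test sets are placeholders, not the project's images.

```r
# Sketch of building a confusion matrix with table() and computing the accuracy
# eta of Equation (3); the simulated train/test split stands in for a GT image
# and a new test image.
library(e1071)

set.seed(7)
make_data <- function(n) {
  d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
  d$class <- factor(ifelse(d$x1 + d$x2 > 0, "foreground", "background"))
  d
}
train <- make_data(300)
test  <- make_data(100)

fit  <- svm(class ~ ., data = train, kernel = "radial", cost = 10, gamma = 0.5)
pred <- predict(fit, newdata = test)

conf_mat <- table(predicted = pred, truth = test$class)   # confusion matrix (cf. Figure 12)
conf_mat

eta <- sum(diag(conf_mat)) / sum(conf_mat)                 # Equation (3): correct / total
eta
```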
FIGURE 11: SVM CLASSIFICATION PLOT FOR EXAMPLE TEST IMAGE.

FIGURE 12: CONFUSION MATRIX FOR EXAMPLE SVM.
Experimental Results

Experiment  Training (GT)  Test       Classes  Features  C     Ξ³    Accuracy (Ξ·)
            Image No.      Image No.
1           1*             1*         12       25        100   0.5  0.927537
2           5              3          2        (x, y)    100   5    0.761436
3           5              4          2        (x, y)    100   5    0.718798
4           3              2          2        SIFT      0.1   0.5  0.912782
5           3              4          2        SIFT      0.1   0.5  0.798438
6           3              5          2        SIFT      0.1   0.5  0.851351
7           3              6          2        SIFT      0.1   0.5  0.919440
8           1              1          2        SIFT      10    1    0.865906
9           1              1          2        SIFT      100   1    0.865906
10          1              1          2        SIFT      1000  1    0.865906
11          1              1          2        SIFT      10    5    0.865906
12          1              1          2        SIFT      100   5    0.865906
13          1              1          2        SIFT      1000  5    0.865906
14          1              1          2        SIFT      10    10   0.865906
15          1              1          2        SIFT      100   10   0.865906
16          1              1          2        SIFT      1000  10   0.865906
17          4              6          2        SIFT      100   0.5  0.919440
Avg. Ξ·                                                               0.858963
Std. Dev.                                                            0.053262

TABLE 1: EXPERIMENTAL RESULTS. IMAGES 1-6 SHOWN IN FIGURES 13 AND 14.
FIGURE 13: ORIGINAL IMAGES FROM EXPERIMENTS, WITH A-F CORRESPONDING TO IMAGES 1-6 IN TABLE 1: A) [11], B) [21], C) [10], D) [20], E) [22], F) [17].

FIGURE 14: GT IMAGES FROM EXPERIMENTS, WITH A-F CORRESPONDING TO IMAGES 1-6 IN TABLE 1: A) [11], B) [21], C) [10], D) [20], E) [22], F) [17].
4: DISCUSSION AND CONCLUSIONS
4.1: DISCUSSION
The main assumption in this research is
that all nanorods extend into the
substrate. This assumption will allow
us to use the top and side edges and the
angle of the side edges to find the
nanorods’ projection lengths.
Tables 1 and 2 show that the choice of
features has the strongest effect on SVM
classifier accuracy. As expected, using
only the (x, y) coordinates of the pixels
resulted in poor accuracy. The SIFT
features were quite robust, and the 25
feature image was well above the
average, but more trials from all three
sets of features are necessary to say
conclusively which features are the
best. Varying C and Ξ³ showed very little
effect on classifier accuracy overall,
but low values of C and Ξ³ did result in a marked improvement in accuracy. This makes sense because
we were dealing with big data sets, meaning that giving single data points too much influence could
lower the classifier accuracy. Although more testing will be needed to confirm the positive effect,
using multiple classes by clustering, using cluster number as the response, may improve accuracy
dramatically. Based on these results, using SIFT feature descriptors or the 25 feature descriptors with
low values of C and Ξ³ for a 12-class SVM classifier should result in high accuracy. Maximizing classifier
accuracy will allow industries to classify images with very little error and accurately determine the
optimal nanomanufacturing processes.
The SVM classification plot in Figure 11 appears to give an upside down view of the test image (Figure
8a). This makes sense because the positive y-axis in images is oriented vertically downward, as
mentioned in the Experimental Setup section. Knowing this, the plot clearly performs reasonably
well, as the boundary separating the nanorods from the background appears clearly in the SVM
classification plot. Some misclassifications are clear both in the nanorods and in the background,
which makes sense due to this particular SVM’s accuracy rating of about 86%.
Most of the error in our experiments was due to the propagation of human error in coloring GT
images. Using multiple GT images as training data in the future may minimize the effect of human
error. Any remaining error is due to random intrinsic errors. Therefore, running more experiments
will be critical to finding the features and parameters to maximize classifier accuracy. Another
important source of error is due to the estimation of nanoparticle dimensions using projection
lengths. In the future, it will be important for nanomanufacturers to understand that the dimensions
being extracted are not exact. However, the goal is to minimize these errors to the point where valid
observations of changes in length can be made.
Features    Avg. Ξ·       Difference from Total Avg. Ξ·   Std. Dev.
(x, y)      0.740117     -0.118846                      0.021319
SIFT        0.871043      0.012080                      0.029669
25*         0.927537*     0.068574*                     0*

C
0.1         0.870503      0.011539                      0.049352
10          0.865906**    0.006943**                    0**
100         0.846418     -0.012545                      0.072270
1000        0.865906**    0.006943**                    0**

Ξ³
0.5         0.888165      0.029201                      0.047467
1           0.865906**    0.006943**                    0**
5           0.815590     -0.043373                      0.063082
10          0.865906**    0.006943**                    0**

TABLE 2: EFFECTS OF FEATURE TYPE, C, AND GAMMA ON ACCURACY. *ONLY ONE TRIAL USING 25 FEATURES, **ALL EXPERIMENTS YIELDED THE SAME RESULTS.
Experiments 8-16 in Table 1 resulted in the same accuracy. These results mean that one or more of
the following are true:
1. SIFT features are unaffected by C and Ξ³. This is unlikely because experiments 4-7 show a
change in accuracy using SIFT features with low values of C and Ξ³. One possible conclusion is
that the classifier accuracy stagnated because the C and Ξ³ values were too high, meaning that SIFT
features should only be used with low C and Ξ³.
2. For all the experiments where Ξ· = 0.865906, there were zeroes in the bottom right corner of
the confusion matrix. This may mean that there were errors in this data or that binary
classification does not work well for SIFT features. Multi-class classification should be tested
for these experimental setups.
3. The accuracy data in Tables 1 and 2 are biased. As mentioned before, more experiments
should be run for all combinations of parameters for binary and multi-level classification in
order to get a clearer picture of how to maximize accuracy.
4. The SIFT features are too robust to properly classify our images. SIFT was designed to only
use features which did not respond to changes in lighting, viewing angle, or changes in image
brightness, sharpness, or intensity. Some of the factors which are critical to properly
classifying our images may not be considered by using SIFT. For example, changes in size may
affect optical properties of the nanoparticle being examined, and the SIFT features may be
too robust to recognize the change.
4.2: FUTURE WORK AND RECOMMENDATIONS
For future research, more experiments for all combinations of methods and parameters shown in
Table 1 are recommended to get a clearer picture of what features and parameters work best. Wider
ranges of C and Ξ³ values, as well as intermediate values, should be tested in order to potentially
establish relationships between the SVM classifier accuracy for different sets of features and the
combination of C and Ξ³. Further use of multi-class classification is also recommended due to the high
accuracy (and the need of more data to confirm that high accuracy). Furthermore, different clustering
algorithms such as K-medoids and hierarchical clustering should be attempted for multi-level
classifications. Also, the effect of boosting or changing classification methods should be investigated.
In industry, speed, automation, and accuracy will be critical. Therefore, once classifier accuracy is
optimized, I propose the following recommendations before commercializing this research:
1. Codes should be adapted to a newer, faster language, such as C++ or Python, to improve
computation speed and to create a single software package to be licensed. Computer science
professionals are increasingly being educated in newer languages (e.g., C++, Python), and our
programs should be written in the languages they are familiar with, as well as implement
optimized algorithms and take advantage of multiple processors and GPUs.
2. All codes should be written to utilize multi-core machines, and if possible, should
automatically run on a specified percentage of the available cores. This would maximize
computational throughput for all nanomanufacturers who use the software.
3. An autorun script should be implemented, and a user-friendly Graphical User Interface (GUI)
should be developed. This would reduce the cost of labor for nanomanufacturers.
4.3: CONCLUSIONS
This research shows that supervised learning algorithms have potential to be an excellent solution
for nanoparticle dimension estimation and control, which would be a significant quality engineering
development. The completion of this research will enable the scale-up, standardization, and
commercialization of nanomanufacturing processes. The initial classifier accuracy is promising, and
points to a bright future for this technology as further improvements are discovered.
REFERENCES
[1] G. Ali Mansoori, P. Mohazzabi, P. McCormack, S. Jabbari. β€œNanotechnology in cancer
prevention, detection and treatment: bright future lies ahead,” World Review of Science,
Technology and Sustainable Development, vol. 4, nos. 2/3, 2007.
[2] D. Jenvey. (2012, November 4) Materials for Dermatological Nanotechnology [Online].
Wikispaces, Tangient LLC. Available: http://nanotechnology-cis.wikispaces.com/
Materials+for+dermatological+nanotechnology. Last accessed: 24 July 2015.
[3] R. Elhajjar, V. La Saponara, A. Muliana. Smart Composites: Mechanics & Design, CRC Press,
Taylor & Francis Group, LLC., Boca Raton, FL, 2014.
[4] I. Kang, M. J. Schulz, J. H. Kim, V. Shanov, D. Shi. β€œA carbon nanotube strain sensor for
structural health monitoring,” Smart Materials and Structures, 15 (2006) 737-748.
[5] L. J. AndrΓ©s, M. F. MenΓ©ndez, D. GΓ³mez, A. L. MartΓ­nez, J. P. Kettle, A. MenΓ©ndez, B. Ruiz. β€œRapid
synthesis of ultra-long silver nanowires for tailor-made transparent conductive electrodes:
proof of concept in organic solar cells,” Nanotechnology, vol. 26, 2015.
[6] A. Mandal. (2012, November 4). Properties of Nanoparticles [Online]. Available:
http://www.news-medical.net/health/Properties-of-Nanoparticles.aspx. Last accessed: 24
July 2015.
[7] Berkeley School of Information. (2014, March 5). Moore’s Law and Computer Processing
Power [Online]. Available: http://datascience.berkeley.edu/moores-law-processing-power/.
[8] G. James, D. Witten, T. Hastie, R. Tibshirani. An Introduction to Statistical Learning with
Applications in R, Springer Science+Business Media, New York, 2013.
[9] J. Kim, B. Kim, S. Savarese. β€œComparing Image Classification Methods: K-Nearest Neighbor and
Support Vector Machines,” Proc. 6th WSEAS International Conference on Circuits, Systems,
Signals and Telecommunications, Cambridge, pp. 133-138, 2012.
[10] K. Kim, K. Utashiro, Y. Abe, M. Kawamura. β€œStructural Properties of Zinc Oxide Nanorods
Grown on Al-Doped Zinc Oxide Seed Layer and Their Applications in Dye-Sensitized Solar
Cells,” Materials, 7(4):2522–2533, 2014.
[11] C. Thelander, P. Agarwal, S. Brongersma, J. Eymery, L. F. Feiner, A. Forchel, M. Scheffler, W.
Riess, B.J. Ohlsson, U. GΓΆsele, L. Samuelson. β€œNanowire-based one dimensional electronics,”
Materials Today, vol. 9, no. 10 (pp. 28-35), October 2006.
[12] C. D. Manning, P. Raghavan, H. SchΓΌtze. An Introduction to Information Retrieval, Cambridge
University Press, Cambridge, England, 1 April 2009.
[13] X. Wu, V. Kumar, J. R. Quinlan, J. Ghosh, Q. Yang, H. Motoda, G. J. McLachlan, A. Ng, B. Liu, P.
S. Yu, Z. Zhou, M. Steinbach, D. J. Hand, D. Steinberg. β€œTop 10 algorithms in data mining,”
Knowledge and Information Systems, no. 14:1-37, 2008.
[14] Y. Liu, H. Zhang, Y. Wu. β€œHard or Soft Classification? Large-margin Unified Machines,” North
Carolina State University, Raleigh, North Carolina, 10 January 2011.
[15] SciKit-Learn. (2014). RBF SVM Parameters [Online]. Available: http://scikit-
learn.org/stable/auto_examples/svm/plot_rbf_parameters.html. Last accessed: 24 July
2015.
[16] T. Lindeberg. (2012). Scale Invariant Feature Transform [Online]. Scholarpedia. Available:
http://www.scholarpedia.org/article/Scale_Invariant_Feature_Transform. Last accessed: 24
July 2015.
[17] H. Gao, W. Cai, P. Shimpi, H. Lin, P. Gao. β€œ(La,Sr)CoO3/ZnO nanofilm–nanorod diode arrays
for photo-responsive moisture and humidity detection,” Journal of Physics D: Applied
Physics, 43(27):272002, 2010.
[18] C. Hennig. (2014, October 2) Flexible procedures for clustering [Online]. CRAN Repository.
Available: https://cran.r-project.org/web/packages/fpc/fpc.pdf.
[19] S. Weston. β€œNesting Foreach Loops,” Revolution Analytics, Redmond, Washington, 10 April
2014.
[20] R. Wang, H. Tan, Z. Zhao, G. Zhang, L. Song, W. Dong, Z. Sun. β€œStable ZnO@TiO2 core/shell
nanorod arrays with exposed high energy facets for self-cleaning coatings with anti-reflective
properties,” Journal of Materials Chemistry A, 2:7313–7318, 2014.
[21] Y. Li, J. Kubota, K. Domen, β€œA Protocol for Fabrication of Barium-doped Tantalum Nitride
Nanorod Arrays,” Protocol Exchange, Nature Publishing Group, doi 10.1038/protex.2013.080,
2013.
[22] Y. Luo, L. Wang, Y. Zou, X. Sheng, L. Chang, D. Yang. β€œElectrochemically Deposited Cu2O on
TiO2 Nanorod Arrays for Photovoltaic Application,” Electrochemical and Solid-State Letters,
15(2):H34–H36, 2011.
More Related Content

What's hot

C013141723
C013141723C013141723
C013141723
IOSR Journals
Β 
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...
ijsc
Β 
Ja3615721579
Ja3615721579Ja3615721579
Ja3615721579
IJERA Editor
Β 
Utilization of Super Pixel Based Microarray Image Segmentation
Utilization of Super Pixel Based Microarray Image SegmentationUtilization of Super Pixel Based Microarray Image Segmentation
Utilization of Super Pixel Based Microarray Image Segmentation
ijtsrd
Β 
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATA
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATAGRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATA
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATA
cscpconf
Β 
Introductionedited
IntroductioneditedIntroductionedited
Introductionedited
Mefratechnologies
Β 
Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...
IAEME Publication
Β 
internship project1 report
internship project1 reportinternship project1 report
internship project1 report
sheyk98
Β 
Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...
Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...
Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...
Tarun Kumar
Β 
Identification of Disease in Leaves using Genetic Algorithm
Identification of Disease in Leaves using Genetic AlgorithmIdentification of Disease in Leaves using Genetic Algorithm
Identification of Disease in Leaves using Genetic Algorithm
ijtsrd
Β 
Maximum likelihood estimation from uncertain
Maximum likelihood estimation from uncertainMaximum likelihood estimation from uncertain
Maximum likelihood estimation from uncertain
IEEEFINALYEARPROJECTS
Β 
IRJET- Plant Leaf Disease Detection using Image Processing
IRJET- Plant Leaf Disease Detection using Image ProcessingIRJET- Plant Leaf Disease Detection using Image Processing
IRJET- Plant Leaf Disease Detection using Image Processing
IRJET Journal
Β 
Plant disease detection and classification using deep learning
Plant disease detection and classification using deep learning Plant disease detection and classification using deep learning
Plant disease detection and classification using deep learning
JAVAID AHMAD WANI
Β 
Kapil dikshit ppt
Kapil dikshit pptKapil dikshit ppt
Kapil dikshit ppt
kapil dikshit
Β 
V.KARTHIKEYAN PUBLISHED ARTICLE
V.KARTHIKEYAN PUBLISHED ARTICLEV.KARTHIKEYAN PUBLISHED ARTICLE
V.KARTHIKEYAN PUBLISHED ARTICLE
KARTHIKEYAN V
Β 
IRJET - Disease Detection in Plant using Machine Learning
IRJET -  	  Disease Detection in Plant using Machine LearningIRJET -  	  Disease Detection in Plant using Machine Learning
IRJET - Disease Detection in Plant using Machine Learning
IRJET Journal
Β 
Feature selection for multiple water quality status: integrated bootstrapping...
Feature selection for multiple water quality status: integrated bootstrapping...Feature selection for multiple water quality status: integrated bootstrapping...
Feature selection for multiple water quality status: integrated bootstrapping...
IJECEIAES
Β 
CLASSIFICATION OF CANCER BY GENE EXPRESSION USING NEURAL NETWORK
CLASSIFICATION OF CANCER BY GENE EXPRESSION USING NEURAL NETWORKCLASSIFICATION OF CANCER BY GENE EXPRESSION USING NEURAL NETWORK
CLASSIFICATION OF CANCER BY GENE EXPRESSION USING NEURAL NETWORK
International Research Journal of Modernization in Engineering Technology and Science
Β 
Comparative error of the phenomena model
Comparative error of the phenomena modelComparative error of the phenomena model
Comparative error of the phenomena model
irjes
Β 

What's hot (19)

C013141723
C013141723C013141723
C013141723
Β 
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...
Β 
Ja3615721579
Ja3615721579Ja3615721579
Ja3615721579
Β 
Utilization of Super Pixel Based Microarray Image Segmentation
Utilization of Super Pixel Based Microarray Image SegmentationUtilization of Super Pixel Based Microarray Image Segmentation
Utilization of Super Pixel Based Microarray Image Segmentation
Β 
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATA
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATAGRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATA
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATA
Β 
Introductionedited
IntroductioneditedIntroductionedited
Introductionedited
Β 
Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...
Β 
internship project1 report
internship project1 reportinternship project1 report
internship project1 report
Β 
Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...
Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...
Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...
Β 
Identification of Disease in Leaves using Genetic Algorithm
Identification of Disease in Leaves using Genetic AlgorithmIdentification of Disease in Leaves using Genetic Algorithm
Identification of Disease in Leaves using Genetic Algorithm
Β 
Maximum likelihood estimation from uncertain
Maximum likelihood estimation from uncertainMaximum likelihood estimation from uncertain
Maximum likelihood estimation from uncertain
Β 
IRJET- Plant Leaf Disease Detection using Image Processing
IRJET- Plant Leaf Disease Detection using Image ProcessingIRJET- Plant Leaf Disease Detection using Image Processing
IRJET- Plant Leaf Disease Detection using Image Processing
Β 
Plant disease detection and classification using deep learning
Plant disease detection and classification using deep learning Plant disease detection and classification using deep learning
Plant disease detection and classification using deep learning
Β 
Kapil dikshit ppt
Kapil dikshit pptKapil dikshit ppt
Kapil dikshit ppt
Β 
V.KARTHIKEYAN PUBLISHED ARTICLE
V.KARTHIKEYAN PUBLISHED ARTICLEV.KARTHIKEYAN PUBLISHED ARTICLE
V.KARTHIKEYAN PUBLISHED ARTICLE
Β 
IRJET - Disease Detection in Plant using Machine Learning
IRJET -  	  Disease Detection in Plant using Machine LearningIRJET -  	  Disease Detection in Plant using Machine Learning
IRJET - Disease Detection in Plant using Machine Learning
Β 
Feature selection for multiple water quality status: integrated bootstrapping...
Feature selection for multiple water quality status: integrated bootstrapping...Feature selection for multiple water quality status: integrated bootstrapping...
Feature selection for multiple water quality status: integrated bootstrapping...
Β 
CLASSIFICATION OF CANCER BY GENE EXPRESSION USING NEURAL NETWORK
CLASSIFICATION OF CANCER BY GENE EXPRESSION USING NEURAL NETWORKCLASSIFICATION OF CANCER BY GENE EXPRESSION USING NEURAL NETWORK
CLASSIFICATION OF CANCER BY GENE EXPRESSION USING NEURAL NETWORK
Β 
Comparative error of the phenomena model
Comparative error of the phenomena modelComparative error of the phenomena model
Comparative error of the phenomena model
Β 

Viewers also liked

The Hashtag Power
The Hashtag PowerThe Hashtag Power
The Hashtag Power
Mehdi LAGHA
Β 
Hashtag and Social Media Marketing
Hashtag and Social Media MarketingHashtag and Social Media Marketing
Hashtag and Social Media Marketing
Andreas Friedeheim
Β 
7 rules to create the perfect hashtag
7 rules to create the perfect hashtag7 rules to create the perfect hashtag
7 rules to create the perfect hashtag
Cinzia Di Martino
Β 
Tweet Tweet Tweet Twitter
Tweet Tweet Tweet TwitterTweet Tweet Tweet Twitter
Tweet Tweet Tweet Twitter
Jimmy Jay
Β 
Hashtag 101 - All You Need to Know About Hashtags
Hashtag 101 - All You Need to Know About HashtagsHashtag 101 - All You Need to Know About Hashtags
Hashtag 101 - All You Need to Know About Hashtags
Modicum
Β 
FontShop - Typography
FontShop - TypographyFontShop - Typography
FontShop - Typography
Poppy Young
Β 
Hype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerHype vs. Reality: The AI Explainer
Hype vs. Reality: The AI Explainer
Luminary Labs
Β 

Viewers also liked (7)

The Hashtag Power
The Hashtag PowerThe Hashtag Power
The Hashtag Power
Β 
Hashtag and Social Media Marketing
Hashtag and Social Media MarketingHashtag and Social Media Marketing
Hashtag and Social Media Marketing
Β 
7 rules to create the perfect hashtag
7 rules to create the perfect hashtag7 rules to create the perfect hashtag
7 rules to create the perfect hashtag
Β 
Tweet Tweet Tweet Twitter
Tweet Tweet Tweet TwitterTweet Tweet Tweet Twitter
Tweet Tweet Tweet Twitter
Β 
Hashtag 101 - All You Need to Know About Hashtags
Hashtag 101 - All You Need to Know About HashtagsHashtag 101 - All You Need to Know About Hashtags
Hashtag 101 - All You Need to Know About Hashtags
Β 
FontShop - Typography
FontShop - TypographyFontShop - Typography
FontShop - Typography
Β 
Hype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerHype vs. Reality: The AI Explainer
Hype vs. Reality: The AI Explainer
Β 

Similar to TBerger_FinalReport

Skin Cancer Detection Application
Skin Cancer Detection ApplicationSkin Cancer Detection Application
Skin Cancer Detection Application
IRJET Journal
Β 
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNINGSEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
gerogepatton
Β 
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNINGSEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
gerogepatton
Β 
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNINGSEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
ijaia
Β 
An efficient convolutional neural network-based classifier for an imbalanced ...
An efficient convolutional neural network-based classifier for an imbalanced ...An efficient convolutional neural network-based classifier for an imbalanced ...
An efficient convolutional neural network-based classifier for an imbalanced ...
IAESIJAI
Β 
A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...
A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...
A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...
gerogepatton
Β 
A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...
A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...
A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...
ijaia
Β 
An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...
An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...
An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...
ijsc
Β 
Simplified Knowledge Prediction: Application of Machine Learning in Real Life
Simplified Knowledge Prediction: Application of Machine Learning in Real LifeSimplified Knowledge Prediction: Application of Machine Learning in Real Life
Simplified Knowledge Prediction: Application of Machine Learning in Real Life
Peea Bal Chakraborty
Β 
Controlling informative features for improved accuracy and faster predictions...
Controlling informative features for improved accuracy and faster predictions...Controlling informative features for improved accuracy and faster predictions...
Controlling informative features for improved accuracy and faster predictions...
Damian R. Mingle, MBA
Β 
IRJET - Survey on Analysis of Breast Cancer Prediction
IRJET - Survey on Analysis of Breast Cancer PredictionIRJET - Survey on Analysis of Breast Cancer Prediction
IRJET - Survey on Analysis of Breast Cancer Prediction
IRJET Journal
Β 
IRJET- A Novel Segmentation Technique for MRI Brain Tumor Images
IRJET- A Novel Segmentation Technique for MRI Brain Tumor ImagesIRJET- A Novel Segmentation Technique for MRI Brain Tumor Images
IRJET- A Novel Segmentation Technique for MRI Brain Tumor Images
IRJET Journal
Β 
IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...
IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...
IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...
IRJET Journal
Β 
Review on Mesothelioma Diagnosis
Review on Mesothelioma DiagnosisReview on Mesothelioma Diagnosis
Review on Mesothelioma Diagnosis
IRJET Journal
Β 
Melanoma Skin Cancer Detection using Deep Learning
Melanoma Skin Cancer Detection using Deep LearningMelanoma Skin Cancer Detection using Deep Learning
Melanoma Skin Cancer Detection using Deep Learning
IRJET Journal
Β 
Efficiency of Prediction Algorithms for Mining Biological Databases
Efficiency of Prediction Algorithms for Mining Biological  DatabasesEfficiency of Prediction Algorithms for Mining Biological  Databases
Efficiency of Prediction Algorithms for Mining Biological Databases
IOSR Journals
Β 
3D Segmentation of Brain Tumor Imaging
3D Segmentation of Brain Tumor Imaging3D Segmentation of Brain Tumor Imaging
3D Segmentation of Brain Tumor Imaging
IJAEMSJORNAL
Β 
Software Cost Estimation Using Clustering and Ranking Scheme
Software Cost Estimation Using Clustering and Ranking SchemeSoftware Cost Estimation Using Clustering and Ranking Scheme
Software Cost Estimation Using Clustering and Ranking Scheme
Editor IJMTER
Β 
Computer Aided System for Detection and Classification of Breast Cancer
Computer Aided System for Detection and Classification of Breast CancerComputer Aided System for Detection and Classification of Breast Cancer
Computer Aided System for Detection and Classification of Breast Cancer
IJITCA Journal
Β 
SEMI SUPERVISED BASED SPATIAL EM FRAMEWORK FOR MICROARRAY ANALYSIS
SEMI SUPERVISED BASED SPATIAL EM FRAMEWORK FOR MICROARRAY ANALYSISSEMI SUPERVISED BASED SPATIAL EM FRAMEWORK FOR MICROARRAY ANALYSIS
SEMI SUPERVISED BASED SPATIAL EM FRAMEWORK FOR MICROARRAY ANALYSIS
IRJET Journal
Β 

Similar to TBerger_FinalReport (20)

Skin Cancer Detection Application
Skin Cancer Detection ApplicationSkin Cancer Detection Application
Skin Cancer Detection Application
Β 
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNINGSEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
Β 
industries to quickly, accurately, and affordably determine the best nanomanufacturing processes. This capability would also allow for the standardization of nanomanufacturing processes, further improving scalability.

FIGURE 1: APPLICATION OF NANOMATERIALS IN CANCER TREATMENT [2].

FIGURE 2: APPLICATION OF NANOMATERIALS (IN THIS CASE, SILVER NANOWIRES) IN IMPROVING SOLAR CELL TECHNOLOGY [5].

FIGURE 3: APPLICATION OF NANOMATERIALS IN STRUCTURAL HEALTH MONITORING [4].
The need for a way to scale up nanomanufacturing was addressed by using supervised machine learning methods to predict the locations of nanorods in SEM images, with the goal of extracting dimension estimates from their projection lengths. Machine learning is an active research topic for both statisticians and computer scientists, and new literature on the subject continues to emerge. Pattern recognition, a branch of machine learning, has numerous applications, including facial recognition, medical imaging and diagnostics, and speech recognition. As computing power has grown exponentially [7], statistical computing has become widely used in a variety of fields. We applied these principles to extract features from our micrographs and classify them in order to make predictions and estimate nanorod dimensions.

The following section explores our experimental methodology, starting with the data collection procedure and then covering the methods used for selecting, building, and testing models, extracting image features, and using clustering to infer additional features and to test with multiple classes. The results and analysis are then presented, followed by discussion, conclusions, and recommendations for future research.

2: EXPERIMENTAL METHODOLOGY

2.1: SUPERVISED LEARNING

Statistical learning is the field that deals with making predictions or inferences based on given data. Supervised learning is the subset of statistical learning that focuses on making predictions. In supervised learning, we assume that there is a relationship between the response (dependent variable, output) and one or more predictors (independent variables, inputs, features). We use a set of training data to learn a model, a function of the predictors. The model can then be used to make predictions and is validated using test data. In contrast, unsupervised learning is done without training: the data are simply analyzed to establish trends, relationships, or distinctions between groups of observations [8, pp. 26]. Both are useful, and both were used in this research, but the main focus was supervised learning.

As mentioned before, the goal of supervised learning is to make predictions. To ensure that the predictions are accurate, we want to minimize the prediction error, which can usually be decomposed into bias, model variance, and observation noise (error variance). Equation 1 shows this breakdown for mean squared error (MSE), a popular metric for a continuous response. Bias refers to the error due to approximation (for example, modeling a slightly curved data set with a straight line), and variance refers to how much the approximation would change if it were fit to a different data set (an overly flexible model changes significantly with new data) [8, pp. 33]. In general there is an inverse relationship between bias and variance, which leads to the bias-variance tradeoff, so the goal of supervised learning becomes minimizing the following:

\text{Expected test MSE} = \text{Model Variance} + (\text{Model Bias})^2 + \text{Error Variance} \quad (1)

A model must be selected by minimizing an error function like that in (1) before using it to make predictions. However, estimating the error terms above is limited by the available data and, more importantly, by whether the available data are representative of all the patterns and variations in future samples.
Estimating these error terms therefore requires special techniques. Sampling methods validate a model by repeatedly drawing different samples from the training data
in order to learn more about the model. Cross-validation is one of the most popular sampling methods: it splits the training set into groups, fits the model repeatedly, and assesses performance by averaging the errors across the groups. K-fold cross-validation is very common in statistical learning; the training data, with n observations, are separated into K groups of roughly equal size, and each group is held out in turn while the model is fit to the remaining observations. The special case K = n, in which each fold holds out a single observation and the model is trained on the other n - 1, is called leave-one-out cross-validation (LOOCV). LOOCV gives the best error estimate but becomes extremely computationally expensive when dealing with big data.

One of the most important aspects of our research was feature extraction. We needed to find the features that distinguish the pixels of a nanorod from the background in order to build our model. The effect is very similar to human memory: for example, you may recognize the people, places, and objects you know by their sounds, smells, physical features, or mannerisms (or a combination of them). Several feature detection algorithms are available, and nearly limitless combinations of features can be extracted, as there are approximately 10,000-30,000 different object categories [9]. We needed features that could not only segment out the nanorods correctly but could do so under varying levels of brightness, sharpness, and intensity across multiple images. These differences can be significant, as highlighted in Figure 4.

FIGURE 4: NANOROD MICROGRAPHS. SEM IMAGES OFTEN VARY SIGNIFICANTLY IN BRIGHTNESS, SHARPNESS, AND INTENSITY. A) [10], B) [11].
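For concreteness, the K-fold procedure described above can be written in a few lines of R. The snippet below is only a minimal sketch on simulated data; the data frame and the logistic-regression stand-in classifier are illustrative, not the models used in our experiments.

```r
# Minimal K-fold cross-validation sketch on simulated data (illustrative only;
# our real models were trained on features extracted from SEM images).
set.seed(1)
n   <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- as.integer(dat$x1 + dat$x2 + rnorm(n, sd = 0.5) > 0)  # 0/1 response

K      <- 5
folds  <- sample(rep(1:K, length.out = n))   # assign each observation to a fold
errors <- numeric(K)

for (k in 1:K) {
  train <- dat[folds != k, ]
  test  <- dat[folds == k, ]
  # Any classifier can stand in here; logistic regression keeps the sketch short.
  fit  <- glm(y ~ x1 + x2, data = train, family = binomial)
  pred <- as.integer(predict(fit, newdata = test, type = "response") > 0.5)
  errors[k] <- mean(pred != test$y)          # 0/1 loss on the held-out fold
}

cv_error <- mean(errors)                     # K-fold estimate of the test error
```

The same scheme applies to any of the classifiers discussed below; only the model-fitting line changes.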
2.2: CLASSIFICATION

All supervised learning models are built with the goal of predicting a response from one or more predictors. Models can be built to serve a variety of purposes and to fit a variety of trends, so selecting the proper method is of vital importance. Since the first goal of our research was to pick out nanorods from images, it was clear that we would be using classification algorithms. Humans perform classification instinctively many times per day; classification is simply the association of items with their descriptions (or of other people with their names). While linear regression is used to predict a quantitative (numerical) response, classification is used to predict a qualitative (categorical) response. For our data, we needed classification to determine which pixels were parts of nanorods and which were not.

There are many types of classifiers, or models used to classify data. One popular classifier is logistic regression, which assumes that the logarithm of the odds that an observation belongs to a certain class, the log-odds, is a linear function of the predictors. The coefficients of the linear portion can be estimated using the maximum likelihood method, which chooses the coefficients so that the training observations are classified as accurately as possible [8, pp. 132]. This can be formalized mathematically using the likelihood function, where \beta_0 and \beta_1 are regression coefficients:

\ell(\beta_0, \beta_1) = \prod_{i:\, y_i = 1} p(x_i) \prod_{i':\, y_{i'} = 0} \bigl(1 - p(x_{i'})\bigr) \quad (2)

Logistic regression is mostly used for binary classification. When dealing with more than two classes, linear discriminant analysis (LDA) is commonly used; LDA attempts to approximate the Bayes classifier, which is the ideal classifier [8, pp. 37]. Quadratic discriminant analysis (QDA) is similar to LDA but uses quadratic rather than linear discriminant functions. K-nearest neighbors (KNN) classifies each observation based on the classes of the K nearest observations. Some popular, more computation-intensive methods include decision trees, random forests, boosting, and support vector machines (SVM) [8, pp. 127]. Our group used SVM for classification, which is covered in section 2.4.

2.3: OVERALL FLOW

Our experiments followed the structure of the flow chart in Figure 5.

FIGURE 5: GENERAL FLOW FOR AN EXPERIMENT (FEATURE EXTRACTION, MODEL TRAINING, AND PREDICTION ON NEW IMAGES).

For an experiment, the features of a training image were extracted as a data matrix and then used to train a model. The model was then validated using a new image. This process was repeated for various methods and feature extraction techniques. The success or failure of the model was determined by classification error, or 0/1 loss. Classification error has two components: misdetection (false negative) error and false alarm (false positive) error. The general goal was to solve the following optimization problem:

\text{Maximize}\left(\frac{\text{Number of correctly classified observations}}{\text{Number of observations}}\right) \quad (3)
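As a small illustration of this scoring, the quantity in Eq. (3) and its two error components can be computed directly in R. The 0/1 label vectors below are hypothetical (1 marks a nanorod pixel, 0 the background); they are not data from our experiments.

```r
# Scoring a classifier as in Eq. (3), using hypothetical 0/1 label vectors
# (1 = nanorod/foreground pixel, 0 = background).
truth <- c(1, 1, 0, 0, 1, 0, 0, 1, 0, 0)
pred  <- c(1, 0, 0, 1, 1, 0, 0, 1, 0, 0)

accuracy    <- mean(pred == truth)           # Eq. (3): correct / total = 0.8 here
misdetect   <- mean(pred[truth == 1] == 0)   # nanorod pixels missed (false negatives)
false_alarm <- mean(pred[truth == 0] == 1)   # background flagged as nanorod (false positives)
```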
2.4: SUPPORT VECTOR MACHINES (SVM)

An SVM is a vector space-based classifier that separates training data by class label with a hyperplane chosen to be as far as possible from the points in either class [13]. SVMs are among the most popular machine learning techniques available today [14]. SVM can be used to construct linear or nonlinear classifiers (Figure 6) using the kernel trick. A kernel is a function that quantifies the similarity of two observations and implicitly maps the data to a higher-dimensional feature space. The SVM then learns a hyperplane (a linear classifier) in this high-dimensional feature space, which corresponds to a nonlinear classification boundary in the original space [8, pp. 350]. Using kernels is computationally less expensive than creating new features (data transformations) explicitly. For our research, we used mostly radial and some linear kernels.

FIGURE 6: SAMPLE SVM CLASSIFICATION PLOT.

An SVM can have a hard or a soft margin. A hard-margin classifier does not allow any misclassified observations, whereas a soft margin yields a smoother classifier by allowing some misclassifications [15]. The soft margin's ability to ignore some observations usually results in a better overall fit. The parameter that controls the margin of an SVM is C (cost); another parameter, γ (gamma), parametrizes the kernel function [16].
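Our SVMs were fit in R with the svm() function from the e1071 package. The snippet below is a minimal sketch on simulated two-dimensional data showing a soft-margin, radial-kernel fit and the roles of cost (C) and gamma (γ); the data and parameter values are illustrative only, not those from our experiments.

```r
library(e1071)

# Simulated two-dimensional data with a nonlinear (radial) class boundary.
set.seed(1)
x   <- matrix(rnorm(400), ncol = 2)
dat <- data.frame(x1 = x[, 1], x2 = x[, 2],
                  y  = factor(as.integer(x[, 1]^2 + x[, 2]^2 > 1)))

# Soft-margin SVM with a radial (RBF) kernel: 'cost' is the C parameter
# controlling the margin, and 'gamma' parametrizes the kernel.
fit <- svm(y ~ x1 + x2, data = dat, kernel = "radial", cost = 10, gamma = 0.5)

plot(fit, dat)                 # classification plot, analogous to Figure 6
pred <- predict(fit, dat)
mean(pred == dat$y)            # training accuracy
```

Swapping kernel = "radial" for kernel = "linear" gives the linear classifiers we also tried.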
2.5: SCALE-INVARIANT FEATURE TRANSFORM (SIFT)

SIFT is a robust image descriptor developed by David Lowe in 1999 and commonly used in computer vision for object detection and for point matching between different views of 3D objects. SIFT detects key points (points of interest) and uses histograms to generate a vector of 128 features at each key point; these vectors (descriptors) are then used to classify the image [9].

FIGURE 7: FLOW OF IMAGE CLASSIFICATION USING THE SIFT FEATURE EXTRACTION ALGORITHM [9].

Object detection was critical for this research because we needed to detect the features that distinguish nanorods from the background, and because those features had to hold up to the differences shown in the "Supervised Learning" section. We take this ability for granted as humans because it is so easy for us: humans can distinguish thousands of types of objects [9] with almost no difficulty across a wide range of illuminations, orientations, distances, and distractions. For example, you may recognize your car very quickly from relatively far away. How do you know the car is yours? Surely you do not own the only car of that type in the world; you simply recognize, intuitively and immediately, all the features of the car that make it yours. This is the goal of object detection.

When using SIFT, key points are first detected as scale-space extrema of the Difference-of-Gaussian (DoG) values, and SIFT then extracts a 128-dimensional descriptor vector for each key point [16]. Figure 8a shows SIFT key points overlaid on an image. Plotting the SIFT descriptors was unnecessary for our research, as the descriptors were simply extracted into tables in CSV files; however, the result of plotting the descriptors for an image can be seen in Figure 8b.

FIGURE 8: A) SIFT KEYPOINTS ON A TEST IMAGE [17]; B) SIFT DESCRIPTORS.

The DoG operator normalizes SIFT features and makes them scale invariant, meaning that the features do not vary with rotation, translation, or scaling. SIFT features can also be detected through wide differences in intensity, illumination, and sharpness of an image. This made SIFT a top option, giving us 128 robust, low-variance predictors with which to train our models. However, one issue with SIFT is that it may eliminate critical variations that could help our SVM classify images correctly. This issue was studied further by using K-means clustering, an unsupervised learning method, to consider a multi-level classification problem.

2.6: K-MEANS CLUSTERING & MULTI-LEVEL SVM CLASSIFICATION

Clustering is an unsupervised learning method that separates a data set into several groups of similar observations. We used clustering to allow more than two response classes. Previously, we classified pixels or SIFT descriptors as belonging to either the foreground (nanorods) or the background (not nanorods). However, foreground observations may themselves contain a variety of patterns, and separating these patterns into groups using clustering can improve the accuracy of the learned classifiers. K-means clustering was used to group the foreground and background data so that we could use a multi-class SVM, which is an SVM with more than two classes to separate.

K-means clustering divides n observations into K clusters, where K is a selected value, placing each observation into the cluster with the nearest mean. This is very similar to K-medoids clustering, where each observation is placed into the cluster with the nearest medoid (a representative observation). In R, the partitioning around medoids (PAM) functionality in the Flexible Procedures for Clustering (fpc) package estimates the best value of K [18]. While PAM is meant for K-medoids clustering, the similarities between K-means and K-medoids mean that PAM also gives a good estimate of K for K-means. After clustering a data set with K-means, the cluster number can be used as the response in a multi-class SVM. Foreground and background data were clustered separately, with K = 10 and K = 2, respectively; the data were then combined to run a 12-class SVM on the entire image.
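A minimal R sketch of this clustering step is shown below. The fg and bg descriptor matrices here are simulated stand-ins (in our experiments they were SIFT descriptors split into foreground and background rows), and pamk() from the fpc package is used to estimate K before K-means assigns the cluster labels that become the multi-class response.

```r
library(fpc)   # provides pamk(), partitioning around medoids with estimation of K

# Simulated descriptor matrices standing in for the SIFT descriptors that
# were split into foreground (nanorod) and background rows.
set.seed(1)
fg <- matrix(rnorm(500 * 10, mean =  1), ncol = 10)
bg <- matrix(rnorm(300 * 10, mean = -1), ncol = 10)

# Estimate a reasonable K for each group (PAM is a close cousin of K-means).
k_fg <- pamk(fg, krange = 2:10)$nc
k_bg <- pamk(bg, krange = 2:10)$nc

# Cluster each group with K-means (in our experiments, K = 10 and K = 2).
cl_fg <- kmeans(fg, centers = k_fg)$cluster
cl_bg <- kmeans(bg, centers = k_bg)$cluster

# Combine into one multi-class response: background clusters are numbered
# after the foreground clusters, giving k_fg + k_bg classes in total.
response <- factor(c(cl_fg, cl_bg + k_fg))
features <- rbind(fg, bg)
# 'features' and 'response' can now be passed to a multi-class svm().
```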
2.7: EXPERIMENT SETUP

When building a supervised learning model, the first step is to acquire training data from which to build a model that can predict test data. We began by creating "Ground Truth" (GT) from micrographs acquired from publications and from other labs. GT was created by using Microsoft Paint to color over distinguishable nanorods in the selected micrographs, with each nanorod assigned its own set of RGB (red, green, blue) color values; this GT was then used to generate the training data for our models.

The first predictors tested were the (x, y) pixel coordinates. Note that the (x, y) coordinate system in image processing differs from typical Cartesian coordinates: in a Cartesian system the positive y-axis points vertically upward, whereas in image processing it points vertically downward. Next, SIFT features were used as predictors; for an image with N descriptors, the data formed an N x 128 matrix. The final set of features was based on each pixel's neighbors: for every pixel in a training image we used 25 features (a 5 x 5 neighborhood descriptor per pixel), so for a training image with P pixels the data formed a P x 25 matrix. These features were exported as CSV files from MATLAB and used in R to train an SVM.

When implementing the SVM in R, we used tuning to determine the best combination of parameters for our model. Since we were dealing with large, high-dimensional data sets, tuning became highly computationally expensive; SVM tuning could take hours or even days to execute. The only solutions were to use a faster computer or to write faster code. By default, R does not use parallel computing, so a multi-core computer (and almost all of today's computers have multiple cores) initially provides no advantage. To address the need for faster computing, it was necessary to develop an SVM implementation that worked in parallel on a high-performance multi-core machine.

Tuning optimizes an SVM over a range of parameters, and by default R tunes the SVM by testing one model at a time. The doParallel library in R provides a parallel backend, a network of workers, for the foreach loop. doParallel must be combined with the parallel library, which is included in recent versions of R; the foreach library enables the foreach loop, and along with the iterators library allows the parallel and doParallel libraries to be installed. The e1071 library is needed to run and tune the SVM. The foreach loop can easily be nested to run an SVM for each combination of C and γ [19]. The nested loop did not tune the SVM in the usual sense; instead, it fit one SVM per combination of C and γ, all at once. For a given training data set, the model with the best accuracy (lowest error) was selected, and a single SVM (a much faster computation) was then run with that set of parameters. This implementation provided a major speed increase when dealing with big data and would be advantageous for industrial nanomanufacturers with access to high-performance servers.
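The parallel grid search described above can be sketched as follows. This is a simplified illustration of the idea, not our production code: the training data are simulated, the parameter grid and core count are illustrative, and it assumes the e1071, foreach, and doParallel packages are installed.

```r
library(e1071)
library(foreach)
library(doParallel)

# Simulated stand-in for a training feature matrix exported from MATLAB as CSV.
set.seed(1)
train <- data.frame(x1 = rnorm(300), x2 = rnorm(300))
train$y <- factor(as.integer(train$x1^2 + train$x2 > 0))

# Register a parallel backend (we used 25 of 40 cores; 4 here for illustration).
cl <- makeCluster(4)
registerDoParallel(cl)

costs  <- c(0.1, 10, 100, 1000)
gammas <- c(0.5, 1, 5, 10)

# Nested foreach: fit one SVM per (C, gamma) combination, all in parallel,
# and record each model's 10-fold cross-validation accuracy.
grid <- foreach(C = costs, .combine = rbind) %:%
  foreach(g = gammas, .combine = rbind) %dopar% {
    fit <- e1071::svm(y ~ ., data = train, kernel = "radial",
                      cost = C, gamma = g, cross = 10)
    data.frame(cost = C, gamma = g, accuracy = fit$tot.accuracy)
  }

stopCluster(cl)

# Refit a single (much faster) SVM with the best parameter combination.
best  <- grid[which.max(grid$accuracy), ]
final <- e1071::svm(y ~ ., data = train, kernel = "radial",
                    cost = best$cost, gamma = best$gamma)
```

Because each (C, γ) combination is an independent fit, the workers never need to communicate, which is why the speedup scales almost linearly with the number of cores used.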
3: RESULTS AND DATA ANALYSIS

3.1: DATA COLLECTION

GT was created for 16 SEM images. Numerical data matrices were extracted using MATLAB in the form of CSV files for both training (GT) images and test images, and models were trained in R on a 40-core Linux (Ubuntu) platform. Using GT images to train our models allowed us to predict the locations of nanorods in other images.

3.2: FEATURE EXTRACTION

Our first tests used the (x, y) coordinate values as the only predictors of the foreground/background response. Next, we used SIFT features as predictors, first with a binary response and then, using K-means clustering, for 12-class classification. Using PAM, we found the ideal values of K for the foreground and background data to be 10 and 2, respectively. A section of the data matrix for the image in Figure 8 is shown in Figure 9 to visualize our data, and a sample original image and its GT are shown in Figure 10.

FIGURE 9: EXAMPLE SIFT IMAGE DATA MATRIX. THE IMAGE HAD 2,480 DESCRIPTORS AND A BINARY RESPONSE (0 FOR BACKGROUND AND 1 FOR FOREGROUND).

FIGURE 10: EXAMPLE ORIGINAL IMAGE (A) AND GT (B) [20].
3.3: CLASSIFICATION

SVM tuning originally lasted from as little as 30 minutes for smaller data sets to days for larger ones. To address this issue, I developed the parallel computing method discussed in the Experiment Setup section. Using nested foreach loops, I was able to run an SVM for all combinations of C and γ very quickly. When testing the effectiveness of the code on 25 of our 40 cores, I increased the speed of the tuning process by a factor of roughly 30 for nine combinations of C and γ (an image that originally took about 30 minutes to tune required only 57 seconds). This implementation became very useful for tuning large data sets such as those generated by the 25 feature descriptors per pixel: I tuned one of these images in parallel in about three hours, whereas the default tuning process would have lasted almost four days. After tuning in parallel, a new SVM had to be fit using the parameters from the loop that produced the lowest error, but the parallel implementation was still an order of magnitude faster than the default tuning.

When we applied an SVM model to a test image, we could generate a confusion matrix, which is simply a table of predicted values versus real values where each value corresponds to a class. A binary SVM classifier was trained on one of the GT images and tested on the image in Figure 8a, resulting in an SVM classification plot (Figure 11) and a confusion matrix (Figure 12). Using the confusion matrix and Equation 3, accuracy (η) can be computed for each test of a classifier. Table 1 shows η for 17 experiments that used six images (designated Image 1 through Image 6; see Figures 13 and 14), and Table 2 shows the average and standard deviation of η for each set of features and each value of C and γ.

FIGURE 11: SVM CLASSIFICATION PLOT FOR EXAMPLE TEST IMAGE.

FIGURE 12: CONFUSION MATRIX FOR EXAMPLE SVM.
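The confusion matrix and η follow directly from the predicted and true labels. A minimal R sketch, using hypothetical label vectors rather than our experimental data, is shown below.

```r
# Confusion matrix and accuracy (eta) per Eq. (3), using hypothetical labels.
truth <- factor(c(0, 0, 1, 1, 1, 0, 1, 0, 1, 1))
pred  <- factor(c(0, 1, 1, 1, 0, 0, 1, 0, 1, 1))

cm  <- table(Predicted = pred, Actual = truth)  # confusion matrix, as in Figure 12
eta <- sum(diag(cm)) / sum(cm)                  # eta = correctly classified / total
cm
eta
```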
TABLE 1: EXPERIMENTAL RESULTS. IMAGES 1-6 ARE SHOWN IN FIGURES 13 AND 14.

Experiment | Training (GT) Image No. | Test Image No. | Classes | Features | C    | γ   | Accuracy (η)
1          | 1*                      | 1*             | 12      | 25       | 100  | 0.5 | 0.927537
2          | 5                       | 3              | 2       | (x, y)   | 100  | 5   | 0.761436
3          | 5                       | 4              | 2       | (x, y)   | 100  | 5   | 0.718798
4          | 3                       | 2              | 2       | SIFT     | 0.1  | 0.5 | 0.912782
5          | 3                       | 4              | 2       | SIFT     | 0.1  | 0.5 | 0.798438
6          | 3                       | 5              | 2       | SIFT     | 0.1  | 0.5 | 0.851351
7          | 3                       | 6              | 2       | SIFT     | 0.1  | 0.5 | 0.919440
8          | 1                       | 1              | 2       | SIFT     | 10   | 1   | 0.865906
9          | 1                       | 1              | 2       | SIFT     | 100  | 1   | 0.865906
10         | 1                       | 1              | 2       | SIFT     | 1000 | 1   | 0.865906
11         | 1                       | 1              | 2       | SIFT     | 10   | 5   | 0.865906
12         | 1                       | 1              | 2       | SIFT     | 100  | 5   | 0.865906
13         | 1                       | 1              | 2       | SIFT     | 1000 | 5   | 0.865906
14         | 1                       | 1              | 2       | SIFT     | 10   | 10  | 0.865906
15         | 1                       | 1              | 2       | SIFT     | 100  | 10  | 0.865906
16         | 1                       | 1              | 2       | SIFT     | 1000 | 10  | 0.865906
17         | 4                       | 6              | 2       | SIFT     | 100  | 0.5 | 0.919440
Avg. η: 0.858963   Std. Dev.: 0.053262

FIGURE 13: ORIGINAL IMAGES FROM THE EXPERIMENTS, WITH A-F CORRESPONDING TO IMAGES 1-6 IN TABLE 1: A) [11], B) [21], C) [10], D) [20], E) [22], F) [17].

FIGURE 14: GT IMAGES FROM THE EXPERIMENTS, WITH A-F CORRESPONDING TO IMAGES 1-6 IN TABLE 1: A) [11], B) [21], C) [10], D) [20], E) [22], F) [17].
4: DISCUSSION AND CONCLUSIONS

4.1: DISCUSSION

The main assumption in this research is that all nanorods extend into the substrate. This assumption allows us to use the top and side edges, and the angle of the side edges, to find the nanorods' projection lengths.

TABLE 2: EFFECTS OF FEATURE TYPE, C, AND GAMMA ON ACCURACY. *ONLY ONE TRIAL USED 25 FEATURES; **ALL EXPERIMENTS YIELDED THE SAME RESULT.

Parameter | Value  | Avg. η     | Difference from Total Avg. η | Std. Dev.
Features  | (x, y) | 0.740117   | -0.118846                    | 0.021319
Features  | SIFT   | 0.871043   | 0.012080                     | 0.029669
Features  | 25*    | 0.927537*  | 0.068574*                    | 0*
C         | 0.1    | 0.870503   | 0.011539                     | 0.049352
C         | 10     | 0.865906** | 0.006943**                   | 0**
C         | 100    | 0.846418   | -0.012545                    | 0.072270
C         | 1000   | 0.865906** | 0.006943**                   | 0**
γ         | 0.5    | 0.888165   | 0.029201                     | 0.047467
γ         | 1      | 0.865906** | 0.006943**                   | 0**
γ         | 5      | 0.815590   | -0.043373                    | 0.063082
γ         | 10     | 0.865906** | 0.006943**                   | 0**

Tables 1 and 2 show that the choice of features has the strongest correlation with SVM classifier accuracy. As expected, using only the (x, y) coordinates of the pixels resulted in poor accuracy. The SIFT features were quite robust, and the single 25-feature trial was well above the average, but more trials with all three sets of features are needed before we can say which features are best. Varying C and γ shows very little effect on classifier accuracy overall, although low values of C and γ did produce a marked improvement. This makes sense because we were dealing with big data sets, where giving single data points too much influence could lower classifier accuracy. Although more testing is needed to confirm the positive effect, using multiple classes obtained by clustering, with the cluster number as the response, may improve accuracy dramatically. Based on these results, using SIFT descriptors or the 25-feature descriptors with low values of C and γ in a 12-class SVM classifier should result in high accuracy. Maximizing classifier accuracy will allow industries to classify images with very little error and accurately determine the optimal nanomanufacturing processes.

The SVM classification plot in Figure 11 appears to give an upside-down view of the test image (Figure 8a). This makes sense because the positive y-axis in images is oriented vertically downward, as mentioned in the Experiment Setup section. Knowing this, the plot clearly performs reasonably well, as the boundary separating the nanorods from the background appears clearly in the SVM classification plot. Some misclassifications are visible both in the nanorods and in the background, which is consistent with this particular SVM's accuracy of about 86%.

Most of the error in our experiments was due to the propagation of human error in coloring the GT images; using multiple GT images as training data in the future may minimize this effect. Any remaining error is due to random intrinsic errors. Therefore, running more experiments will be critical to finding the features and parameters that maximize classifier accuracy. Another important source of error is the estimation of nanoparticle dimensions from projection lengths. In the future, it will be important for nanomanufacturers to understand that the extracted dimensions are not exact; the goal, however, is to reduce these errors to the point where valid observations of changes in length can be made.
Experiments 8-16 in Table 1 all resulted in the same accuracy. These results mean that one or more of the following are true:

1. SIFT features are unaffected by C and γ. This is unlikely, because experiments 4-7 show a change in accuracy using SIFT features with low values of C and γ. One possible conclusion is that classifier accuracy stagnated because the C and γ values were too high, meaning that SIFT features should only be used with low C and γ.

2. For all the experiments where η = 0.865906, there were zeroes in the bottom-right corner of the confusion matrix. This may mean that there were errors in the data or that binary classification does not work well for SIFT features. Multi-class classification should be tested for these experimental setups.

3. The accuracy data in Tables 1 and 2 are biased. As mentioned before, more experiments should be run for all combinations of parameters, for both binary and multi-level classification, to get a clearer picture of how to maximize accuracy.

4. The SIFT features are too robust to properly classify our images. SIFT was designed to use only features that do not respond to changes in lighting, viewing angle, or image brightness, sharpness, or intensity, so some of the factors that are critical to properly classifying our images may not be captured by SIFT. For example, changes in size may affect the optical properties of the nanoparticle being examined, and the SIFT features may be too robust to register that change.

4.2: FUTURE WORK AND RECOMMENDATIONS

For future research, more experiments covering all combinations of the methods and parameters shown in Table 1 are recommended to get a clearer picture of which features and parameters work best. Wider ranges of C and γ, as well as intermediate values, should be tested to potentially establish relationships between SVM classifier accuracy, the feature set, and the combination of C and γ. Further use of multi-class classification is also recommended because of its high accuracy (and because more data are needed to confirm that accuracy). Different clustering algorithms, such as K-medoids and hierarchical clustering, should also be attempted for multi-level classification, and the effect of boosting or of changing classification methods should be investigated.

In industry, speed, automation, and accuracy will be critical. Therefore, once classifier accuracy is optimized, I propose the following recommendations before commercializing this research:

1. The code should be adapted to a faster language, such as C++ or Python, to improve computation speed and to create a single software package that can be licensed. Computer science professionals are increasingly being educated in these languages, and our programs should be written in languages they are familiar with, while also implementing optimized algorithms, multiple processors, and GPUs.

2. All code should be written to utilize multi-core machines and, if possible, should automatically run on a specified percentage of the available cores. This would minimize computation time for all nanomanufacturers who use the software.

3. An autorun script should be implemented, and a user-friendly graphical user interface (GUI) should be developed. This would reduce the cost of labor for nanomanufacturers.
4.3: CONCLUSIONS

This research shows that supervised learning algorithms have potential to be an excellent solution for nanoparticle dimension estimation and control, which would be a significant quality engineering development. The completion of this research will enable the scale-up, standardization, and commercialization of nanomanufacturing processes. The initial classifier accuracy is promising and points to a bright future for this technology as further improvements are discovered.
REFERENCES

[1] G. Ali Mansoori, P. Mohazzabi, P. McCormack, S. Jabbari, "Nanotechnology in cancer prevention, detection and treatment: bright future lies ahead," World Review of Science, Technology and Sustainable Development, vol. 4, nos. 2/3, 2007.

[2] D. Jenvey. (2012, November 4). Materials for Dermatological Nanotechnology [Online]. Wikispaces, Tangient LLC. Available: http://nanotechnology-cis.wikispaces.com/Materials+for+dermatological+nanotechnology. Last accessed: 24 July 2015.

[3] R. Elhajjar, V. La Saponara, A. Muliana. Smart Composites: Mechanics & Design, CRC Press, Taylor & Francis Group, LLC, Boca Raton, FL, 2014.

[4] I. Kang, M. J. Schulz, J. H. Kim, V. Shanov, D. Shi, "A carbon nanotube strain sensor for structural health monitoring," Smart Materials and Structures, vol. 15, pp. 737-748, 2006.

[5] L. J. Andrés, M. F. Menéndez, D. Gómez, A. L. Martínez, J. P. Kettle, A. Menéndez, B. Ruiz, "Rapid synthesis of ultra-long silver nanowires for tailor-made transparent conductive electrodes: proof of concept in organic solar cells," Nanotechnology, vol. 26, 2015.

[6] A. Mandal. (2012, November 4). Properties of Nanoparticles [Online]. Available: http://www.news-medical.net/health/Properties-of-Nanoparticles.aspx. Last accessed: 24 July 2015.

[7] Berkeley School of Information. (2014, March 5). Moore's Law and Computer Processing Power [Online]. Available: http://datascience.berkeley.edu/moores-law-processing-power/.

[8] G. James, D. Witten, T. Hastie, R. Tibshirani. An Introduction to Statistical Learning with Applications in R, Springer Science+Business Media, New York, 2013.

[9] J. Kim, B. Kim, S. Savarese, "Comparing Image Classification Methods: K-Nearest Neighbor and Support Vector Machines," Proc. 6th WSEAS International Conference on Circuits, Systems, Signals and Telecommunications, Cambridge, pp. 133-138, 2012.

[10] K. Kim, K. Utashiro, Y. Abe, M. Kawamura, "Structural Properties of Zinc Oxide Nanorods Grown on Al-Doped Zinc Oxide Seed Layer and Their Applications in Dye-Sensitized Solar Cells," Materials, 7(4):2522-2533, 2014.

[11] C. Thelander, P. Agarwal, S. Brongersma, J. Eymery, L. F. Feiner, A. Forchel, M. Scheffler, W. Riess, B. J. Ohlsson, U. Gösele, L. Samuelson, "Nanowire-based one-dimensional electronics," Materials Today, vol. 9, no. 10, pp. 28-35, October 2006.

[12] C. D. Manning, P. Raghavan, H. Schütze. An Introduction to Information Retrieval, Cambridge University Press, Cambridge, England, 1 April 2009.

[13] X. Wu, V. Kumar, J. R. Quinlan, J. Ghosh, Q. Yang, H. Motoda, G. J. McLachlan, A. Ng, B. Liu, P. S. Yu, Z. Zhou, M. Steinbach, D. J. Hand, D. Steinberg, "Top 10 algorithms in data mining," Knowledge and Information Systems, no. 14:1-37, 2008.

[14] Y. Liu, H. Zhang, Y. Wu, "Hard or Soft Classification? Large-margin Unified Machines," North Carolina State University, Raleigh, North Carolina, 10 January 2011.

[15] SciKit-Learn. (2014). RBF SVM Parameters [Online]. Available: http://scikit-learn.org/stable/auto_examples/svm/plot_rbf_parameters.html. Last accessed: 24 July 2015.

[16] T. Lindeberg. (2012). Scale Invariant Feature Transform [Online]. Scholarpedia. Available: http://www.scholarpedia.org/article/Scale_Invariant_Feature_Transform. Last accessed: 24 July 2015.

[17] H. Gao, W. Cai, P. Shimpi, H. Lin, P. Gao, "(La,Sr)CoO3/ZnO nanofilm-nanorod diode arrays for photo-responsive moisture and humidity detection," Journal of Physics D: Applied Physics, 43(27):272002, 2010.

[18] C. Hennig. (2014, October 2). Flexible Procedures for Clustering [Online]. CRAN Repository. Available: https://cran.r-project.org/web/packages/fpc/fpc.pdf.

[19] S. Weston, "Nesting Foreach Loops," Revolution Analytics, Redmond, Washington, 10 April 2014.

[20] R. Wang, H. Tan, Z. Zhao, G. Zhang, L. Song, W. Dong, Z. Sun, "Stable ZnO@TiO2 core/shell nanorod arrays with exposed high energy facets for self-cleaning coatings with anti-reflective properties," Journal of Materials Chemistry A, 2:7313-7318, 2014.

[21] Y. Li, J. Kubota, K. Domen, "A Protocol for Fabrication of Barium-doped Tantalum Nitride Nanorod Arrays," Protocol Exchange, Nature Publishing Group, doi 10.1038/protex.2013.080, 2013.

[22] Y. Luo, L. Wang, Y. Zou, X. Sheng, L. Chang, D. Yang, "Electrochemically Deposited Cu2O on TiO2 Nanorod Arrays for Photovoltaic Application," Electrochemical and Solid-State Letters, 15(2):H34-H36, 2011.