QUALITY ISSUES IN
NANOMANUFACTURING
THADDEUS BERGER1 , MOSTAFA GILANIFAR2 , TANMOY DAS2 , GRANT KLEINER2 , DR. ABHISHEK
SHRIVASTAVA2 *
1 FLORIDA INSTITUTE OF TECHNOLOGY
2 FLORIDA STATE UNIVERSITY
HIGH-PERFORMANCE MATERIALS INSTITUTE
FLORIDA A&M UNIVERSITY – FLORIDA STATE UNIVERSITY, COLLEGE OF ENGINEERING
2525 POTTSDAMER STREET, TALLAHASSEE, FLORIDA 32310
ABSTRACT
Nanoparticles have potential in a variety of applications, including cancer diagnosis and treatment,
structural health monitoring and β€œsmart” buildings, and improved solar cells. Nanoparticle
fabrication, however, is currently not standardized and not viable on an industrial scale. The scale-
up of nanomanufacturing requires application of quality engineering tools for optimizing process
yield, variance reduction, and process monitoring and control. This requires methods for estimating
nanoparticle dimensions and spatial arrangement, as these significantly influence nanomaterial
thermal, physical, optical, and electromagnetic properties. The objective of this research is to use
supervised learning algorithms and machine learning to develop a system to automatically detect
nanoparticles and estimate size and spatial distribution.
This paper will first go into some detail on the applications of nanotechnology, the necessity of an
industrial-scale manufacturing procedure, and how supervised learning ties into achieving the goal
of commercialization. Then, background will be given on supervised learning before discussing
classification. After outlining the flow of the project, detail will be given on our chosen classification
and feature extraction techniques as well as clustering methods used for multi-class classification.
1: INTRODUCTION
Improving the scalability of nanomaterial
production has many commercial applications.
Composites with nanomaterials are becoming
increasingly common in research, and
applications for nanomaterials exist in many
commercial venues including cancer cell
targeting and treatment [1] [2], structural
health monitoring [3] [4], and more effective
solar panels [5]. The high surface area-to-
volume ratios and ideal thermal, physical,
optical, and electromagnetic properties of
nanoparticles [6] make nanoparticles critical to
the commercialization of modern technology.
Currently, labs around the country are using
nanomaterials for research and design of
lighter, stronger materials. Numerous labs use
scanning electron microscopes (SEM), but largely
for qualitative purposes. No all-in-one
process or software package currently exists to
provide a quick, affordable way to make
nanomaterial production viable at an
industrial scale. Designing such a process
would allow the development of quality
engineering tools to optimize process yield and
reduce variance. Without such tools,
nanotechnology will be prohibitively expensive to
produce on an industrial scale. The system to
enable the development of the necessary
engineering tools would use data extracted from
the SEM images to learn patterns from the data
and allow predictions to be made. Machine
learning can be applied to complete this task and
provide the tools needed to scale up
nanomanufacturing. Therefore, the development
of automated systems for estimating size and
spatial distribution, which heavily influence
material properties at the nanoscale, is necessary
for the scale-up of nanomanufacturing processes.
Dimensional estimation and control would allow
industries to quickly, accurately, and affordably
determine the best nanomanufacturing processes.
This capability would also allow for the
standardization of nanomanufacturing processes,
further improving scalability.
FIGURE 1: APPLICATION OF NANOMATERIALS IN CANCER TREATMENT [2].

FIGURE 2: APPLICATION OF NANOMATERIALS (IN THIS CASE, SILVER NANOWIRES) IN IMPROVING SOLAR CELL TECHNOLOGY [5].

FIGURE 3: APPLICATION OF NANOMATERIALS IN STRUCTURAL HEALTH MONITORING [4].
The need for a way to scale up nanomanufacturing was addressed by using supervised machine learning
methods to correctly predict the locations of nanorods in SEM images with the goal of extracting dimension
estimates based on the projection lengths. Machine learning is an active research topic for both
statisticians and computer scientists, and as a result, new literature on this subject continues to
emerge. Pattern recognition, a branch of machine learning, has numerous applications including
facial recognition, medical imaging and diagnostics, and speech recognition, to name a few. As
computing power has grown exponentially [7], statistical computing has become widely used in a
variety of fields. We applied all of these principles to extract features from and classify our
micrographs in order to make predictions and estimate nanorod dimensions.
The following section will explore our experimental methodology, starting with our data collection
procedure before giving an overview of the methods used for selecting, building, and testing models,
extracting image features, and the use of clustering to infer additional features and for testing with multiple
classes. Then, the results and analysis will be presented, followed by discussion and our conclusions and
recommendations for future research.
2: EXPERIMENTAL METHODOLOGY
2.1: SUPERVISED LEARNING
Statistical learning is the field that deals with making predictions or inferences based on given data.
Supervised learning is a subset of statistical learning which focuses on making predictions. In
supervised learning, we assume that there is a relationship between the response (dependent
variable, output) and one or more predictors (independent variables, inputs, features). We use a set
of training data to learn a model, a function of the predictors. The model can be used to make
predictions, and is validated using test data. In contrast, unsupervised learning is done without
training – you simply analyze the data to establish trends, relationships, or distinctions between
groups of observations [8, pp. 26]. Both are very useful, and both were used in this research, but the
main focus was supervised learning.
As mentioned before, the goal of supervised learning is to make predictions. To ensure that the
predictions are accurate, we want to minimize the error in our predictions. The prediction error can
usually be decomposed into its components of bias, model variance, and observation noise
(variance). Equation 1 shows this breakdown for mean squared error (MSE), a popular metric for
continuous response. Bias refers to the error due to approximation (e.g., modeling a slightly curved
data set with a straight line), and variance refers to how much the fitted model would change if it were
trained on a different data set (e.g., an overly flexible model would change significantly with new
data) [8, pp. 33]. In general there is an inverse relationship between bias and variance, which leads
to the bias-variance tradeoff, so the goal of supervised learning becomes to minimize the following:
$$\text{Expected test MSE} = \text{Model Variance} + (\text{Model Bias})^2 + \text{Error Variance} \qquad (1)$$
A model must be selected by minimizing an error function like that in (1) before using it to make
predictions. However, estimating the error terms above is limited by available data, and more
importantly, on whether the available data is representative of all the patterns and variations in
future samples. The estimation, thus, requires special techniques for making valid error estimates.
Sampling methods validate a model by repeatedly drawing different samples from the training data
to draw more information about the model. Cross validation is one of the most popular sampling
methods. Cross validation splits the training set into groups, fits the model on all but one group,
assesses it on the held-out group, and averages the errors across all the groups. K-fold cross
validation is very common in statistical learning; the training data, with n observations, is separated
into K groups of almost equal size. A special case is K = n, where each group contains a single
observation and each model is trained on the remaining n-1 observations; this is called leave-one-out
cross validation (LOOCV). LOOCV gives the least biased error estimate, but becomes extremely
computationally expensive when dealing with big data.
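As a concrete illustration, the following minimal R sketch carries out K-fold cross validation by hand to estimate the expected test MSE of Equation (1) for a simple linear model; the simulated data and the model are placeholders, not this project's micrograph features.

```r
# Minimal K-fold cross-validation sketch in R, estimating the expected test MSE
# of Equation (1) for a simple linear model. The data below are simulated
# placeholders, not this project's micrograph features.
set.seed(1)
dat <- data.frame(x = runif(200, 0, 10))
dat$y <- 2 + 0.5 * dat$x + rnorm(200)               # response with observation noise

K     <- 5
folds <- sample(rep(1:K, length.out = nrow(dat)))   # random fold label for each row

fold_mse <- sapply(1:K, function(k) {
  fit  <- lm(y ~ x, data = dat[folds != k, ])       # fit on the other K-1 folds
  pred <- predict(fit, newdata = dat[folds == k, ]) # predict the held-out fold
  mean((dat$y[folds == k] - pred)^2)                # test MSE for this fold
})

cv_mse <- mean(fold_mse)   # CV estimate of expected test MSE; K = nrow(dat) gives LOOCV
cv_mse
```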
One of the most important aspects of our research was feature extraction. We needed to find the
features which distinguish the pixels of a nanorod from the background in order to build our model.
The effect is very similar to human memory – for example, you may recognize the people, places, and
objects you know based on their sounds, smells, physical features, or mannerisms (or a combination
of them). There are several feature detection algorithms available, and nearly limitless combinations
of features can be extracted; humans, by comparison, can recognize roughly 10,000-30,000 different
object categories [9]. We needed features that could not only segment out the nanorods in the images
correctly, but do so under varying levels of brightness, sharpness, or intensity across
multiple images. These differences can be significant, as highlighted in Figure 4.
2.2: CLASSIFICATION
All supervised learning models are built with the goal of predicting a response to one or more
predictors. Models can be built to serve a variety of purposes and to fit a variety of trends. For this
reason, selecting the proper method is of vital importance. Since the first goal of our research was to
pick out nanorods from images, it was clear that we would be using classification algorithms. Humans
perform classification instinctively many times per day. Classification is the simple association of
items to their descriptions (or other people to their names). While linear regression is used when
predicting a quantitative (numerical) response, classification is used when predicting a qualitative
(categorical) response. For our data, we needed to use classification to determine which pixels were
parts of nanorods and which were not.
There are many types of classifiers, or models used to classify data. One popular classifier is logistic
regression. This classifier assumes that the logarithm of the odds that an observation will be in a
certain class, or log-odds, is linear. The coefficients of the linear portion can be estimated using the
maximum likelihood method, which chooses the coefficients so that the predicted class probabilities
correspond as closely as possible to the observed classes [8, pp. 132]. This can be formalized
mathematically using the likelihood function, where Ξ²0 and Ξ²1 are the regression coefficients:

$$\ell(\beta_0, \beta_1) = \prod_{i:\, y_i = 1} p(x_i) \prod_{i':\, y_{i'} = 0} \bigl(1 - p(x_{i'})\bigr) \qquad (2)$$

FIGURE 4: NANOROD MICROGRAPHS, A) [10], B) [11]. SEM IMAGES OFTEN VARY SIGNIFICANTLY IN BRIGHTNESS, SHARPNESS, AND INTENSITY.
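As an illustrative sketch (not part of the original analysis), the R snippet below fits a logistic regression by maximum likelihood with glm(); the simulated x and y are placeholders standing in for a single predictor and a binary foreground/background response.

```r
# Minimal logistic regression sketch in R: glm() with family = binomial fits
# beta0 and beta1 by maximum likelihood, i.e. by maximizing Equation (2).
# The simulated x and y are placeholders, not this project's image features.
set.seed(2)
x <- rnorm(300)
p <- 1 / (1 + exp(-(-0.5 + 2 * x)))     # true P(y = 1 | x) under a linear log-odds model
y <- rbinom(300, size = 1, prob = p)    # binary response: 1 = foreground, 0 = background

fit <- glm(y ~ x, family = binomial)    # maximum likelihood estimates of beta0, beta1
coef(fit)                               # estimated intercept (beta0) and slope (beta1)
predict(fit, newdata = data.frame(x = 1), type = "response")  # predicted P(y = 1 | x = 1)
```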
Logistic regression is mostly used for binary classification. When dealing with more than two classes,
linear discriminant analysis (LDA) is commonly used. LDA attempts to approximate the Bayes
classifier, which is the ideal classifier [8, pp. 37]. Quadratic discriminant analysis (QDA) is similar to
LDA, but uses quadratic discriminant functions instead of linear. K-nearest neighbors (KNN) is a
classification method which attempts to classify observations based on the K nearest observations.
Some popular, more computer-intensive methods include decision trees, random forests, boosting,
and support vector machines (SVM) [8, pp. 127]. Our group used SVM for classification, which is
covered in section 2.4.
2.3: OVERALL FLOW
Our experiments followed the structure of the flow chart in Figure 5.
FIGURE 5: GENERAL FLOW FOR AN EXPERIMENT.
For an experiment, the features of a training image were extracted as a data matrix and then used to
train a model. Then, the model was validated using a new image. This process was repeated for
various methods and feature extraction techniques. The success or failure of the model was
determined by classification error or 0/1 loss. Classification error has two components: misdetection
(false negative) error and false alarm (false positive) error. The general goal was to solve the
following optimization problem:

$$\text{Maximize} \left( \frac{\text{Number of correctly classified observations}}{\text{Number of observations}} \right) \qquad (3)$$
2.4: SUPPORT VECTOR MACHINES (SVM)
SVM is a vector space-based classifier which
separates training data based on their class labels
with a hyperplane such that the hyperplane is the
farthest possible from points in either class [13].
SVMs are one of the most popular machine learning
techniques available today [14]. SVM can be used to
construct linear or nonlinear classifiers (Figure 6)
using the kernel trick. A kernel is a function which
quantifies the similarity of two observations and
implicitly maps the data to a higher dimensional
feature space. The SVM then learns a linear
classifier (a hyperplane) in this high-dimensional
feature space, resulting in nonlinear classification
boundaries in the original space [8, pp. 350]. Using
kernels is computationally less expensive than
creating new features (data transformations)
explicitly. For our research, we used mostly radial
and some linear kernels. SVM can have a hard or soft
margin. A hard margin classifier does not allow any misclassified training observations. A soft margin
yields a smoother classifier by allowing some misclassifications [15]. The soft margin’s ability to
ignore some observations usually results in a better overall fit. The parameter which controls the
margin of an SVM is C (cost). Another parameter, Ξ³ (gamma), parametrizes the kernel function [16].
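A minimal sketch of such a soft-margin, radial-kernel SVM using R's e1071 package is shown below; the simulated two-feature data set is a placeholder, not the project's image features, and the cost and gamma values are arbitrary.

```r
# Minimal sketch of a soft-margin SVM with a radial kernel using the e1071
# package; cost corresponds to C and gamma to the kernel parameter discussed
# above. The two-feature data set is simulated, not the project's image data.
library(e1071)

set.seed(3)
train <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
train$class <- factor(ifelse(train$x1^2 + train$x2^2 > 1.5, "background", "foreground"))

svm_fit <- svm(class ~ x1 + x2, data = train,
               kernel = "radial",   # nonlinear boundary via the kernel trick
               cost   = 10,         # C: penalty for margin violations (soft margin)
               gamma  = 0.5)        # width of the radial kernel

plot(svm_fit, train)                # SVM classification plot, as in Figure 6
table(predicted = predict(svm_fit, train), truth = train$class)
```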
2.5: SCALE-INVARIANT FEATURE TRANSFORM (SIFT)
SIFT is a robust image descriptor developed by
David Lowe in 1999 which is commonly used in the
field of computer vision for object detection and
point matching between different views of 3D
objects. SIFT detects key points (points of interest)
for features. Histograms are used to generate a
vector of 128 features at each key point. These
vectors (descriptors) are used to classify the image
[9]. Object detection was critical for this research,
because we needed to be able to detect the features
which distinguished nanorods from the
background and because these features had to hold
up to the differences shown in the β€œSupervised
Learning” section. We take this ability for granted as
humans because it is so easy for us. Humans are
capable of distinguishing thousands of types of
objects [9] with almost no difficulty through a wide range of illuminations, orientations, distances,
and distractions. For example, you may be able to recognize your car very quickly from relatively far
away. But how do you know the car is yours? Surely you do not own the only car of that type in the
world. You simply recognize the car intuitively, immediately recognizing all the features of the car
that make it yours, and this is the goal of object detection.

FIGURE 6: SAMPLE SVM CLASSIFICATION PLOT.

FIGURE 7: FLOW OF IMAGE CLASSIFICATION USING SIFT FEATURE EXTRACTION ALGORITHM [9].

When
using SIFT, key points are first detected as the scale-space
extrema of the Difference-of-Gaussian (DoG) values, and SIFT
extracts a 128-dimensional descriptor vector for each key point
[16]. Figure 8a shows a plot of SIFT key points overlaid on an
image. Plotting the SIFT descriptors was unnecessary for our
research, as the descriptors were simply extracted into tables
in CSV files. However, the results of plotting the SIFT
descriptors for an image can be seen in Figure 8b.
The DoG operator normalizes SIFT features and makes them
scale invariant; the features also do not vary with rotation,
translation, or scaling. SIFT features can
be detected through wide differences in intensity, illumination,
and sharpness of an image. This made SIFT a top option, giving
us 128 low-variance predictors with which to train our models.
However, one issue with SIFT is that it may eliminate
critical variations which could help our SVM correctly classify
images. This issue was studied further by using K-means
clustering, an unsupervised learning method, to consider a
multi-level classification problem.
2.6: K-MEANS CLUSTERING & MULTI-LEVEL SVM CLASSIFICATION
Clustering is an unsupervised learning method which separates a data set into several groups of
similar observations. We used clustering to allow us to use more than two response classes.
Previously, we classified pixels or SIFT descriptors as belonging to either the foreground (nanorods)
or background (not nanorods). However, foreground observations may themselves have a variety of
patterns. Separating these patterns into separate groups, using clustering, can improve the accuracy
of the learned classifiers. K-means clustering was used to group the foreground and background data
so that we could use a multi-class SVM, which is an SVM with more than two classes to separate. K-
means clustering simply divides n observations into K clusters, where K is a selected value. In K-
means clustering each observation is placed into the cluster with the nearest mean. This is very
similar to K-medoids clustering, where each observation is placed into the cluster with the nearest
median. In R, there is a function to partition around medoids (PAM), part of a package called Flexible
Procedures for Clustering (FPC), which estimates the best value of K [18]. While PAM is meant for K-
medoids clustering, the similarities between K-means and K-medoids mean that PAM also gives a
good estimate of K for K-means. After using K-means to cluster a data set, cluster number can be used
as the response in a multi-class SVM. Foreground and background data were clustered separately,
with K = 10 and K = 2, respectively. Then, the data was combined to run a 12-class SVM on the entire
image.
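The sketch below illustrates this workflow in R, using fpc's pamk() to suggest K and kmeans() to form the clusters whose labels become the multi-class response; the foreground and background matrices are random placeholders rather than the project's SIFT descriptors, and the krange is an illustrative choice.

```r
# Sketch of the clustering step in R: fpc's pamk() suggests a value of K, and
# kmeans() assigns cluster labels that can serve as a multi-class response.
# The foreground/background matrices are random placeholders, not the
# project's SIFT descriptors, and krange is an illustrative choice.
library(fpc)

set.seed(4)
fg <- matrix(rnorm(500 * 4), ncol = 4)             # stand-in foreground descriptors
bg <- matrix(rnorm(200 * 4, mean = 3), ncol = 4)   # stand-in background descriptors

k_fg <- pamk(fg, krange = 2:10)$nc                 # estimated K for the foreground
k_bg <- pamk(bg, krange = 2:10)$nc                 # estimated K for the background

fg_clusters <- kmeans(fg, centers = k_fg)$cluster
bg_clusters <- kmeans(bg, centers = k_bg)$cluster

# Offset the background labels so they do not collide with foreground labels,
# giving a single factor suitable as the response of a multi-class SVM.
response <- factor(c(fg_clusters, bg_clusters + k_fg))
table(response)
```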
FIGURE 8: A) SIFT KEYPOINTS ON A TEST IMAGE [17]; B) SIFT DESCRIPTORS.
2.7: EXPERIMENT SETUP
When building a supervised learning model, the first step is to acquire training data to build a model
to predict test data. We began by creating β€œGround Truth” (GT) from micrographs which we acquired
from publications and from other labs. GT was used to generate training data which was used to train
our models. GT was created by using Microsoft Paint to color over distinguishable nanorods in the
selected micrographs. Each nanorod was assigned a set of RGB (red, green, blue) color values.
The first predictors tested were the (x, y) pixel coordinates. The (x, y) coordinate system in image
processing is different from typical Cartesian coordinates: in a Cartesian system the positive y axis
points vertically upward, whereas in image processing it points vertically downward. Next, SIFT features
were used as predictors. For an image with N descriptors, the data was an Nx128 matrix. The next
set of features used was based on a pixel’s neighbors. For every pixel in a training image, we used 25
features (a 5x5 neighborhood descriptor per pixel). Thus, for a training image with P pixels, the data was a
Px25 matrix. These features were extracted in CSV format in MATLAB to be used in R to train an SVM.
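For illustration only, the short R sketch below builds such a Px25 matrix of 5x5 neighborhood intensities for a random placeholder image; in this project the actual extraction was done in MATLAB and exported to CSV.

```r
# Conceptual R sketch of the 25-feature (5x5 neighborhood) extraction: each
# interior pixel contributes one row containing the 25 surrounding intensities.
# In the project this extraction was done in MATLAB and exported to CSV; the
# random image below is only a placeholder.
set.seed(5)
img <- matrix(runif(100 * 100), nrow = 100)        # stand-in grayscale image

half    <- 2                                       # 5x5 window: 2 pixels on each side
centers <- expand.grid(r = (1 + half):(nrow(img) - half),
                       c = (1 + half):(ncol(img) - half))

features <- t(mapply(function(r, c) {
  as.vector(img[(r - half):(r + half), (c - half):(c + half)])  # 25 intensities
}, centers$r, centers$c))

dim(features)                                      # P x 25 matrix of pixel features
write.csv(features, "neighborhood_features.csv", row.names = FALSE)
```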
When implementing the SVM in R, we used tuning to determine the best combination of parameters
for our model. Since we were dealing with large, high-dimensional data sets, tuning became highly
computationally expensive. SVM tuning could take hours or even days to execute. The only solutions
were to use a faster computer or write faster code. R, by default, does not utilize parallel computing,
so using a multi-core computer (almost all of today’s computers have multiple cores) initially
provides no advantage. To address the need for faster computing, it was necessary to develop an SVM
implementation which worked in parallel on a high-performance multi-core machine. Tuning
optimizes an SVM over a range of parameters. R tunes the SVM by testing one model at a time by
default. The doParallel library in R provides a parallel backend, or a parallel network of workers, for
the foreach loop. DoParallel must be combined with the parallel library, which is included in recent
versions of R. The foreach library provides the foreach loop and, along with the iterators library, is
required for the doParallel library to be installed. The e1071 library is needed to run and
tune SVM. The foreach loop can be easily nested to run an SVM for each combination of C and Ξ³ [19].
The foreach loop did not tune the SVM directly; the loop ran an SVM for each combination of C and Ξ³, all at
once. For a given training data set, the model with the best accuracy (lowest error) was selected and
a single SVM (a much faster computation) was run with that set of parameters. This implementation
provided us with a major speed increase when dealing with big data and would be advantageous for
industrial nanomanufacturers with access to high-performance servers.
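A condensed sketch of this parallel grid search is given below, using foreach's nesting operator with a doParallel backend and e1071's built-in cross validation as the scoring step; the worker count, parameter grid, and toy training set are placeholders rather than the exact project script.

```r
# Condensed sketch of the parallel (C, gamma) grid search using foreach's
# nesting operator with a doParallel backend; the worker count, parameter grid,
# and toy training set are placeholders rather than the project's exact script.
library(e1071)
library(foreach)
library(doParallel)

cl <- makeCluster(4)                 # e.g. 4 workers; the project used up to 25 of 40 cores
registerDoParallel(cl)

costs  <- c(0.1, 10, 100)
gammas <- c(0.5, 1, 5)

set.seed(6)
train <- data.frame(x1 = rnorm(300), x2 = rnorm(300))
train$class <- factor(ifelse(train$x1 * train$x2 > 0, 1, 0))

# Nested foreach: one SVM per (C, gamma) combination, all evaluated in parallel.
grid <- foreach(C = costs, .combine = rbind) %:%
  foreach(g = gammas, .combine = rbind, .packages = "e1071") %dopar% {
    fit <- svm(class ~ ., data = train, kernel = "radial",
               cost = C, gamma = g, cross = 5)          # 5-fold CV inside e1071
    data.frame(C = C, gamma = g, accuracy = fit$tot.accuracy)
  }
stopCluster(cl)

best <- grid[which.max(grid$accuracy), ]                 # best accuracy = lowest error
final_svm <- svm(class ~ ., data = train, kernel = "radial",
                 cost = best$C, gamma = best$gamma)      # single refit with best parameters
best
```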
3: RESULTS AND DATA ANALYSIS
3.1: DATA COLLECTION
GT was created for 16 SEM images. Numerical data matrices were extracted using MATLAB in the
form of CSV files for both training (GT) images and test images. Models were trained using R on a 40-
core Linux (Ubuntu) platform. Using GT images to train our models allowed us to predict the locations
of nanorods in other images.
3.2: FEATURE EXTRACTION
Our first tests used the (x, y) coordinate values
as the only variables to predict the response of
foreground or background. Next, we used SIFT
features as predictors, first with a binary
response, and then using K-means clustering to
conduct 12-class classifications. Using PAM, we
found the ideal value of K for the foreground
and background data to be 10 and 2,
respectively. A section of the data matrix for
the image in Figure 8a is shown in Figure 9 to
visualize our data. A sample training image and
its GT are shown in Figure 10, with SIFT
keypoints plotted.

FIGURE 9: EXAMPLE SIFT IMAGE DATA MATRIX. THE IMAGE HAD 2,480 DESCRIPTORS AND A BINARY RESPONSE (0 FOR BACKGROUND AND 1 FOR FOREGROUND).

FIGURE 10: EXAMPLE ORIGINAL IMAGE (A) AND GT (B) [20].
3.3: CLASSIFICATION
SVM tuning originally lasted from as little as 30 minutes for smaller data sets to days for larger data
sets. To address this issue, I developed a parallel computing method as discussed in the Experiment
Setup section. Using nested foreach loops, I was able to run an
SVM for all combinations of C and Ξ³ very quickly. When testing
the effectiveness of my code, running on 25 of our 40 cores, I was
able to increase the speed of the tuning process by a factor of 30
(the image took ~30 minutes to tune originally but only 57
seconds with my code) when running nine combinations of C and
Ξ³. This implementation became very useful for tuning large data
sets like those generated by the 25 feature descriptors for every
pixel. I tuned one of these images in parallel over three hours,
meaning that the default tuning process would have lasted
almost four days. After tuning in parallel, a new SVM had to be
made using the parameters from the loop which resulted in the
lowest error. However, the parallel implementation was still an
order of magnitude faster than the default tuning.
When we applied a model from our SVM to a test image, we could
generate a table called a confusion matrix, which is simply a table of predicted values vs. real values
where each value corresponds to a class. A binary SVM classification was trained on the GT image in
Figure 10 and tested with the image in Figure 8a, resulting in a confusion matrix (Figure 12) and an
SVM classification plot (Figure 11). Using the confusion matrix and Equation 3, accuracy (Ξ·) can be
computed for each test of a classifier. Table 1 shows Ξ· for 17 experiments which used six images
(designated Image 1 – Image 6; see Figures 13 and 14). Table 2 shows the average and standard
deviation of Ξ· for each set of features and each value of C and Ξ³.
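As an illustration of this computation, the short R sketch below builds a confusion matrix with table() and computes Ξ· as in Equation (3); the simulated training and test sets are placeholders, not the project's images.

```r
# Sketch of building a confusion matrix with table() and computing the accuracy
# eta of Equation (3); the simulated train/test split stands in for a GT image
# and a new test image.
library(e1071)

set.seed(7)
make_data <- function(n) {
  d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
  d$class <- factor(ifelse(d$x1 + d$x2 > 0, "foreground", "background"))
  d
}
train <- make_data(300)
test  <- make_data(100)

fit  <- svm(class ~ ., data = train, kernel = "radial", cost = 10, gamma = 0.5)
pred <- predict(fit, newdata = test)

conf_mat <- table(predicted = pred, truth = test$class)   # confusion matrix (cf. Figure 12)
conf_mat

eta <- sum(diag(conf_mat)) / sum(conf_mat)                 # Equation (3): correct / total
eta
```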
FIGURE 11: SVM CLASSIFICATION PLOT FOR EXAMPLE TEST IMAGE.

FIGURE 12: CONFUSION MATRIX FOR EXAMPLE SVM.
Experimental Results

Experiment  Training (GT)  Test       Classes  Features  C     Ξ³    Accuracy (Ξ·)
            Image No.      Image No.
1           1*             1*         12       25        100   0.5  0.927537
2           5              3          2        (x, y)    100   5    0.761436
3           5              4          2        (x, y)    100   5    0.718798
4           3              2          2        SIFT      0.1   0.5  0.912782
5           3              4          2        SIFT      0.1   0.5  0.798438
6           3              5          2        SIFT      0.1   0.5  0.851351
7           3              6          2        SIFT      0.1   0.5  0.919440
8           1              1          2        SIFT      10    1    0.865906
9           1              1          2        SIFT      100   1    0.865906
10          1              1          2        SIFT      1000  1    0.865906
11          1              1          2        SIFT      10    5    0.865906
12          1              1          2        SIFT      100   5    0.865906
13          1              1          2        SIFT      1000  5    0.865906
14          1              1          2        SIFT      10    10   0.865906
15          1              1          2        SIFT      100   10   0.865906
16          1              1          2        SIFT      1000  10   0.865906
17          4              6          2        SIFT      100   0.5  0.919440
Avg. Ξ·                                                               0.858963
Std. Dev.                                                            0.053262

TABLE 1: EXPERIMENTAL RESULTS. IMAGES 1-6 SHOWN IN FIGURES 13 AND 14.
FIGURE 13: ORIGINAL IMAGES FROM EXPERIMENTS, WITH A-F CORRESPONDING TO IMAGES 1-6 IN TABLE 1: A) [11], B) [21], C) [10], D) [20], E) [22], F) [17].

FIGURE 14: GT IMAGES FROM EXPERIMENTS, WITH A-F CORRESPONDING TO IMAGES 1-6 IN TABLE 1: A) [11], B) [21], C) [10], D) [20], E) [22], F) [17].
4: DISCUSSION AND CONCLUSIONS
4.1: DISCUSSION
The main assumption in this research is
that all nanorods extend into the
substrate. This assumption will allow
us to use the top and side edges and the
angle of the side edges to find the
nanorods’ projection lengths.
Tables 1 and 2 show that the choice of
features has the strongest effect on SVM
classifier accuracy. As expected, using
only the (x, y) coordinates of the pixels
resulted in poor accuracy. The SIFT
features were quite robust, and the 25
feature image was well above the
average, but more trials from all three
sets of features are necessary to say
conclusively which features are the
best. Varying C and Ξ³ showed very little
effect on classifier accuracy overall,
but low values of C and Ξ³ did result in a marked improvement in accuracy. This makes sense because
we were dealing with big data sets, meaning that giving single data points too much influence could
lower the classifier accuracy. Although more testing will be needed to confirm the positive effect,
using multiple classes by clustering, using cluster number as the response, may improve accuracy
dramatically. Based on these results, using SIFT feature descriptors or the 25 feature descriptors with
low values of C and Ξ³ for a 12-class SVM classifier should result in high accuracy. Maximizing classifier
accuracy will allow industries to classify images with very little error and accurately determine the
optimal nanomanufacturing processes.
The SVM classification plot in Figure 11 appears to give an upside down view of the test image (Figure
8a). This makes sense because the positive y-axis in images is oriented vertically downward, as
mentioned in the Experimental Setup section. Knowing this, the plot clearly performs reasonably
well, as the boundary separating the nanorods from the background appears clearly in the SVM
classification plot. Some misclassifications are clear both in the nanorods and in the background,
which makes sense due to this particular SVM’s accuracy rating of about 86%.
Most of the error in our experiments was due to the propagation of human error in coloring GT
images. Using multiple GT images as training data in the future may minimize the effect of human
error. Any remaining error is due to random intrinsic errors. Therefore, running more experiments
will be critical to finding the features and parameters to maximize classifier accuracy. Another
important source of error is due to the estimation of nanoparticle dimensions using projection
lengths. In the future, it will be important for nanomanufacturers to understand that the dimensions
being extracted are not exact. However, the goal is to minimize these errors to the point where valid
observations of changes in length can be made.
Features    Avg. Ξ·       Difference from Total Avg. Ξ·   Std. Dev.
(x, y)      0.740117     -0.118846                      0.021319
SIFT        0.871043      0.012080                      0.029669
25*         0.927537*     0.068574*                     0*

C
0.1         0.870503      0.011539                      0.049352
10          0.865906**    0.006943**                    0**
100         0.846418     -0.012545                      0.072270
1000        0.865906**    0.006943**                    0**

Ξ³
0.5         0.888165      0.029201                      0.047467
1           0.865906**    0.006943**                    0**
5           0.815590     -0.043373                      0.063082
10          0.865906**    0.006943**                    0**

TABLE 2: EFFECTS OF FEATURE TYPE, C, AND GAMMA ON ACCURACY. *ONLY ONE TRIAL USING 25 FEATURES, **ALL EXPERIMENTS YIELDED THE SAME RESULTS.
Experiments 8-16 in Table 1 resulted in the same accuracy. These results mean that one or more of
the following are true:
1. SIFT features are unaffected by C and Ξ³. This is unlikely because experiments 4-7 show a
change in accuracy using SIFT features with low values of C and Ξ³. One possible conclusion is
that the classifier accuracy stagnated because the C and Ξ³ values were too high, meaning that SIFT
features should only be used with low C and Ξ³.
2. For all the experiments where Ξ· = 0.865906, there were zeroes in the bottom right corner of
the confusion matrix. This may mean that there were errors in this data or that binary
classification does not work well for SIFT features. Multi-class classification should be tested
for these experimental setups.
3. The accuracy data in Tables 1 and 2 are biased. As mentioned before, more experiments
should be run for all combinations of parameters for binary and multi-level classification in
order to get a clearer picture of how to maximize accuracy.
4. The SIFT features are too robust to properly classify our images. SIFT was designed to only
use features which did not respond to changes in lighting, viewing angle, or changes in image
brightness, sharpness, or intensity. Some of the factors which are critical to properly
classifying our images may not be considered by using SIFT. For example, changes in size may
affect optical properties of the nanoparticle being examined, and the SIFT features may be
too robust to recognize the change.
4.2: FUTURE WORK AND RECOMMENDATIONS
For future research, more experiments for all combinations of methods and parameters shown in
Table 1 are recommended to get a clearer picture of what features and parameters work best. Wider
ranges of C and Ξ³ values, as well as intermediate values, should be tested in order to potentially
establish relationships between the SVM classifier accuracy for different sets of features and the
combination of C and Ξ³. Further use of multi-class classification is also recommended due to the high
accuracy (and the need of more data to confirm that high accuracy). Furthermore, different clustering
algorithms such as K-medoids and hierarchical clustering should be attempted for multi-level
classifications. Also, the effect of boosting or changing classification methods should be investigated.
In industry, speed, automation, and accuracy will be critical. Therefore, once classifier accuracy is
optimized, I propose the following recommendations before commercializing this research:
1. Codes should be adapted to a newer, faster language, such as C++ or Python, to improve
computation speed and to create a single software package to be licensed. Computer science
professionals are increasingly being educated in newer languages (e.g., C++, Python), and our
programs should be written in the languages they are familiar with, as well as implement
optimized algorithms and take advantage of multiple processors and GPUs.
2. All codes should be written to utilize multi-core machines, and if possible, should
automatically run on a specified percentage of the available cores. This would maximize
computational throughput for all nanomanufacturers who use the software.
3. An autorun script should be implemented, and a user-friendly Graphical User Interface (GUI)
should be developed. This would reduce the cost of labor for nanomanufacturers.
4.3: CONCLUSIONS
This research shows that supervised learning algorithms have potential to be an excellent solution
for nanoparticle dimension estimation and control, which would be a significant quality engineering
development. The completion of this research will enable the scale-up, standardization, and
commercialization of nanomanufacturing processes. The initial classifier accuracy is promising, and
points to a bright future for this technology as further improvements are discovered.
REFERENCES
[1] G. Ali Mansoori, P. Mohazzabi, P. McCormack, S. Jabbari. β€œNanotechnology in cancer
prevention, detection and treatment: bright future lies ahead,” World Review of Science,
Technology and Sustainable Development, vol. 4, nos. 2/3, 2007.
[2] D. Jenvey. (2012, November 4) Materials for Dermatological Nanotechnology [Online].
Wikispaces, Tangient LLC. Available: http://nanotechnology-cis.wikispaces.com/
Materials+for+dermatological+nanotechnology. Last accessed: 24 July 2015.
[3] R. Elhajjar, V. La Saponara, A. Muliana. Smart Composites: Mechanics & Design, CRC Press,
Taylor & Francis Group, LLC., Boca Raton, FL, 2014.
[4] I. Kang, M. J. Schulz, J. H. Kim, V. Shanov, D. Shi. β€œA carbon nanotube strain sensor for
structural health monitoring,” Smart Materials and Structures, 15 (2006) 737-748.
[5] L. J. AndrΓ©s, M. F. MenΓ©ndez, D. GΓ³mez, A. L. MartΓ­nez, J. P. Kettle, A. MenΓ©ndez, B. Ruiz. β€œRapid
synthesis of ultra-long silver nanowires for tailor-made transparent conductive electrodes:
proof of concept in organic solar cells,” Nanotechnology, vol. 26, 2015.
[6] A. Mandal. (2012, November 4). Properties of Nanoparticles [Online]. Available:
http://www.news-medical.net/health/Properties-of-Nanoparticles.aspx. Last accessed: 24
July 2015.
[7] Berkeley School of Information. (2014, March 5). Moore’s Law and Computer Processing
Power [Online]. Available: http://datascience.berkeley.edu/moores-law-processing-power/.
[8] G. James, D. Witten, T. Hastie, R. Tibshirani. An Introduction to Statistical Learning with
Applications in R, Springer Science+Business Media, New York, 2013.
[9] J. Kim, B. Kim, S. Savarese. β€œComparing Image Classification Methods: K-Nearest Neighbor and
Support Vector Machines,” Proc. 6th WSEAS International Conference on Circuits, Systems,
Signals and Telecommunications, Cambridge, pp. 133-138, 2012.
[10] K. Kim, K. Utashiro, Y. Abe, M. Kawamura. β€œStructural Properties of Zinc Oxide Nanorods
Grown on Al-Doped Zinc Oxide Seed Layer and Their Applications in Dye-Sensitized Solar
Cells,” Materials, 7(4):2522–2533, 2014.
[11] C. Thelander, P. Agarwal, S. Brongersma, J. Eymery, L. F. Feiner, A. Forchel, M. Scheffler, W.
Riess, B.J. Ohlsson, U. GΓΆsele, L. Samuelson. β€œNanowire-based one dimensional electronics,”
Materials Today, vol. 9, no. 10 (pp. 28-35), October 2006.
[12] C. D. Manning, P. Raghavan, H. SchΓΌtze. An Introduction to Information Retrieval, Cambridge
University Press, Cambridge, England, 1 April 2009.
[13] X. Wu, V. Kumar, J. R. Quinlan, J. Ghosh, Q. Yang, H. Motoda, G. J. McLachlan, A. Ng, B. Liu, P.
S. Yu, Z. Zhou, M. Steinbach, D. J. Hand, D. Steinberg. β€œTop 10 algorithms in data mining,”
Knowledge and Information Systems, no. 14:1-37, 2008.
[14] Y. Liu, H. Zhang, Y. Wu. β€œHard or Soft Classification? Large-margin Unified Machines,” North
Carolina State University, Raleigh, North Carolina, 10 January 2011.
[15] SciKit-Learn. (2014). RBF SVM Parameters [Online]. Available: http://scikit-
learn.org/stable/auto_examples/svm/plot_rbf_parameters.html. Last accessed: 24 July
2015.
[16] T. Lindeberg. (2012). Scale Invariant Feature Transform [Online]. Scholarpedia. Available:
http://www.scholarpedia.org/article/Scale_Invariant_Feature_Transform. Last accessed: 24
July 2015.
[17] H. Gao, W. Cai, P. Shimpi, H. Lin, P. Gao. β€œ(La,Sr)CoO3/ZnO nanofilm–nanorod diode arrays
for photo-responsive moisture and humidity detection,” Journal of Physics D: Applied
Physics, 43(27):272002, 2010.
[18] C. Hennig. (2014, October 2) Flexible procedures for clustering [Online]. CRAN Repository.
Available: https://cran.r-project.org/web/packages/fpc/fpc.pdf.
[19] S. Weston. β€œNesting Foreach Loops,” Revolution Analytics, Redmond, Washington, 10 April
2014.
[20] R. Wang, H. Tan, Z. Zhao, G. Zhang, L. Song, W. Dong, Z. Sun. β€œStable ZnO@TiO2 core/shell
nanorod arrays with exposed high energy facets for self-cleaning coatings with anti-reflective
properties,” Journal of Materials Chemistry A, 2:7313–7318, 2014.
[21] Y. Li, J. Kubota, K. Domen, β€œA Protocol for Fabrication of Barium-doped Tantalum Nitride
Nanorod Arrays,” Protocol Exchange, Nature Publishing Group, doi 10.1038/protex.2013.080,
2013.
[22] Y. Luo, L. Wang, Y. Zou, X. Sheng, L. Chang, D. Yang. β€œElectrochemically Deposited Cu2O on
TiO2 Nanorod Arrays for Photovoltaic Application,” Electrochemical and Solid-State Letters,
15(2):H34–H36, 2011.
More Related Content

What's hot

C013141723
C013141723C013141723
C013141723
IOSR Journals
Β 
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...
ijsc
Β 
Ja3615721579
Ja3615721579Ja3615721579
Ja3615721579
IJERA Editor
Β 
Utilization of Super Pixel Based Microarray Image Segmentation
Utilization of Super Pixel Based Microarray Image SegmentationUtilization of Super Pixel Based Microarray Image Segmentation
Utilization of Super Pixel Based Microarray Image Segmentation
ijtsrd
Β 
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATA
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATAGRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATA
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATA
cscpconf
Β 
Introductionedited
IntroductioneditedIntroductionedited
Introductionedited
Mefratechnologies
Β 
Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...
IAEME Publication
Β 
internship project1 report
internship project1 reportinternship project1 report
internship project1 report
sheyk98
Β 
Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...
Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...
Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...
Tarun Kumar
Β 
Identification of Disease in Leaves using Genetic Algorithm
Identification of Disease in Leaves using Genetic AlgorithmIdentification of Disease in Leaves using Genetic Algorithm
Identification of Disease in Leaves using Genetic Algorithm
ijtsrd
Β 
Maximum likelihood estimation from uncertain
Maximum likelihood estimation from uncertainMaximum likelihood estimation from uncertain
Maximum likelihood estimation from uncertain
IEEEFINALYEARPROJECTS
Β 
IRJET- Plant Leaf Disease Detection using Image Processing
IRJET- Plant Leaf Disease Detection using Image ProcessingIRJET- Plant Leaf Disease Detection using Image Processing
IRJET- Plant Leaf Disease Detection using Image Processing
IRJET Journal
Β 
Plant disease detection and classification using deep learning
Plant disease detection and classification using deep learning Plant disease detection and classification using deep learning
Plant disease detection and classification using deep learning
JAVAID AHMAD WANI
Β 
Kapil dikshit ppt
Kapil dikshit pptKapil dikshit ppt
Kapil dikshit ppt
kapil dikshit
Β 
V.KARTHIKEYAN PUBLISHED ARTICLE
V.KARTHIKEYAN PUBLISHED ARTICLEV.KARTHIKEYAN PUBLISHED ARTICLE
V.KARTHIKEYAN PUBLISHED ARTICLE
KARTHIKEYAN V
Β 
IRJET - Disease Detection in Plant using Machine Learning
IRJET -  	  Disease Detection in Plant using Machine LearningIRJET -  	  Disease Detection in Plant using Machine Learning
IRJET - Disease Detection in Plant using Machine Learning
IRJET Journal
Β 
Feature selection for multiple water quality status: integrated bootstrapping...
Feature selection for multiple water quality status: integrated bootstrapping...Feature selection for multiple water quality status: integrated bootstrapping...
Feature selection for multiple water quality status: integrated bootstrapping...
IJECEIAES
Β 
CLASSIFICATION OF CANCER BY GENE EXPRESSION USING NEURAL NETWORK
CLASSIFICATION OF CANCER BY GENE EXPRESSION USING NEURAL NETWORKCLASSIFICATION OF CANCER BY GENE EXPRESSION USING NEURAL NETWORK
CLASSIFICATION OF CANCER BY GENE EXPRESSION USING NEURAL NETWORK
International Research Journal of Modernization in Engineering Technology and Science
Β 
Comparative error of the phenomena model
Comparative error of the phenomena modelComparative error of the phenomena model
Comparative error of the phenomena model
irjes
Β 

What's hot (19)

C013141723
C013141723C013141723
C013141723
Β 
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...
AN EFFICIENT PSO BASED ENSEMBLE CLASSIFICATION MODEL ON HIGH DIMENSIONAL DATA...
Β 
Ja3615721579
Ja3615721579Ja3615721579
Ja3615721579
Β 
Utilization of Super Pixel Based Microarray Image Segmentation
Utilization of Super Pixel Based Microarray Image SegmentationUtilization of Super Pixel Based Microarray Image Segmentation
Utilization of Super Pixel Based Microarray Image Segmentation
Β 
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATA
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATAGRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATA
GRC-MS: A GENETIC RULE-BASED CLASSIFIER MODEL FOR ANALYSIS OF MASS SPECTRA DATA
Β 
Introductionedited
IntroductioneditedIntroductionedited
Introductionedited
Β 
Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...Comparison between the genetic algorithms optimization and particle swarm opt...
Comparison between the genetic algorithms optimization and particle swarm opt...
Β 
internship project1 report
internship project1 reportinternship project1 report
internship project1 report
Β 
Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...
Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...
Plant Leaf Disease Analysis using Image Processing Technique with Modified SV...
Β 
Identification of Disease in Leaves using Genetic Algorithm
Identification of Disease in Leaves using Genetic AlgorithmIdentification of Disease in Leaves using Genetic Algorithm
Identification of Disease in Leaves using Genetic Algorithm
Β 
Maximum likelihood estimation from uncertain
Maximum likelihood estimation from uncertainMaximum likelihood estimation from uncertain
Maximum likelihood estimation from uncertain
Β 
IRJET- Plant Leaf Disease Detection using Image Processing
IRJET- Plant Leaf Disease Detection using Image ProcessingIRJET- Plant Leaf Disease Detection using Image Processing
IRJET- Plant Leaf Disease Detection using Image Processing
Β 
Plant disease detection and classification using deep learning
Plant disease detection and classification using deep learning Plant disease detection and classification using deep learning
Plant disease detection and classification using deep learning
Β 
Kapil dikshit ppt
Kapil dikshit pptKapil dikshit ppt
Kapil dikshit ppt
Β 
V.KARTHIKEYAN PUBLISHED ARTICLE
V.KARTHIKEYAN PUBLISHED ARTICLEV.KARTHIKEYAN PUBLISHED ARTICLE
V.KARTHIKEYAN PUBLISHED ARTICLE
Β 
IRJET - Disease Detection in Plant using Machine Learning
IRJET -  	  Disease Detection in Plant using Machine LearningIRJET -  	  Disease Detection in Plant using Machine Learning
IRJET - Disease Detection in Plant using Machine Learning
Β 
Feature selection for multiple water quality status: integrated bootstrapping...
Feature selection for multiple water quality status: integrated bootstrapping...Feature selection for multiple water quality status: integrated bootstrapping...
Feature selection for multiple water quality status: integrated bootstrapping...
Β 
CLASSIFICATION OF CANCER BY GENE EXPRESSION USING NEURAL NETWORK
CLASSIFICATION OF CANCER BY GENE EXPRESSION USING NEURAL NETWORKCLASSIFICATION OF CANCER BY GENE EXPRESSION USING NEURAL NETWORK
CLASSIFICATION OF CANCER BY GENE EXPRESSION USING NEURAL NETWORK
Β 
Comparative error of the phenomena model
Comparative error of the phenomena modelComparative error of the phenomena model
Comparative error of the phenomena model
Β 

Viewers also liked

The Hashtag Power
The Hashtag PowerThe Hashtag Power
The Hashtag Power
Mehdi LAGHA
Β 
Hashtag and Social Media Marketing
Hashtag and Social Media MarketingHashtag and Social Media Marketing
Hashtag and Social Media Marketing
Andreas Friedeheim
Β 
7 rules to create the perfect hashtag
7 rules to create the perfect hashtag7 rules to create the perfect hashtag
7 rules to create the perfect hashtag
Cinzia Di Martino
Β 
Tweet Tweet Tweet Twitter
Tweet Tweet Tweet TwitterTweet Tweet Tweet Twitter
Tweet Tweet Tweet Twitter
Jimmy Jay
Β 
Hashtag 101 - All You Need to Know About Hashtags
Hashtag 101 - All You Need to Know About HashtagsHashtag 101 - All You Need to Know About Hashtags
Hashtag 101 - All You Need to Know About Hashtags
Modicum
Β 
FontShop - Typography
FontShop - TypographyFontShop - Typography
FontShop - Typography
Poppy Young
Β 
Hype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerHype vs. Reality: The AI Explainer
Hype vs. Reality: The AI Explainer
Luminary Labs
Β 

Viewers also liked (7)

The Hashtag Power
The Hashtag PowerThe Hashtag Power
The Hashtag Power
Β 
Hashtag and Social Media Marketing
Hashtag and Social Media MarketingHashtag and Social Media Marketing
Hashtag and Social Media Marketing
Β 
7 rules to create the perfect hashtag
7 rules to create the perfect hashtag7 rules to create the perfect hashtag
7 rules to create the perfect hashtag
Β 
Tweet Tweet Tweet Twitter
Tweet Tweet Tweet TwitterTweet Tweet Tweet Twitter
Tweet Tweet Tweet Twitter
Β 
Hashtag 101 - All You Need to Know About Hashtags
Hashtag 101 - All You Need to Know About HashtagsHashtag 101 - All You Need to Know About Hashtags
Hashtag 101 - All You Need to Know About Hashtags
Β 
FontShop - Typography
FontShop - TypographyFontShop - Typography
FontShop - Typography
Β 
Hype vs. Reality: The AI Explainer
Hype vs. Reality: The AI ExplainerHype vs. Reality: The AI Explainer
Hype vs. Reality: The AI Explainer
Β 

Similar to TBerger_FinalReport

Skin Cancer Detection Application
Skin Cancer Detection ApplicationSkin Cancer Detection Application
Skin Cancer Detection Application
IRJET Journal
Β 
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNINGSEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
gerogepatton
Β 
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNINGSEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
gerogepatton
Β 
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNINGSEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
ijaia
Β 
An efficient convolutional neural network-based classifier for an imbalanced ...
An efficient convolutional neural network-based classifier for an imbalanced ...An efficient convolutional neural network-based classifier for an imbalanced ...
An efficient convolutional neural network-based classifier for an imbalanced ...
IAESIJAI
Β 
A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...
A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...
A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...
gerogepatton
Β 
A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...
A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...
A NOVEL SCHEME FOR ACCURATE REMAINING USEFUL LIFE PREDICTION FOR INDUSTRIAL I...
ijaia
Β 
An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...
An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...
An Efficient PSO Based Ensemble Classification Model on High Dimensional Data...
ijsc
Β 
Simplified Knowledge Prediction: Application of Machine Learning in Real Life
Simplified Knowledge Prediction: Application of Machine Learning in Real LifeSimplified Knowledge Prediction: Application of Machine Learning in Real Life
Simplified Knowledge Prediction: Application of Machine Learning in Real Life
Peea Bal Chakraborty
Β 
Controlling informative features for improved accuracy and faster predictions...
Controlling informative features for improved accuracy and faster predictions...Controlling informative features for improved accuracy and faster predictions...
Controlling informative features for improved accuracy and faster predictions...
Damian R. Mingle, MBA
Β 
IRJET - Survey on Analysis of Breast Cancer Prediction
IRJET - Survey on Analysis of Breast Cancer PredictionIRJET - Survey on Analysis of Breast Cancer Prediction
IRJET - Survey on Analysis of Breast Cancer Prediction
IRJET Journal
Β 
IRJET- A Novel Segmentation Technique for MRI Brain Tumor Images
IRJET- A Novel Segmentation Technique for MRI Brain Tumor ImagesIRJET- A Novel Segmentation Technique for MRI Brain Tumor Images
IRJET- A Novel Segmentation Technique for MRI Brain Tumor Images
IRJET Journal
Β 
IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...
IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...
IRJET - Breast Cancer Prediction using Supervised Machine Learning Algorithms...
IRJET Journal
Β 
Review on Mesothelioma Diagnosis
Review on Mesothelioma DiagnosisReview on Mesothelioma Diagnosis
Review on Mesothelioma Diagnosis
IRJET Journal
Β 
Melanoma Skin Cancer Detection using Deep Learning
Melanoma Skin Cancer Detection using Deep LearningMelanoma Skin Cancer Detection using Deep Learning
Melanoma Skin Cancer Detection using Deep Learning
IRJET Journal
Β 
Efficiency of Prediction Algorithms for Mining Biological Databases
Efficiency of Prediction Algorithms for Mining Biological  DatabasesEfficiency of Prediction Algorithms for Mining Biological  Databases
Efficiency of Prediction Algorithms for Mining Biological Databases
IOSR Journals
Β 
3D Segmentation of Brain Tumor Imaging
3D Segmentation of Brain Tumor Imaging3D Segmentation of Brain Tumor Imaging
3D Segmentation of Brain Tumor Imaging
IJAEMSJORNAL
Β 
Software Cost Estimation Using Clustering and Ranking Scheme
Software Cost Estimation Using Clustering and Ranking SchemeSoftware Cost Estimation Using Clustering and Ranking Scheme
Software Cost Estimation Using Clustering and Ranking Scheme
Editor IJMTER
Β 
Computer Aided System for Detection and Classification of Breast Cancer
Computer Aided System for Detection and Classification of Breast CancerComputer Aided System for Detection and Classification of Breast Cancer
Computer Aided System for Detection and Classification of Breast Cancer
IJITCA Journal
Β 
SEMI SUPERVISED BASED SPATIAL EM FRAMEWORK FOR MICROARRAY ANALYSIS
SEMI SUPERVISED BASED SPATIAL EM FRAMEWORK FOR MICROARRAY ANALYSISSEMI SUPERVISED BASED SPATIAL EM FRAMEWORK FOR MICROARRAY ANALYSIS
SEMI SUPERVISED BASED SPATIAL EM FRAMEWORK FOR MICROARRAY ANALYSIS
IRJET Journal
Β 

Similar to TBerger_FinalReport (20)

Skin Cancer Detection Application
Skin Cancer Detection ApplicationSkin Cancer Detection Application
Skin Cancer Detection Application
Β 
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNINGSEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
SEGMENTATION OF THE GASTROINTESTINAL TRACT MRI USING DEEP LEARNING
Β 
industries to quickly, accurately, and affordably determine the best nanomanufacturing processes. This capability would also allow for the standardization of nanomanufacturing processes, further improving scalability.

FIGURE 1: APPLICATION OF NANOMATERIALS IN CANCER TREATMENT [2].

FIGURE 2: APPLICATION OF NANOMATERIALS (IN THIS CASE, SILVER NANOWIRES) IN IMPROVING SOLAR CELL TECHNOLOGY [5].

FIGURE 3: APPLICATION OF NANOMATERIALS IN STRUCTURAL HEALTH MONITORING [4].
The need for a way to scale up nanomanufacturing was addressed by using supervised machine learning methods to predict the locations of nanorods in SEM images, with the goal of extracting dimension estimates from their projection lengths. Machine learning is an active research topic for both statisticians and computer scientists, and new literature on the subject continues to emerge. Pattern recognition, a branch of machine learning, has numerous applications, including facial recognition, medical imaging and diagnostics, and speech recognition. As computing power has grown exponentially [7], statistical computing has become widely used in a variety of fields. We applied these principles to extract features from our micrographs and classify them in order to make predictions and estimate nanorod dimensions.

The following section explores our experimental methodology, starting with the data collection procedure and then covering the methods used for selecting, building, and testing models, extracting image features, and using clustering to infer additional features and to test with multiple classes. The results and analysis are then presented, followed by discussion, conclusions, and recommendations for future research.

2: EXPERIMENTAL METHODOLOGY

2.1: SUPERVISED LEARNING

Statistical learning is the field that deals with making predictions or inferences based on given data. Supervised learning is the subset of statistical learning that focuses on making predictions. In supervised learning, we assume that there is a relationship between the response (dependent variable, output) and one or more predictors (independent variables, inputs, features). We use a set of training data to learn a model, a function of the predictors. The model can then be used to make predictions and is validated using test data. In contrast, unsupervised learning is done without training: the data are simply analyzed to establish trends, relationships, or distinctions between groups of observations [8, pp. 26]. Both are useful, and both were used in this research, but the main focus was supervised learning.

As mentioned before, the goal of supervised learning is to make predictions. To ensure that the predictions are accurate, we want to minimize the prediction error, which can usually be decomposed into bias, model variance, and observation noise (error variance). Equation 1 shows this breakdown for mean squared error (MSE), a popular metric for a continuous response. Bias refers to the error due to approximation (for example, modeling a slightly curved data set with a straight line), and variance refers to how much the approximation would change if it were fit to a different data set (an overly flexible model changes significantly with new data) [8, pp. 33]. In general there is an inverse relationship between bias and variance, which leads to the bias-variance tradeoff, so the goal of supervised learning becomes minimizing the following:

\text{Expected test MSE} = \text{Model Variance} + (\text{Model Bias})^2 + \text{Error Variance} \quad (1)

A model must be selected by minimizing an error function like that in (1) before using it to make predictions. However, estimating the error terms above is limited by the available data and, more importantly, by whether the available data are representative of all the patterns and variations in future samples.
Estimating these error terms therefore requires special techniques. Sampling methods validate a model by repeatedly drawing different samples from the training data
in order to learn more about the model. Cross-validation is one of the most popular sampling methods: it splits the training set into groups, fits the model repeatedly, and assesses performance by averaging the errors across the groups. K-fold cross-validation is very common in statistical learning; the training data, with n observations, are separated into K groups of roughly equal size, and each group is held out in turn while the model is fit to the remaining observations. The special case K = n, in which each fold holds out a single observation and the model is trained on the other n - 1, is called leave-one-out cross-validation (LOOCV). LOOCV gives the best error estimate but becomes extremely computationally expensive when dealing with big data.

One of the most important aspects of our research was feature extraction. We needed to find the features that distinguish the pixels of a nanorod from the background in order to build our model. The effect is very similar to human memory: for example, you may recognize the people, places, and objects you know by their sounds, smells, physical features, or mannerisms (or a combination of them). Several feature detection algorithms are available, and nearly limitless combinations of features can be extracted, as there are approximately 10,000-30,000 different object categories [9]. We needed features that could not only segment out the nanorods correctly but could do so under varying levels of brightness, sharpness, and intensity across multiple images. These differences can be significant, as highlighted in Figure 4.

FIGURE 4: NANOROD MICROGRAPHS. SEM IMAGES OFTEN VARY SIGNIFICANTLY IN BRIGHTNESS, SHARPNESS, AND INTENSITY. A) [10], B) [11].
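For concreteness, the K-fold procedure described above can be written in a few lines of R. The snippet below is only a minimal sketch on simulated data; the data frame and the logistic-regression stand-in classifier are illustrative, not the models used in our experiments.

```r
# Minimal K-fold cross-validation sketch on simulated data (illustrative only;
# our real models were trained on features extracted from SEM images).
set.seed(1)
n   <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- as.integer(dat$x1 + dat$x2 + rnorm(n, sd = 0.5) > 0)  # 0/1 response

K      <- 5
folds  <- sample(rep(1:K, length.out = n))   # assign each observation to a fold
errors <- numeric(K)

for (k in 1:K) {
  train <- dat[folds != k, ]
  test  <- dat[folds == k, ]
  # Any classifier can stand in here; logistic regression keeps the sketch short.
  fit  <- glm(y ~ x1 + x2, data = train, family = binomial)
  pred <- as.integer(predict(fit, newdata = test, type = "response") > 0.5)
  errors[k] <- mean(pred != test$y)          # 0/1 loss on the held-out fold
}

cv_error <- mean(errors)                     # K-fold estimate of the test error
```

The same scheme applies to any of the classifiers discussed below; only the model-fitting line changes.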
2.2: CLASSIFICATION

All supervised learning models are built with the goal of predicting a response from one or more predictors. Models can be built to serve a variety of purposes and to fit a variety of trends, so selecting the proper method is of vital importance. Since the first goal of our research was to pick out nanorods from images, it was clear that we would be using classification algorithms. Humans perform classification instinctively many times per day; classification is simply the association of items with their descriptions (or of other people with their names). While linear regression is used to predict a quantitative (numerical) response, classification is used to predict a qualitative (categorical) response. For our data, we needed classification to determine which pixels were parts of nanorods and which were not.

There are many types of classifiers, or models used to classify data. One popular classifier is logistic regression, which assumes that the logarithm of the odds that an observation belongs to a certain class, the log-odds, is a linear function of the predictors. The coefficients of the linear portion can be estimated using the maximum likelihood method, which chooses the coefficients so that the training observations are classified as accurately as possible [8, pp. 132]. This can be formalized mathematically using the likelihood function, where \beta_0 and \beta_1 are regression coefficients:

\ell(\beta_0, \beta_1) = \prod_{i:\, y_i = 1} p(x_i) \prod_{i':\, y_{i'} = 0} \bigl(1 - p(x_{i'})\bigr) \quad (2)

Logistic regression is mostly used for binary classification. When dealing with more than two classes, linear discriminant analysis (LDA) is commonly used; LDA attempts to approximate the Bayes classifier, which is the ideal classifier [8, pp. 37]. Quadratic discriminant analysis (QDA) is similar to LDA but uses quadratic rather than linear discriminant functions. K-nearest neighbors (KNN) classifies each observation based on the classes of the K nearest observations. Some popular, more computation-intensive methods include decision trees, random forests, boosting, and support vector machines (SVM) [8, pp. 127]. Our group used SVM for classification, which is covered in section 2.4.

2.3: OVERALL FLOW

Our experiments followed the structure of the flow chart in Figure 5.

FIGURE 5: GENERAL FLOW FOR AN EXPERIMENT (FEATURE EXTRACTION, MODEL TRAINING, AND PREDICTION ON NEW IMAGES).

For an experiment, the features of a training image were extracted as a data matrix and then used to train a model. The model was then validated using a new image. This process was repeated for various methods and feature extraction techniques. The success or failure of the model was determined by classification error, or 0/1 loss. Classification error has two components: misdetection (false negative) error and false alarm (false positive) error. The general goal was to solve the following optimization problem:

\text{Maximize}\left(\frac{\text{Number of correctly classified observations}}{\text{Number of observations}}\right) \quad (3)
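As a small illustration of this scoring, the quantity in Eq. (3) and its two error components can be computed directly in R. The 0/1 label vectors below are hypothetical (1 marks a nanorod pixel, 0 the background); they are not data from our experiments.

```r
# Scoring a classifier as in Eq. (3), using hypothetical 0/1 label vectors
# (1 = nanorod/foreground pixel, 0 = background).
truth <- c(1, 1, 0, 0, 1, 0, 0, 1, 0, 0)
pred  <- c(1, 0, 0, 1, 1, 0, 0, 1, 0, 0)

accuracy    <- mean(pred == truth)           # Eq. (3): correct / total = 0.8 here
misdetect   <- mean(pred[truth == 1] == 0)   # nanorod pixels missed (false negatives)
false_alarm <- mean(pred[truth == 0] == 1)   # background flagged as nanorod (false positives)
```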
2.4: SUPPORT VECTOR MACHINES (SVM)

An SVM is a vector space-based classifier that separates training data by class label with a hyperplane chosen to be as far as possible from the points in either class [13]. SVMs are among the most popular machine learning techniques available today [14]. SVM can be used to construct linear or nonlinear classifiers (Figure 6) using the kernel trick. A kernel is a function that quantifies the similarity of two observations and implicitly maps the data to a higher-dimensional feature space. The SVM then learns a hyperplane (a linear classifier) in this high-dimensional feature space, which corresponds to a nonlinear classification boundary in the original space [8, pp. 350]. Using kernels is computationally less expensive than creating new features (data transformations) explicitly. For our research, we used mostly radial and some linear kernels.

FIGURE 6: SAMPLE SVM CLASSIFICATION PLOT.

An SVM can have a hard or a soft margin. A hard-margin classifier does not allow any misclassified observations, whereas a soft margin yields a smoother classifier by allowing some misclassifications [15]. The soft margin's ability to ignore some observations usually results in a better overall fit. The parameter that controls the margin of an SVM is C (cost); another parameter, γ (gamma), parametrizes the kernel function [16].
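Our SVMs were fit in R with the svm() function from the e1071 package. The snippet below is a minimal sketch on simulated two-dimensional data showing a soft-margin, radial-kernel fit and the roles of cost (C) and gamma (γ); the data and parameter values are illustrative only, not those from our experiments.

```r
library(e1071)

# Simulated two-dimensional data with a nonlinear (radial) class boundary.
set.seed(1)
x   <- matrix(rnorm(400), ncol = 2)
dat <- data.frame(x1 = x[, 1], x2 = x[, 2],
                  y  = factor(as.integer(x[, 1]^2 + x[, 2]^2 > 1)))

# Soft-margin SVM with a radial (RBF) kernel: 'cost' is the C parameter
# controlling the margin, and 'gamma' parametrizes the kernel.
fit <- svm(y ~ x1 + x2, data = dat, kernel = "radial", cost = 10, gamma = 0.5)

plot(fit, dat)                 # classification plot, analogous to Figure 6
pred <- predict(fit, dat)
mean(pred == dat$y)            # training accuracy
```

Swapping kernel = "radial" for kernel = "linear" gives the linear classifiers we also tried.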
2.5: SCALE-INVARIANT FEATURE TRANSFORM (SIFT)

SIFT is a robust image descriptor developed by David Lowe in 1999 and commonly used in computer vision for object detection and for point matching between different views of 3D objects. SIFT detects key points (points of interest) and uses histograms to generate a vector of 128 features at each key point; these vectors (descriptors) are then used to classify the image [9].

FIGURE 7: FLOW OF IMAGE CLASSIFICATION USING THE SIFT FEATURE EXTRACTION ALGORITHM [9].

Object detection was critical for this research because we needed to detect the features that distinguish nanorods from the background, and because those features had to hold up to the differences shown in the "Supervised Learning" section. We take this ability for granted as humans because it is so easy for us: humans can distinguish thousands of types of objects [9] with almost no difficulty across a wide range of illuminations, orientations, distances, and distractions. For example, you may recognize your car very quickly from relatively far away. How do you know the car is yours? Surely you do not own the only car of that type in the world; you simply recognize, intuitively and immediately, all the features of the car that make it yours. This is the goal of object detection.

When using SIFT, key points are first detected as scale-space extrema of the Difference-of-Gaussian (DoG) values, and SIFT then extracts a 128-dimensional descriptor vector for each key point [16]. Figure 8a shows SIFT key points overlaid on an image. Plotting the SIFT descriptors was unnecessary for our research, as the descriptors were simply extracted into tables in CSV files; however, the result of plotting the descriptors for an image can be seen in Figure 8b.

FIGURE 8: A) SIFT KEYPOINTS ON A TEST IMAGE [17]; B) SIFT DESCRIPTORS.

The DoG operator normalizes SIFT features and makes them scale invariant, meaning that the features do not vary with rotation, translation, or scaling. SIFT features can also be detected through wide differences in intensity, illumination, and sharpness of an image. This made SIFT a top option, giving us 128 robust, low-variance predictors with which to train our models. However, one issue with SIFT is that it may eliminate critical variations that could help our SVM classify images correctly. This issue was studied further by using K-means clustering, an unsupervised learning method, to consider a multi-level classification problem.

2.6: K-MEANS CLUSTERING & MULTI-LEVEL SVM CLASSIFICATION

Clustering is an unsupervised learning method that separates a data set into several groups of similar observations. We used clustering to allow more than two response classes. Previously, we classified pixels or SIFT descriptors as belonging to either the foreground (nanorods) or the background (not nanorods). However, foreground observations may themselves contain a variety of patterns, and separating these patterns into groups using clustering can improve the accuracy of the learned classifiers. K-means clustering was used to group the foreground and background data so that we could use a multi-class SVM, which is an SVM with more than two classes to separate.

K-means clustering divides n observations into K clusters, where K is a selected value, placing each observation into the cluster with the nearest mean. This is very similar to K-medoids clustering, where each observation is placed into the cluster with the nearest medoid (a representative observation). In R, the partitioning around medoids (PAM) functionality in the Flexible Procedures for Clustering (fpc) package estimates the best value of K [18]. While PAM is meant for K-medoids clustering, the similarities between K-means and K-medoids mean that PAM also gives a good estimate of K for K-means. After clustering a data set with K-means, the cluster number can be used as the response in a multi-class SVM. Foreground and background data were clustered separately, with K = 10 and K = 2, respectively; the data were then combined to run a 12-class SVM on the entire image.
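A minimal R sketch of this clustering step is shown below. The fg and bg descriptor matrices here are simulated stand-ins (in our experiments they were SIFT descriptors split into foreground and background rows), and pamk() from the fpc package is used to estimate K before K-means assigns the cluster labels that become the multi-class response.

```r
library(fpc)   # provides pamk(), partitioning around medoids with estimation of K

# Simulated descriptor matrices standing in for the SIFT descriptors that
# were split into foreground (nanorod) and background rows.
set.seed(1)
fg <- matrix(rnorm(500 * 10, mean =  1), ncol = 10)
bg <- matrix(rnorm(300 * 10, mean = -1), ncol = 10)

# Estimate a reasonable K for each group (PAM is a close cousin of K-means).
k_fg <- pamk(fg, krange = 2:10)$nc
k_bg <- pamk(bg, krange = 2:10)$nc

# Cluster each group with K-means (in our experiments, K = 10 and K = 2).
cl_fg <- kmeans(fg, centers = k_fg)$cluster
cl_bg <- kmeans(bg, centers = k_bg)$cluster

# Combine into one multi-class response: background clusters are numbered
# after the foreground clusters, giving k_fg + k_bg classes in total.
response <- factor(c(cl_fg, cl_bg + k_fg))
features <- rbind(fg, bg)
# 'features' and 'response' can now be passed to a multi-class svm().
```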
2.7: EXPERIMENT SETUP

When building a supervised learning model, the first step is to acquire training data from which to build a model that can predict test data. We began by creating "Ground Truth" (GT) from micrographs acquired from publications and from other labs. GT was created by using Microsoft Paint to color over distinguishable nanorods in the selected micrographs, with each nanorod assigned its own set of RGB (red, green, blue) color values; this GT was then used to generate the training data for our models.

The first predictors tested were the (x, y) pixel coordinates. Note that the (x, y) coordinate system in image processing differs from typical Cartesian coordinates: in a Cartesian system the positive y-axis points vertically upward, whereas in image processing it points vertically downward. Next, SIFT features were used as predictors; for an image with N descriptors, the data formed an N x 128 matrix. The final set of features was based on each pixel's neighbors: for every pixel in a training image we used 25 features (a 5 x 5 neighborhood descriptor per pixel), so for a training image with P pixels the data formed a P x 25 matrix. These features were exported as CSV files from MATLAB and used in R to train an SVM.

When implementing the SVM in R, we used tuning to determine the best combination of parameters for our model. Since we were dealing with large, high-dimensional data sets, tuning became highly computationally expensive; SVM tuning could take hours or even days to execute. The only solutions were to use a faster computer or to write faster code. By default, R does not use parallel computing, so a multi-core computer (and almost all of today's computers have multiple cores) initially provides no advantage. To address the need for faster computing, it was necessary to develop an SVM implementation that worked in parallel on a high-performance multi-core machine.

Tuning optimizes an SVM over a range of parameters, and by default R tunes the SVM by testing one model at a time. The doParallel library in R provides a parallel backend, a network of workers, for the foreach loop. doParallel must be combined with the parallel library, which is included in recent versions of R; the foreach library enables the foreach loop, and along with the iterators library allows the parallel and doParallel libraries to be installed. The e1071 library is needed to run and tune the SVM. The foreach loop can easily be nested to run an SVM for each combination of C and γ [19]. The nested loop did not tune the SVM in the usual sense; instead, it fit one SVM per combination of C and γ, all at once. For a given training data set, the model with the best accuracy (lowest error) was selected, and a single SVM (a much faster computation) was then run with that set of parameters. This implementation provided a major speed increase when dealing with big data and would be advantageous for industrial nanomanufacturers with access to high-performance servers.
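The parallel grid search described above can be sketched as follows. This is a simplified illustration of the idea, not our production code: the training data are simulated, the parameter grid and core count are illustrative, and it assumes the e1071, foreach, and doParallel packages are installed.

```r
library(e1071)
library(foreach)
library(doParallel)

# Simulated stand-in for a training feature matrix exported from MATLAB as CSV.
set.seed(1)
train <- data.frame(x1 = rnorm(300), x2 = rnorm(300))
train$y <- factor(as.integer(train$x1^2 + train$x2 > 0))

# Register a parallel backend (we used 25 of 40 cores; 4 here for illustration).
cl <- makeCluster(4)
registerDoParallel(cl)

costs  <- c(0.1, 10, 100, 1000)
gammas <- c(0.5, 1, 5, 10)

# Nested foreach: fit one SVM per (C, gamma) combination, all in parallel,
# and record each model's 10-fold cross-validation accuracy.
grid <- foreach(C = costs, .combine = rbind) %:%
  foreach(g = gammas, .combine = rbind) %dopar% {
    fit <- e1071::svm(y ~ ., data = train, kernel = "radial",
                      cost = C, gamma = g, cross = 10)
    data.frame(cost = C, gamma = g, accuracy = fit$tot.accuracy)
  }

stopCluster(cl)

# Refit a single (much faster) SVM with the best parameter combination.
best  <- grid[which.max(grid$accuracy), ]
final <- e1071::svm(y ~ ., data = train, kernel = "radial",
                    cost = best$cost, gamma = best$gamma)
```

Because each (C, γ) combination is an independent fit, the workers never need to communicate, which is why the speedup scales almost linearly with the number of cores used.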
3: RESULTS AND DATA ANALYSIS

3.1: DATA COLLECTION

GT was created for 16 SEM images. Numerical data matrices were extracted using MATLAB in the form of CSV files for both training (GT) images and test images, and models were trained in R on a 40-core Linux (Ubuntu) platform. Using GT images to train our models allowed us to predict the locations of nanorods in other images.

3.2: FEATURE EXTRACTION

Our first tests used the (x, y) coordinate values as the only predictors of the foreground/background response. Next, we used SIFT features as predictors, first with a binary response and then, using K-means clustering, for 12-class classification. Using PAM, we found the ideal values of K for the foreground and background data to be 10 and 2, respectively. A section of the data matrix for the image in Figure 8 is shown in Figure 9 to visualize our data, and a sample original image and its GT are shown in Figure 10.

FIGURE 9: EXAMPLE SIFT IMAGE DATA MATRIX. THE IMAGE HAD 2,480 DESCRIPTORS AND A BINARY RESPONSE (0 FOR BACKGROUND AND 1 FOR FOREGROUND).

FIGURE 10: EXAMPLE ORIGINAL IMAGE (A) AND GT (B) [20].
3.3: CLASSIFICATION

SVM tuning originally lasted from as little as 30 minutes for smaller data sets to days for larger ones. To address this issue, I developed the parallel computing method discussed in the Experiment Setup section. Using nested foreach loops, I was able to run an SVM for all combinations of C and γ very quickly. When testing the effectiveness of the code on 25 of our 40 cores, I increased the speed of the tuning process by a factor of roughly 30 for nine combinations of C and γ (an image that originally took about 30 minutes to tune required only 57 seconds). This implementation became very useful for tuning large data sets such as those generated by the 25 feature descriptors per pixel: I tuned one of these images in parallel in about three hours, whereas the default tuning process would have lasted almost four days. After tuning in parallel, a new SVM had to be fit using the parameters from the loop that produced the lowest error, but the parallel implementation was still an order of magnitude faster than the default tuning.

When we applied an SVM model to a test image, we could generate a confusion matrix, which is simply a table of predicted values versus real values where each value corresponds to a class. A binary SVM classifier was trained on one of the GT images and tested on the image in Figure 8a, resulting in an SVM classification plot (Figure 11) and a confusion matrix (Figure 12). Using the confusion matrix and Equation 3, accuracy (η) can be computed for each test of a classifier. Table 1 shows η for 17 experiments that used six images (designated Image 1 through Image 6; see Figures 13 and 14), and Table 2 shows the average and standard deviation of η for each set of features and each value of C and γ.

FIGURE 11: SVM CLASSIFICATION PLOT FOR EXAMPLE TEST IMAGE.

FIGURE 12: CONFUSION MATRIX FOR EXAMPLE SVM.
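The confusion matrix and η follow directly from the predicted and true labels. A minimal R sketch, using hypothetical label vectors rather than our experimental data, is shown below.

```r
# Confusion matrix and accuracy (eta) per Eq. (3), using hypothetical labels.
truth <- factor(c(0, 0, 1, 1, 1, 0, 1, 0, 1, 1))
pred  <- factor(c(0, 1, 1, 1, 0, 0, 1, 0, 1, 1))

cm  <- table(Predicted = pred, Actual = truth)  # confusion matrix, as in Figure 12
eta <- sum(diag(cm)) / sum(cm)                  # eta = correctly classified / total
cm
eta
```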
TABLE 1: EXPERIMENTAL RESULTS. IMAGES 1-6 ARE SHOWN IN FIGURES 13 AND 14.

Experiment | Training (GT) Image No. | Test Image No. | Classes | Features | C    | γ   | Accuracy (η)
1          | 1*                      | 1*             | 12      | 25       | 100  | 0.5 | 0.927537
2          | 5                       | 3              | 2       | (x, y)   | 100  | 5   | 0.761436
3          | 5                       | 4              | 2       | (x, y)   | 100  | 5   | 0.718798
4          | 3                       | 2              | 2       | SIFT     | 0.1  | 0.5 | 0.912782
5          | 3                       | 4              | 2       | SIFT     | 0.1  | 0.5 | 0.798438
6          | 3                       | 5              | 2       | SIFT     | 0.1  | 0.5 | 0.851351
7          | 3                       | 6              | 2       | SIFT     | 0.1  | 0.5 | 0.919440
8          | 1                       | 1              | 2       | SIFT     | 10   | 1   | 0.865906
9          | 1                       | 1              | 2       | SIFT     | 100  | 1   | 0.865906
10         | 1                       | 1              | 2       | SIFT     | 1000 | 1   | 0.865906
11         | 1                       | 1              | 2       | SIFT     | 10   | 5   | 0.865906
12         | 1                       | 1              | 2       | SIFT     | 100  | 5   | 0.865906
13         | 1                       | 1              | 2       | SIFT     | 1000 | 5   | 0.865906
14         | 1                       | 1              | 2       | SIFT     | 10   | 10  | 0.865906
15         | 1                       | 1              | 2       | SIFT     | 100  | 10  | 0.865906
16         | 1                       | 1              | 2       | SIFT     | 1000 | 10  | 0.865906
17         | 4                       | 6              | 2       | SIFT     | 100  | 0.5 | 0.919440
Avg. η: 0.858963   Std. Dev.: 0.053262

FIGURE 13: ORIGINAL IMAGES FROM THE EXPERIMENTS, WITH A-F CORRESPONDING TO IMAGES 1-6 IN TABLE 1: A) [11], B) [21], C) [10], D) [20], E) [22], F) [17].

FIGURE 14: GT IMAGES FROM THE EXPERIMENTS, WITH A-F CORRESPONDING TO IMAGES 1-6 IN TABLE 1: A) [11], B) [21], C) [10], D) [20], E) [22], F) [17].
4: DISCUSSION AND CONCLUSIONS

4.1: DISCUSSION

The main assumption in this research is that all nanorods extend into the substrate. This assumption allows us to use the top and side edges, and the angle of the side edges, to find the nanorods' projection lengths.

TABLE 2: EFFECTS OF FEATURE TYPE, C, AND GAMMA ON ACCURACY. *ONLY ONE TRIAL USED 25 FEATURES; **ALL EXPERIMENTS YIELDED THE SAME RESULT.

Parameter | Value  | Avg. η     | Difference from Total Avg. η | Std. Dev.
Features  | (x, y) | 0.740117   | -0.118846                    | 0.021319
Features  | SIFT   | 0.871043   | 0.012080                     | 0.029669
Features  | 25*    | 0.927537*  | 0.068574*                    | 0*
C         | 0.1    | 0.870503   | 0.011539                     | 0.049352
C         | 10     | 0.865906** | 0.006943**                   | 0**
C         | 100    | 0.846418   | -0.012545                    | 0.072270
C         | 1000   | 0.865906** | 0.006943**                   | 0**
γ         | 0.5    | 0.888165   | 0.029201                     | 0.047467
γ         | 1      | 0.865906** | 0.006943**                   | 0**
γ         | 5      | 0.815590   | -0.043373                    | 0.063082
γ         | 10     | 0.865906** | 0.006943**                   | 0**

Tables 1 and 2 show that the choice of features has the strongest correlation with SVM classifier accuracy. As expected, using only the (x, y) coordinates of the pixels resulted in poor accuracy. The SIFT features were quite robust, and the single 25-feature trial was well above the average, but more trials with all three sets of features are needed before we can say which features are best. Varying C and γ shows very little effect on classifier accuracy overall, although low values of C and γ did produce a marked improvement. This makes sense because we were dealing with big data sets, where giving single data points too much influence could lower classifier accuracy. Although more testing is needed to confirm the positive effect, using multiple classes obtained by clustering, with the cluster number as the response, may improve accuracy dramatically. Based on these results, using SIFT descriptors or the 25-feature descriptors with low values of C and γ in a 12-class SVM classifier should result in high accuracy. Maximizing classifier accuracy will allow industries to classify images with very little error and accurately determine the optimal nanomanufacturing processes.

The SVM classification plot in Figure 11 appears to give an upside-down view of the test image (Figure 8a). This makes sense because the positive y-axis in images is oriented vertically downward, as mentioned in the Experiment Setup section. Knowing this, the plot clearly performs reasonably well, as the boundary separating the nanorods from the background appears clearly in the SVM classification plot. Some misclassifications are visible both in the nanorods and in the background, which is consistent with this particular SVM's accuracy of about 86%.

Most of the error in our experiments was due to the propagation of human error in coloring the GT images; using multiple GT images as training data in the future may minimize this effect. Any remaining error is due to random intrinsic errors. Therefore, running more experiments will be critical to finding the features and parameters that maximize classifier accuracy. Another important source of error is the estimation of nanoparticle dimensions from projection lengths. In the future, it will be important for nanomanufacturers to understand that the extracted dimensions are not exact; the goal, however, is to reduce these errors to the point where valid observations of changes in length can be made.
Experiments 8-16 in Table 1 all resulted in the same accuracy. These results mean that one or more of the following are true:

1. SIFT features are unaffected by C and γ. This is unlikely, because experiments 4-7 show a change in accuracy using SIFT features with low values of C and γ. One possible conclusion is that classifier accuracy stagnated because the C and γ values were too high, meaning that SIFT features should only be used with low C and γ.

2. For all the experiments where η = 0.865906, there were zeroes in the bottom-right corner of the confusion matrix. This may mean that there were errors in the data or that binary classification does not work well for SIFT features. Multi-class classification should be tested for these experimental setups.

3. The accuracy data in Tables 1 and 2 are biased. As mentioned before, more experiments should be run for all combinations of parameters, for both binary and multi-level classification, to get a clearer picture of how to maximize accuracy.

4. The SIFT features are too robust to properly classify our images. SIFT was designed to use only features that do not respond to changes in lighting, viewing angle, or image brightness, sharpness, or intensity, so some of the factors that are critical to properly classifying our images may not be captured by SIFT. For example, changes in size may affect the optical properties of the nanoparticle being examined, and the SIFT features may be too robust to register that change.

4.2: FUTURE WORK AND RECOMMENDATIONS

For future research, more experiments covering all combinations of the methods and parameters shown in Table 1 are recommended to get a clearer picture of which features and parameters work best. Wider ranges of C and γ, as well as intermediate values, should be tested to potentially establish relationships between SVM classifier accuracy, the feature set, and the combination of C and γ. Further use of multi-class classification is also recommended because of its high accuracy (and because more data are needed to confirm that accuracy). Different clustering algorithms, such as K-medoids and hierarchical clustering, should also be attempted for multi-level classification, and the effect of boosting or of changing classification methods should be investigated.

In industry, speed, automation, and accuracy will be critical. Therefore, once classifier accuracy is optimized, I propose the following recommendations before commercializing this research:

1. The code should be adapted to a faster language, such as C++ or Python, to improve computation speed and to create a single software package that can be licensed. Computer science professionals are increasingly being educated in these languages, and our programs should be written in languages they are familiar with, while also implementing optimized algorithms, multiple processors, and GPUs.

2. All code should be written to utilize multi-core machines and, if possible, should automatically run on a specified percentage of the available cores. This would minimize computation time for all nanomanufacturers who use the software.

3. An autorun script should be implemented, and a user-friendly graphical user interface (GUI) should be developed. This would reduce the cost of labor for nanomanufacturers.
4.3: CONCLUSIONS

This research shows that supervised learning algorithms have potential to be an excellent solution for nanoparticle dimension estimation and control, which would be a significant quality engineering development. The completion of this research will enable the scale-up, standardization, and commercialization of nanomanufacturing processes. The initial classifier accuracy is promising and points to a bright future for this technology as further improvements are discovered.
REFERENCES

[1] G. Ali Mansoori, P. Mohazzabi, P. McCormack, S. Jabbari, "Nanotechnology in cancer prevention, detection and treatment: bright future lies ahead," World Review of Science, Technology and Sustainable Development, vol. 4, nos. 2/3, 2007.

[2] D. Jenvey. (2012, November 4). Materials for Dermatological Nanotechnology [Online]. Wikispaces, Tangient LLC. Available: http://nanotechnology-cis.wikispaces.com/Materials+for+dermatological+nanotechnology. Last accessed: 24 July 2015.

[3] R. Elhajjar, V. La Saponara, A. Muliana. Smart Composites: Mechanics & Design, CRC Press, Taylor & Francis Group, LLC, Boca Raton, FL, 2014.

[4] I. Kang, M. J. Schulz, J. H. Kim, V. Shanov, D. Shi, "A carbon nanotube strain sensor for structural health monitoring," Smart Materials and Structures, vol. 15, pp. 737-748, 2006.

[5] L. J. Andrés, M. F. Menéndez, D. Gómez, A. L. Martínez, J. P. Kettle, A. Menéndez, B. Ruiz, "Rapid synthesis of ultra-long silver nanowires for tailor-made transparent conductive electrodes: proof of concept in organic solar cells," Nanotechnology, vol. 26, 2015.

[6] A. Mandal. (2012, November 4). Properties of Nanoparticles [Online]. Available: http://www.news-medical.net/health/Properties-of-Nanoparticles.aspx. Last accessed: 24 July 2015.

[7] Berkeley School of Information. (2014, March 5). Moore's Law and Computer Processing Power [Online]. Available: http://datascience.berkeley.edu/moores-law-processing-power/.

[8] G. James, D. Witten, T. Hastie, R. Tibshirani. An Introduction to Statistical Learning with Applications in R, Springer Science+Business Media, New York, 2013.

[9] J. Kim, B. Kim, S. Savarese, "Comparing Image Classification Methods: K-Nearest Neighbor and Support Vector Machines," Proc. 6th WSEAS International Conference on Circuits, Systems, Signals and Telecommunications, Cambridge, pp. 133-138, 2012.

[10] K. Kim, K. Utashiro, Y. Abe, M. Kawamura, "Structural Properties of Zinc Oxide Nanorods Grown on Al-Doped Zinc Oxide Seed Layer and Their Applications in Dye-Sensitized Solar Cells," Materials, 7(4):2522-2533, 2014.

[11] C. Thelander, P. Agarwal, S. Brongersma, J. Eymery, L. F. Feiner, A. Forchel, M. Scheffler, W. Riess, B. J. Ohlsson, U. Gösele, L. Samuelson, "Nanowire-based one-dimensional electronics," Materials Today, vol. 9, no. 10, pp. 28-35, October 2006.

[12] C. D. Manning, P. Raghavan, H. Schütze. An Introduction to Information Retrieval, Cambridge University Press, Cambridge, England, 1 April 2009.

[13] X. Wu, V. Kumar, J. R. Quinlan, J. Ghosh, Q. Yang, H. Motoda, G. J. McLachlan, A. Ng, B. Liu, P. S. Yu, Z. Zhou, M. Steinbach, D. J. Hand, D. Steinberg, "Top 10 algorithms in data mining," Knowledge and Information Systems, no. 14:1-37, 2008.

[14] Y. Liu, H. Zhang, Y. Wu, "Hard or Soft Classification? Large-margin Unified Machines," North Carolina State University, Raleigh, North Carolina, 10 January 2011.

[15] SciKit-Learn. (2014). RBF SVM Parameters [Online]. Available: http://scikit-learn.org/stable/auto_examples/svm/plot_rbf_parameters.html. Last accessed: 24 July 2015.

[16] T. Lindeberg. (2012). Scale Invariant Feature Transform [Online]. Scholarpedia. Available: http://www.scholarpedia.org/article/Scale_Invariant_Feature_Transform. Last accessed: 24 July 2015.

[17] H. Gao, W. Cai, P. Shimpi, H. Lin, P. Gao, "(La,Sr)CoO3/ZnO nanofilm-nanorod diode arrays for photo-responsive moisture and humidity detection," Journal of Physics D: Applied Physics, 43(27):272002, 2010.

[18] C. Hennig. (2014, October 2). Flexible Procedures for Clustering [Online]. CRAN Repository. Available: https://cran.r-project.org/web/packages/fpc/fpc.pdf.

[19] S. Weston, "Nesting Foreach Loops," Revolution Analytics, Redmond, Washington, 10 April 2014.

[20] R. Wang, H. Tan, Z. Zhao, G. Zhang, L. Song, W. Dong, Z. Sun, "Stable ZnO@TiO2 core/shell nanorod arrays with exposed high energy facets for self-cleaning coatings with anti-reflective properties," Journal of Materials Chemistry A, 2:7313-7318, 2014.

[21] Y. Li, J. Kubota, K. Domen, "A Protocol for Fabrication of Barium-doped Tantalum Nitride Nanorod Arrays," Protocol Exchange, Nature Publishing Group, doi 10.1038/protex.2013.080, 2013.

[22] Y. Luo, L. Wang, Y. Zou, X. Sheng, L. Chang, D. Yang, "Electrochemically Deposited Cu2O on TiO2 Nanorod Arrays for Photovoltaic Application," Electrochemical and Solid-State Letters, 15(2):H34-H36, 2011.