Machine learning algorithms

Regression
SVM
Naïve bayes
KNN
ANN
Decision trees
Linear regression
• Linear regression is one of the easiest and most
popular Machine Learning algorithms. It is a
statistical method used for predictive analysis.
Linear regression makes predictions for
continuous/real or numeric variables such
as sales, salary, age, product price, etc.
• The linear regression algorithm models a linear
relationship between a dependent variable (y) and one or
more independent variables (x), hence the name
linear regression.
Some popular applications of linear regression are:
o Analyzing trends and sales estimates
o Salary forecasting
o Real estate prediction
o Arriving at ETAs in traffic.
Types
• Simple linear regression (one predictor)
• Multiple linear regression (multiple predictors)
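A minimal sketch of simple linear regression in Python with scikit-learn (used later in this document); the experience/salary numbers are hypothetical, purely for illustration:

import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: years of experience (x) vs. salary (y)
x = np.array([[1], [2], [3], [4], [5]])       # one predictor -> simple linear regression
y = np.array([30000, 35000, 41000, 44000, 50000])

model = LinearRegression()
model.fit(x, y)                               # learn slope and intercept

print(model.coef_, model.intercept_)          # the fitted line: y = coef * x + intercept
print(model.predict([[6]]))                   # salary forecast for 6 years of experience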
Logistic Regression

o A supervised learning algorithm which is used to solve classification problems.
o The logistic regression algorithm works with categorical variables such as 0 or 1, Yes or No, True or False, Spam or not spam, etc.
• It is a predictive analysis algorithm which works on the concept of
probability.
• Logistic regression uses the sigmoid function (logistic function),
which maps any real-valued input to a value between 0 and 1:
f(x) = 1 / (1 + e^(-x))
• f(x) = output, between 0 and 1
• x = input to the function
• e = base of the natural logarithm
When we provide the input values (data) to the function, it gives an
S-shaped curve, as sketched below.
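A small sketch (plain numpy/matplotlib, both used later in this document) that plots the sigmoid to show the S-curve:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-10, 10, 200)
f = 1 / (1 + np.exp(-x))      # sigmoid: output always between 0 and 1

plt.plot(x, f)
plt.title("Sigmoid (logistic) function")
plt.show()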
Types of logistic regression

o Binary (0/1, pass/fail)
o Multinomial (cats, dogs, lions)
o Ordinal (low, medium, high)
Support vector machine
Support Vector Machine or SVM is one of the
most popular Supervised Learning algorithms,
which is used for Classification as well as
Regression problems. However, primarily, it is
used for Classification problems in Machine
Learning.
• The goal of the SVM algorithm is to create the
best line or decision boundary that can
segregate n-dimensional space into classes so
that we can easily put the new data point in
the correct category in the future. This best
decision boundary is called a hyperplane.
o Kernel: a function used to map lower-dimensional data into higher-dimensional data.
o Hyperplane: in general SVM, it is a separation line between two classes, but in SVR it is a line which helps to predict the continuous variable and covers most of the datapoints.
o Boundary lines: the two lines apart from the hyperplane which create a margin for the datapoints.
o Support vectors: the datapoints which are nearest to the hyperplane (from either class).

• The SVM algorithm can be used for face
detection, image classification, text
categorization, etc.
Types
• Linear SVM: used for linearly separable data. If a
dataset can be classified into two classes with a
single straight line, it is termed linearly separable
data, and the classifier used is called a Linear SVM
classifier.
• Non-linear SVM: used for non-linearly separable
data. If a dataset cannot be classified with a straight
line, it is termed non-linear data, and the classifier
used is called a Non-linear SVM classifier. A code
sketch of both follows the figures below.
Linear SVM
Non linear SVM
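A minimal sketch of the two variants with scikit-learn; the dataset and kernel choices here are illustrative assumptions (a linear kernel for a straight-line boundary, an RBF kernel for a curved one):

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two-class data that a straight line cannot separate well
X, y = make_moons(n_samples=200, noise=0.15, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

linear_svm = SVC(kernel="linear").fit(x_train, y_train)   # straight-line boundary
rbf_svm = SVC(kernel="rbf").fit(x_train, y_train)         # curved (kernelized) boundary

print("linear:", linear_svm.score(x_test, y_test))
print("rbf:   ", rbf_svm.score(x_test, y_test))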
Naïve Bayes Classifier Algorithm

• Naïve Bayes is a supervised learning algorithm, which is based on
Bayes' theorem and used for solving classification problems.
• It is mainly used in text classification that includes a high-
dimensional training dataset.
• Naïve Bayes Classifier is one of the simplest and most
effective classification algorithms; it helps in building
fast machine learning models that can make quick
predictions.
• It is a probabilistic classifier, which means it predicts on
the basis of the probability of an object.
• Some popular applications of the Naïve Bayes algorithm are spam
filtering, sentiment analysis, and classifying articles.
• Naïve: it is called naïve because it assumes that
the occurrence of a certain feature is
independent of the occurrence of other features.
For example, if a fruit is identified on the basis of
color, shape, and taste, then a red, spherical, and
sweet fruit is recognized as an apple. Each
feature individually contributes to identifying it
as an apple, without depending on the others.
• Bayes: it is called Bayes because it depends on
the principle of Bayes' theorem.
NB example
Suppose we have a dataset of weather
conditions and a corresponding target variable
"Play". Using this dataset, we need to decide
whether we should play or not on a particular day
according to the weather conditions. To solve
this problem, we follow the steps below:
• Convert the given dataset into frequency tables.
• Generate a likelihood table by finding the
probabilities of the given features.
• Now, use Bayes' theorem to calculate the
posterior probability.
Applying the theorem
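A small worked sketch of that calculation in Python. The frequency counts below are hypothetical (the original table is not reproduced here); they only illustrate Bayes' theorem, P(Yes|Sunny) = P(Sunny|Yes) * P(Yes) / P(Sunny):

# Hypothetical frequency counts from a 14-day weather/"Play" table
total_days = 14
play_yes, play_no = 9, 5          # days we played vs. did not
sunny_yes, sunny_no = 3, 2        # sunny days among each outcome

p_yes = play_yes / total_days                   # prior P(Yes)
p_sunny = (sunny_yes + sunny_no) / total_days   # evidence P(Sunny)
p_sunny_given_yes = sunny_yes / play_yes        # likelihood P(Sunny|Yes)

# Posterior via Bayes' theorem
p_yes_given_sunny = p_sunny_given_yes * p_yes / p_sunny
print(p_yes_given_sunny)   # 0.6 -> play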
Advantages of Naïve Bayes Classifier:
• Naïve Bayes is one of the fastest and easiest ML algorithms for predicting the class of a dataset.
• It can be used for binary as well as multi-class classification.
• It performs well in multi-class predictions as compared to other algorithms.
• It is the most popular choice for text classification problems.
Disadvantages of Naïve Bayes Classifier:
• Naive Bayes assumes that all features are independent or unrelated, so it cannot
learn the relationship between features.
Applications of Naïve Bayes Classifier:
• It is used for Credit Scoring.
• It is used in medical data classification.
• It can be used in real-time predictions because Naïve Bayes Classifier is an eager
learner.
• It is used in Text classification such as Spam filtering and Sentiment analysis.
K-Nearest Neighbor (KNN) Algorithm

• K-Nearest Neighbour is one of the simplest Machine
Learning algorithms, based on the Supervised Learning
technique.
• The K-NN algorithm assumes similarity between the new
case/data and the available cases, and puts the new case into the
category that is most similar to the available categories.
• The K-NN algorithm stores all the available data and classifies a
new data point based on similarity. This means that when
new data appears, it can easily be classified into a well-suited
category using the K-NN algorithm.
• The K-NN algorithm can be used for Regression as well as for
Classification, but it is mostly used for Classification
problems.
K-NN is a non-parametric algorithm, which means it does not make any assumptions about the
underlying data.
It is also called a lazy learner algorithm because it does not learn from the training set
immediately; instead it stores the dataset and, at the time of classification, performs an
action on the dataset.

Example: Suppose we have an image of a creature that looks similar to a cat and a dog, but
we want to know whether it is a cat or a dog. For this identification, we can use the KNN
algorithm, as it works on a similarity measure. The KNN model will find features of the new
data similar to those of the cat and dog images, and based on the most similar
features it will put it in either the cat or the dog category.
Suppose there are two categories, i.e., Category A and Category B,
and we have a new data point x1
Working of KNN
• Step-1: Select the number K of neighbors.
• Step-2: Calculate the Euclidean distance from the
new point to the data points.
• Step-3: Take the K nearest neighbors as per the
calculated Euclidean distance.
• Step-4: Among these K neighbors, count the
number of data points in each category.
• Step-5: Assign the new data point to the
category for which the number of neighbors is
maximum.
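A from-scratch sketch of these steps (plain numpy; the 2-D points and categories are hypothetical) to make the distance-count-vote loop concrete:

import numpy as np
from collections import Counter

def knn_predict(X, y, new_point, k=3):
    # Step 2: Euclidean distance from the new point to every training point
    dists = np.sqrt(((X - new_point) ** 2).sum(axis=1))
    # Step 3: indices of the K nearest neighbors
    nearest = np.argsort(dists)[:k]
    # Steps 4-5: count categories among the neighbors and take the majority
    votes = Counter(y[nearest])
    return votes.most_common(1)[0][0]

# Hypothetical data: Category A near the origin, Category B near (5, 5)
X = np.array([[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]])
y = np.array(["A", "A", "A", "B", "B", "B"])
print(knn_predict(X, y, np.array([4.5, 5.0])))   # -> "B"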
Decision Tree Classification Algorithm

• Decision Tree is a Supervised learning technique that can
be used for both Classification and Regression problems,
but it is mostly preferred for solving Classification
problems. It is a tree-structured classifier, where internal
nodes represent the features of a dataset, branches
represent the decision rules, and each leaf node represents
the outcome.
• In a Decision tree, there are two types of nodes:
the Decision Node and the Leaf Node. Decision nodes are used
to make any decision and have multiple branches, whereas
Leaf nodes are the output of those decisions and do not
contain any further branches.
• E.g. J48.
Tree building
• In order to build a tree, we use the CART
algorithm, which stands for Classification And
Regression Tree algorithm.
• A decision tree simply asks a question and,
based on the answer (Yes/No), further splits
the tree into subtrees.
How it works
• In a decision tree, for predicting the class of the given dataset, the
algorithm starts from the root node of the tree.
Steps
• Step-1: Begin the tree with the root node, say S, which contains
the complete dataset.
• Step-2: Find the best attribute in the dataset using an Attribute
Selection Measure (ASM).
• Step-3: Divide S into subsets that contain the possible values of
the best attribute.
• Step-4: Generate the decision tree node which contains the best
attribute.
• Step-5: Recursively make new decision trees using the subsets of
the dataset created in Step-3. Continue this process until a stage is
reached where you cannot further classify the nodes; the
final node is called a leaf node.
Attribute Selection Measures (ASM)

• While implementing a Decision tree, the main
issue is how to select the best attribute
for the root node and for the sub-nodes. To solve
such problems there is a technique called
the Attribute Selection Measure, or ASM.
• With this measurement, we can easily select the
best attribute for the nodes of the tree. There are
two popular techniques for ASM (a small sketch of both follows):
– Information Gain
– Gini Index
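A minimal sketch of the two measures on a hypothetical binary node, using the standard formulas (entropy = -sum(p * log2(p)), Gini = 1 - sum(p^2) over class proportions p):

import numpy as np

def entropy(labels):
    # Entropy = -sum(p * log2(p)) over class proportions p
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def gini(labels):
    # Gini index = 1 - sum(p^2) over class proportions p
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1 - (p ** 2).sum()

# Hypothetical node with 9 "yes" and 5 "no" examples
node = np.array(["yes"] * 9 + ["no"] * 5)
print(entropy(node), gini(node))   # ~0.940, ~0.459

# Information gain of a split = parent entropy - weighted child entropy
left, right = node[:8], node[8:]   # an illustrative split
gain = entropy(node) - (len(left) / len(node)) * entropy(left) \
                     - (len(right) / len(node)) * entropy(right)
print(gain)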
K-means Clustering
• K-Means Clustering is an Unsupervised
Learning algorithm which groups an
unlabeled dataset into different clusters. Here
K defines the number of pre-defined clusters
that need to be created in the process; if
K=2, there will be two clusters, for K=3
there will be three clusters, and so on.
How it works
• Step-1: Select the number K to decide the number of clusters.
• Step-2: Select K random points as centroids. (They can be points
other than those in the input dataset.)
• Step-3: Assign each data point to its closest centroid, which
will form the K predefined clusters.
• Step-4: Calculate the variance and place a new
centroid (the average) of each cluster.
• Step-5: Repeat step 3, i.e., reassign each data
point to the new closest centroid of its cluster.
• Step-6: If any reassignment occurred, go to Step-4; otherwise go to
FINISH.
• Step-7: The model is ready.
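A minimal sketch of these steps with scikit-learn's KMeans on hypothetical 2-D points; the reassign-until-stable loop above is what fit() runs internally:

import numpy as np
from sklearn.cluster import KMeans

# Hypothetical unlabeled points forming two blobs
X = np.array([[1, 1], [1.5, 2], [1, 0.5],
              [8, 8], [8.5, 9], [9, 8]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)            # cluster assignment of each point
print(kmeans.cluster_centers_)   # final centroids (cluster averages)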
Hands-on in Python
IRIS dataset.
Problem: predicting the species of a flower
based on the petal/sepal width and length
Iris data set
The numeric parameters which the dataset
contains are sepal width, sepal length, petal
width and petal length. From these parameters
we will be predicting the species of the flowers;
we will be building a machine learning project
to determine the species of the flower.
NB. This is not an image-classification algorithm
Data set extract
load ML libraries and algorithms
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import model_selection
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
Loading the data using pandas

url = "/iris.data.txt"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
dataset = pd.read_csv(url, names=names)
Summarize the data and perform analysis
• Dimensions of the data set: find out how many
rows and columns our dataset has using the
shape property.
# shape
print(dataset.shape)
Result: (150, 5), which means our dataset has
150 rows and 5 columns.
To check the first 20 rows of our dataset

print(dataset.head(20))
Statistical summary
Class distribution
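The summary and class-distribution slides show output tables; a sketch of the pandas calls that produce them, continuing with the dataset loaded above (output omitted):

# Statistical summary: count, mean, std, min, quartiles, max per column
print(dataset.describe())

# Class distribution: number of rows per species (50 each in Iris)
print(dataset.groupby('class').size())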
Visual data analysis

A box plot is a percentile-based graph which divides the data into
four quartiles of 25% each.
Histogram
Scatter plot

A scatter plot shows the correlation of each feature with respect to the other features.
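A sketch of the three plots with standard pandas/matplotlib calls, continuing from the dataset and imports above (the output figures are omitted here):

from pandas.plotting import scatter_matrix

# Box plot per feature (quartile view)
dataset.plot(kind='box', subplots=True, layout=(2, 2), sharex=False, sharey=False)
plt.show()

# Histogram per feature
dataset.hist()
plt.show()

# Scatter plot of every feature against every other
scatter_matrix(dataset)
plt.show()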


Splitting the Data

• Two data sets:
-Training dataset
-Testing dataset
Since the dataset is small (150 records), we will
use 120 (80%) records for training the model and
30 (20%) records to evaluate the model.
array = dataset.values
X = array[:,0:4]
Y = array[:,4]
x_train, x_test, y_train, y_test = model_selection.train_test_split(X, Y, test_size=0.2, random_state=7)
Training and evaluating the models
• K-Nearest Neighbour (KNN)
• Support Vector Machine (SVM)
• Random Forest
• Logistic Regression
KNN Model
model = KNeighborsClassifier()
model.fit(x_train,y_train)
predictions = model.predict(x_test)
print(accuracy_score(y_test, predictions))
SVM
model = SVC()
model.fit(x_train,y_train)
predictions = model.predict(x_test)
print(accuracy_score(y_test, predictions))
Random forest
model = RandomForestClassifier(n_estimators=10)
model.fit(x_train,y_train)
predictions = model.predict(x_test)
print(accuracy_score(y_test, predictions))
Logistic regression
model = LogisticRegression()
model.fit(x_train,y_train)
predictions = model.predict(x_test)
print(accuracy_score(y_test, predictions))
Model accuracy
K-Nearest Neighbour (KNN) = 0.9
Support Vector Machine (SVM) = 0.93333333
Random Forest = 0.8666666666666667
Logistic Regression = 0.8

Exercise: add a confusion matrix to the models (a sketch follows).
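A minimal sketch of that exercise for one model, continuing from the predictions above (the confusion_matrix import is an addition to the earlier import list):

from sklearn.metrics import confusion_matrix

# Rows are true classes, columns are predicted classes;
# off-diagonal counts are the misclassifications.
print(confusion_matrix(y_test, predictions))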
Artificial neural network (ANN)
• The term "Artificial Neural Network" is derived
from biological neural networks, which develop the
structure of the human brain. Just as the human
brain has neurons interconnected with one
another, artificial neural networks also have
neurons interconnected with one another
in the various layers of the network. These neurons
are known as nodes.
Biological Neural Network.
Artificial neuron networks (ANN)
BNN vs ANN
Artificial neural network architecture

• An Artificial Neural Network primarily consists of
three layers:
• Input Layer:
• As the name suggests, it accepts inputs in several different
formats provided by the programmer.
• Hidden Layer:
• The hidden layer sits between the input and output
layers. It performs all the calculations needed to find hidden
features and patterns.
• Output Layer:
• The input goes through a series of transformations in
the hidden layer, which finally results in output that is
conveyed through this layer.
The artificial neural network takes the inputs, computes the
weighted sum of the inputs, and includes a bias. This
computation is represented in the form of a transfer function.
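A one-neuron sketch of that computation in numpy; the inputs, weights, and bias below are hypothetical, and the sigmoid from earlier serves as the transfer function:

import numpy as np

x = np.array([0.5, 0.2, 0.1])     # inputs
w = np.array([0.4, 0.3, 0.9])     # weights
b = 0.1                           # bias

weighted_sum = np.dot(w, x) + b                # sum(w_i * x_i) + b
output = 1 / (1 + np.exp(-weighted_sum))       # transfer (activation) function
print(output)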
Advantages of Artificial Neural
Networks (ANN)
• Parallel processing capability:
Artificial neural networks can perform more than one task simultaneously.
• Storing data on the entire network:
Unlike traditional programming, the information is stored across the whole network, not
in a database. The disappearance of a couple of pieces of data in one place does not
prevent the network from working.
• Capability to work with incomplete knowledge:
After training, an ANN may produce output even with inadequate data.
The loss of performance depends on the significance of the missing data.
• Having a memory distribution:
For an ANN to be able to adapt, it is important to determine suitable examples and to
train the network according to the desired output by showing these
examples to the network. The success of the network is directly proportional to the
chosen instances; if the event cannot be shown to the network in all its aspects, it can
produce false output.
• Having fault tolerance:
Corruption of one or more cells of an ANN does not prevent it from generating output,
and this feature makes the network fault-tolerant.
Disadvantages of ANN
• Assurance of proper network structure:
There is no particular guideline for determining the structure of an artificial neural
network. An appropriate network structure is achieved through experience and
trial and error.
• Unrecognized behavior of the network:
This is the most significant issue with ANNs. When an ANN produces a solution, it does
not provide insight into why and how, which decreases trust in the network.
• Hardware dependence:
Artificial neural networks need processors with parallel processing power, according to their
structure. The network is therefore hardware-dependent.
• Difficulty of showing the problem to the network:
ANNs can work only with numerical data, so problems must be converted into numerical
values before being introduced to the ANN. The representation chosen
here directly impacts the performance of the network and relies on the user's
abilities.
• The duration of the network is unknown:
Training is stopped when the error is reduced to a specific value, and this value does not
guarantee optimum results.
How they work
• An Artificial Neural Network can be best represented
as a weighted directed graph, where the artificial
neurons form the nodes. The associations
between neuron outputs and neuron inputs
can be viewed as directed edges with
weights.
• The Artificial Neural Network receives the input
signal from an external source in the form of a
pattern or image, as a vector. These
inputs are then mathematically denoted x(n) for
every n-th input.
Activation function
• The activation function (a mathematical
equation) refers to the transfer
function used to determine the desired
output.
• Sigmoid function (0 to 1)
• Hyperbolic tangent (tanh) (-1 to 1)
• ReLU (Rectified Linear Unit) (0 to infinity)
Sigmoid

The main reason why we use the sigmoid function is that its output lies between
0 and 1. Therefore, it is especially used in models where we have to predict a
probability as the output. Since the probability of anything exists only in the
range of 0 and 1, sigmoid is the right choice.
Hyperbolic tangent
The range of the tanh function is (-1 to 1).
tanh is also sigmoidal (S-shaped).
The advantage is that negative inputs will be mapped strongly negative
and zero inputs will be mapped near zero in the tanh graph.
ReLU
• The ReLU is the most used activation function in the world
right now, since it is used in almost all convolutional
neural networks and deep learning models. The ReLU is half rectified
(from the bottom): f(z) is zero when z is less than zero, and f(z) is
equal to z when z is above or equal to zero. A sketch of all three
activation functions follows.
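A minimal numpy sketch of the three functions side by side:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))      # output in (0, 1)

def tanh(z):
    return np.tanh(z)                # output in (-1, 1)

def relu(z):
    return np.maximum(0, z)          # 0 for z < 0, z otherwise

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(sigmoid(z))
print(tanh(z))
print(relu(z))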
Types of Artificial Neural Network

Feedback ANN (recurrent)
• In this type of ANN, the output returns into the network to
achieve the best-evolved results internally. Feedback
networks feed information back into themselves and are well suited to
solving optimization problems.
Feed-Forward ANN:
• A feed-forward network is a basic neural network comprising an
input layer, an output layer, and at least one layer of neurons.
Through assessment of its output by reviewing its input, the
strength of the network can be noticed based on the group behavior of
the associated neurons, and the output is decided. The primary
advantage of this network is that it figures out how to evaluate and
recognize input patterns.
Deep learning
Deep learning is a branch of machine learning,
which is itself a subset of artificial intelligence.
Deep learning is a collection of statistical
machine learning techniques for learning
feature hierarchies, based on
artificial neural networks.
It is basically neural networks with multiple
hidden layers.
Example of Deep Learning
In face recognition, the 1st hidden layer will determine face
features, i.e., it will fixate on eyes, nose, lips, etc.
The 2nd hidden layer will then determine the
correct face, and so on through further hidden layers.
As more hidden layers are added, we are able to
solve more complex problems.
Deep learning applications
• Self-Driving Cars
A self-driving car captures images of its surroundings by processing
a huge amount of data, and then decides which actions to take:
turn left, turn right, or stop. Deciding these actions correctly
can reduce the accidents that happen every year.
• Voice-Controlled Assistance
When we talk about voice-controlled assistance, Siri is the one thing
that comes to mind. You can tell Siri whatever you want it to do
for you, and it will search it for you and display the results.
• Automatic Image Caption Generation
Whatever image you upload, the algorithm will work in such a way
that it generates a caption accordingly. If you say blue-colored eye, it will
display a blue-colored eye with a caption at the bottom of the image.
• Automatic Machine Translation
With the help of automatic machine translation, we are able to convert
text in one language into another with the help of deep learning.
Advantages
• It lessens the need for feature engineering.
• It eliminates unnecessary costs.
• It easily identifies difficult defects.
• It results in the best-in-class performance on
problems.
Disadvantages
• It requires an ample amount of data.
• It is quite expensive to train.
• It does not have strong theoretical groundwork.
Convolutional Neural Network

• A Convolutional Neural Network is one of the
main approaches to image classification and
image recognition in neural networks. Scene
labeling, object detection, face
recognition, etc. are some of the areas where
convolutional neural networks are widely
used.
• Some neurons fire when exposed to vertical edges and
some when shown horizontal or diagonal edges. A CNN
utilizes the spatial correlations which exist within the input data.
Each concurrent layer of the neural network connects some
input neurons. This region is called a local receptive field.
The local receptive field focuses on hidden neurons.

• A Convolutional Neural Network has the following 4 layers:
• Convolutional
• ReLU Layer
• Pooling
• Fully Connected
• A CNN takes an image as input, which is classified and
processed under a certain category such as dog, cat, lion,
tiger, etc. The computer sees an image as an array of
pixels, depending on the resolution of the image.
Based on the image resolution, it sees h * w * d,
where h = height, w = width and d = depth. For
example, an RGB image is a 6 * 6 * 3 array of the matrix,
and a grayscale image is a 4 * 4 * 1 array of the matrix.
• In a CNN, each input image passes through a sequence
of convolution layers with filters (also known as kernels),
along with pooling and fully connected layers. After
that, we apply the Softmax function to classify an
object with probabilistic values between 0 and 1.
Convolution Layer

The convolution layer is the first layer used to extract
features from an input image. By learning image
features using small squares of input data, the
convolutional layer preserves the relationship
between pixels. It is a mathematical operation
which takes two inputs: an image matrix
and a kernel or filter, as in the sketch below.
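A minimal numpy sketch of that operation: slide a small kernel over the image matrix and take the elementwise-product sum at each position (a "valid" convolution with no padding; the matrices are hypothetical):

import numpy as np

def convolve2d(image, kernel):
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # elementwise product of the window and the kernel, summed
            out[i, j] = (image[i:i + kh, j:j + kw] * kernel).sum()
    return out

image = np.array([[1, 1, 1, 0],
                  [0, 1, 1, 1],
                  [0, 0, 1, 1],
                  [0, 0, 1, 1]])
kernel = np.array([[1, 0],
                   [0, 1]])        # a tiny 2x2 filter
print(convolve2d(image, kernel))   # 3x3 feature map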
ReLU
Rectified Linear unit(ReLU) transform functions only activates a node if the input is
above a certain quantity. While the data is below zero, the output is zero, but when
the input rises above a certain threshold. It has a linear relationship with the
dependent variable.
In this layer, we remove every negative value from the filtered images and replaces
them with zeros.
It is happening to avoid the values from adding up to zero.
Pooling layer
• Pooling layer plays a vital role in pre-processing of any image.
Pooling layer reduces the number of the parameter when the image
is too large. Pooling is "downscaling" of the image achieved from
previous layers. It can be compared to shrink an image to reduce
the image's density. Spatial pooling is also called downsampling and
subsampling, which reduce the dimensionality of each map but
remains essential information. These are the following types of
spatial pooling.
• We do this by implementing the following 4 steps:
– Pick a window size (usually 2 or 3)
– Pick a stride (usually 2)
– Walk your window across your filtered images
– From each window, take the maximum value
Max pooling
• Max pooling is a sample-based discretization
process. The main objective of max pooling is to
downscale an input representation, reducing its
dimensionality and allowing assumptions to be
made about the features contained in the binned
sub-regions.
• Max pooling is performed by applying a max filter
to non-overlapping sub-regions of the initial
representation. A sketch follows.
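A numpy sketch of 2x2 max pooling with stride 2 (the four steps above) on a hypothetical 4x4 feature map:

import numpy as np

def max_pool(feature_map, size=2, stride=2):
    h, w = feature_map.shape
    out = np.zeros((h // stride, w // stride))
    for i in range(0, h - size + 1, stride):
        for j in range(0, w - size + 1, stride):
            # take the maximum value from each window
            out[i // stride, j // stride] = feature_map[i:i + size, j:j + size].max()
    return out

fm = np.array([[1, 3, 2, 4],
               [5, 6, 1, 2],
               [7, 2, 9, 1],
               [3, 4, 1, 8]])
print(max_pool(fm))   # [[6, 4], [7, 9]]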
Fully Connected (Dense) Layer

The fully connected layer (dense layer) is a layer
where the input from the other layers is
flattened into a vector. It transforms the
output into the desired number of classes for
the network.
• In the corresponding diagram (not reproduced here), the
feature-map matrix is converted into a vector x1, x2, x3, ...,
xn with the help of a fully connected layer. We
combine features to create a model and
apply an activation function such
as softmax or sigmoid to classify the outputs
as a car, dog, truck, etc.
CNN use case
Model training
Training of CNN in TensorFlow

• The MNIST database (Modified National Institute of
Standards and Technology database) is an extensive
database of handwritten digits which is used for
training various image processing systems. It was
created by "re-mixing" samples from NIST's original
datasets.
• MNIST is the equivalent of Hello World for image analysis.
It consists of handwritten digits, 0-9, in 28x28 pixel
squares.
Each grayscale pixel contains an integer 0-255 to
indicate darkness, with 0 white and 255 black.
There are 60,000 training records and
10,000 test records.
Softmax regression in TensorFlow
• There are only ten possibilities for an
MNIST digit: 0 to 9. Our aim
is to look at an image and state the
probability that the given image is a
particular digit. Softmax is used here because
the regression gives us values
between 0 and 1 that sum up to 1. Therefore,
our approach can be simple; a sketch of the
function follows.
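A minimal numpy sketch of the softmax function itself (the shift by the maximum is a standard numerical-stability trick, not required by the math):

import numpy as np

def softmax(logits):
    z = logits - logits.max()   # stability shift; does not change the result
    exp = np.exp(z)
    return exp / exp.sum()      # values in (0, 1) that sum to 1

scores = np.array([2.0, 1.0, 0.1])   # hypothetical per-digit scores
print(softmax(scores))               # e.g. [0.659, 0.242, 0.099]
print(softmax(scores).sum())         # 1.0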
Code execution
• We execute our code in Google Colab (an online
editor for machine learning).
• Link: https://colab.research.google.com
Or a Jupyter notebook.
Steps in training CNN
• Step 1: Upload Dataset
• Step 2: The Input layer
• Step 3: Convolutional layer
• Step 4: Pooling layer
• Step 5: Convolutional layer and Pooling Layer
• Step 6: Dense layer
• Step 7: Logit Layer
CNN & TF
Step 1: Upload Dataset

• The MNIST dataset is available with scikit-learn at a
fixed URL (Uniform Resource Locator). We can download it and
store it in our downloads, and load it with fetch_mldata('MNIST original').
• Create a test/train set
We need to split the dataset using train_test_split.
• Scale the features
Finally, we scale the features with the help
of MinMaxScaler.
Load code(jupyter)
After load

x_train contains 60k arrays of 28x28.
The y_train vector contains the corresponding labels for these.
x_test contains 10k arrays of 28x28.
The y_test vector contains the corresponding labels for these.
Matplotlib visualization
import matplotlib.cm as cm
import matplotlib.pyplot as plt

plt.imshow(x_train[55].reshape(28, 28), cmap=cm.Greys)
NB: there are many colormaps, e.g. Purples, Blues, etc.
Plotting a bunch of records
Use Matplotlib for this task (18 records):
images = x_train[0:18]
fig, axes = plt.subplots(3, 6, figsize=[9, 5])
for i, ax in enumerate(axes.flat):
    ax.imshow(x_train[i].reshape(28, 28), cmap=cm.Greys)
    ax.set_xticks([])
    ax.set_yticks([])
plt.show()
Distribution of training data labels

counts = np.bincount(y_train)
nums = np.arange(len(counts))
plt.bar(nums, counts)
plt.show()
print(counts)
Applying Keras/TensorFlow neural
network
Use tensorflow to train the model with 60k training records,
compile the model, and classify 10k test records with 98%
accuracy.
Create the model
Build the Keras model by stacking layers into the network. Our
model here has four layers (a sketch follows the list):
• Flatten reshapes the data into a 1-dimensional array.
• Dense tells the model to use output arrays of shape (*, 512)
and sets the rectified linear activation function.
• Dropout applies dropout to the input to help avoid
overfitting.
• The next Dense line condenses the output into probabilities
for each of the 10 digits.
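A sketch of those four layers in tf.keras; the layer sizes follow the description above, and the dropout rate of 0.2 is an assumption:

import tensorflow as tf

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),     # 28x28 image -> 784-vector
    tf.keras.layers.Dense(512, activation='relu'),     # (*, 512) with ReLU
    tf.keras.layers.Dropout(0.2),                      # assumed rate, against overfitting
    tf.keras.layers.Dense(10, activation='softmax'),   # probabilities for 10 digits
])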
Compile the model

• Adam is an optimization algorithm that uses stochastic
gradient descent to update network weights.
• Sparse categorical cross-entropy is the loss function
required to compile the model. The loss function measures
how accurate the model is during training; we want to
minimize this function to steer the model in the right
direction.
• A metric is a function that is used to judge the performance
of your model. We are using the accuracy of our predictions, as
compared to y_test, as our metric.
Lastly, we fit our training data into the model, with several
training repetitions (epochs), then evaluate on our test data.
A sketch of these calls follows.
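A sketch of the compile/fit/evaluate calls, continuing from the model above; epochs=5 is an assumed value, and scaling pixel values to 0-1 is a common preprocessing step assumed here:

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Assumed preprocessing: scale pixel values from 0-255 to 0-1
x_train_s, x_test_s = x_train / 255.0, x_test / 255.0

model.fit(x_train_s, y_train, epochs=5)   # several training repetitions
model.evaluate(x_test_s, y_test)          # accuracy on the 10k test records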
output
CIFAR-10 and CIFAR-100 Dataset in
TensorFlow
• The CIFAR-10 (Canadian Institute For Advanced
Research) and CIFAR-100 datasets are labeled subsets of the 80
million tiny images dataset. They were collected by Alex
Krizhevsky, Geoffrey Hinton and Vinod Nair. The dataset is
divided into five training batches and one test batch,
each with 10000 images.
• The test batch contains 1000 randomly-selected images
from each class. The training batches contain the remaining
images in random order, but some training batches may
contain more images from one class than another. Between
them, the training batches contain exactly 5000 images of
each class.
Generate predictions for test set

The predictions are in the form of a list of 10 floats,
with probabilities for each value. We can get the
prediction by picking the index of the list item with
the highest probability, and we can visualize that
item to verify our prediction.
predictions = model.predict(x_test)
print(predictions[88])
print(np.argmax(predictions[88]))
plt.imshow(x_test[88].reshape(28, 28), cmap=cm.Greys)
