Machine Learning: Bilal Khan
Machine Learning
Machine learning is a growing technology that enables computers to learn
automatically from past data. It uses various algorithms to build
mathematical models and make predictions from historical data or
information. It is currently used for tasks such as image recognition,
speech recognition, email filtering, Facebook auto-tagging, recommender
systems, and many more.
This machine learning tutorial gives you an introduction to machine learning along
with a wide range of machine learning techniques, such
as supervised, unsupervised, and reinforcement learning. You will learn
about regression and classification models, clustering methods, hidden Markov
models, and various sequential models.
What is Machine Learning
Machine learning is an application of artificial intelligence (AI) that gives systems the ability to
learn and improve automatically from experience without being explicitly programmed. Machine
learning focuses on the development of computer programs that can access data and use it to learn
for themselves.
The process of learning begins with observations or data, such as examples, direct experience, or
instruction, so that the system can look for patterns in the data and make better decisions in the
future based on the examples we provide. The primary aim is to allow computers to learn automatically,
without human intervention or assistance, and to adjust their actions accordingly.
We can train machine learning algorithms by providing them with a large amount of
data and letting them explore the data, construct models, and predict the required
output automatically. The performance of a machine learning algorithm depends
on the amount of data, and it can be measured with a cost function. With the
help of machine learning, we can save both time and money.
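As a minimal sketch of how a cost function scores a model, the snippet below computes mean squared error, one common choice of cost function, for two sets of predictions. The function name and all numbers are illustrative assumptions, not part of the original material.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical targets and the predictions of two candidate models.
actual = [3.0, 5.0, 7.0, 9.0]
model_a = [2.5, 5.5, 6.5, 9.5]   # every prediction is off by 0.5
model_b = [1.0, 5.0, 9.0, 9.0]   # two predictions are off by 2.0

cost_a = mean_squared_error(actual, model_a)
cost_b = mean_squared_error(actual, model_b)
print(cost_a, cost_b)  # the model with the lower cost fits the data better
```

Here model_a has the lower cost, so under this measure it would be the better of the two.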
The importance of machine learning is easy to see from its use cases.
Machine learning is currently used in self-driving cars, cyber fraud
detection, face recognition, friend suggestions on Facebook, and more.
Top companies such as Netflix and Amazon have built machine learning models
that use vast amounts of data to analyze user interests and recommend
products accordingly.
The following key points show the importance of machine learning:
Lack of human expertise
The first scenario in which we want a machine to learn and make data-driven decisions is a
domain where human expertise is lacking. Examples include navigation in unknown territories
or on other planets.
Dynamic scenarios
Some scenarios are dynamic in nature, i.e. they keep changing over time. For such scenarios
and behaviors, we want a machine to learn and make data-driven decisions. Examples include
network connectivity and the availability of infrastructure in an organization.
A widely cited formal definition of machine learning, due to Tom Mitchell, is:
“A computer program is said to learn from experience E with respect to some class of tasks T
and performance measure P, if its performance at tasks in T, as measured by P, improves with
experience E.”
The definition above centers on three parameters, which are also the main components of
any learning algorithm: task (T), performance (P), and experience (E). Each is described
in turn:
Task(T)
From the perspective of the problem, we may define the task T as the real-world problem to be
solved. The problem can be anything, such as finding the best house price in a specific location
or finding the best marketing strategy. In machine learning, however, the definition of a task
is different, because ML-based tasks are difficult to solve with a conventional programming
approach.
A task T is said to be an ML-based task when it is defined by the process the system must
follow to operate on data points. Examples of ML-based tasks are classification,
regression, structured annotation, clustering, and transcription.
Experience (E)
As the name suggests, experience is the knowledge gained from the data points provided to the
algorithm or model. Once provided with the dataset, the model runs iteratively and learns some
inherent pattern. The learning thus acquired is called experience (E). By analogy with
human learning, we can think of a person gaining experience from various attributes such as
situations and relationships. Supervised, unsupervised, and reinforcement learning are some
of the ways to learn or gain experience. The experience gained by our ML model or algorithm
is then used to solve the task T.
Performance (P)
An ML algorithm is supposed to perform a task and gain experience over time.
The measure that tells us whether the ML algorithm is performing as expected is its
performance (P). P is a quantitative metric that tells how well a model performs the
task T using its experience E. Many metrics help in understanding ML
performance, such as accuracy score, F1 score, confusion matrix, precision, recall, and
sensitivity.
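To make some of these metrics concrete, here is a small sketch that derives accuracy, precision, recall, and F1 score from the four confusion-matrix counts of a binary classifier. The function names and the label lists are hypothetical, chosen only for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Compute common performance metrics from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0  # recall is also called sensitivity
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical true labels and model predictions (1 = positive class).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this example there are 3 true positives, 3 true negatives, 1 false positive, and 1 false negative, so all four metrics come out to 0.75.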
Challenges in Machine Learning
While machine learning is evolving rapidly and making significant strides in cybersecurity and
autonomous cars, this segment of AI as a whole still has a long way to go. The reason is
that ML has not yet overcome a number of challenges. The challenges ML currently faces
include:
Quality of data: Having good-quality data for ML algorithms is one of the biggest challenges.
Low-quality data leads to problems in data preprocessing and feature extraction.
No clear objective for formulating business problems: Having no clear objective and
well-defined goal for business problems is another key challenge for ML, because the
technology is not yet that mature.
Curse of dimensionality: Another challenge ML models face is having too many features per
data point, which can be a real hindrance.
Classification of Machine Learning
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning
1) Supervised Learning
In supervised learning, the system builds a model from labeled data in order to
understand the dataset and learn about each data point. Once training and processing
are done, we test the model on sample data to check whether it predicts the correct
output.
The goal of supervised learning is to map input data to output data. Supervised
learning is based on supervision, much as a student learns under the supervision
of a teacher. Spam filtering is an example of supervised learning.
Supervised learning problems fall into two categories:
• Classification
• Regression
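As a rough illustration of classification with labeled data, the sketch below trains a nearest-centroid classifier on a toy spam-filtering dataset. The two features (number of links and number of exclamation marks), the data, and the function names are assumptions made for this example. Nearest-centroid is only one simple technique, not the method used by any real spam filter.

```python
from math import dist


def fit_centroids(features, labels):
    """Average the feature vectors of each label to get one centroid per class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Return the label whose centroid is closest to x."""
    return min(centroids, key=lambda y: dist(centroids[y], x))


# Hypothetical labeled emails: [number of links, number of exclamation marks].
train_x = [[8, 6], [7, 9], [9, 7], [0, 1], [1, 0], [2, 1]]
train_y = ["spam", "spam", "spam", "ham", "ham", "ham"]
centroids = fit_centroids(train_x, train_y)

# Test the trained model on unseen sample data.
print(predict(centroids, [6, 8]))
print(predict(centroids, [1, 1]))
```

The labels supplied in train_y play the role of the teacher: the model only learns what "spam" looks like because each training example is tagged with the correct answer.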
2) Unsupervised Learning
In unsupervised learning, the machine is trained on data that has not been
labeled, classified, or categorized, and the algorithm must act on that data
without any supervision. The goal of unsupervised learning is to restructure the
input data into new features or into groups of objects with similar patterns.
Unsupervised learning problems fall into two categories:
• Clustering
• Association
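Clustering can be sketched with k-means, one widely used clustering algorithm. The version below is a bare-bones illustration: the points and the starting centroids are hypothetical, and the starting centroids are passed in explicitly to keep the result repeatable (practical implementations usually choose them randomly or with a smarter heuristic).

```python
from math import dist


def k_means(points, centroids, iterations=10):
    """Alternate between assigning each point to its nearest centroid and
    moving each centroid to the mean of the points assigned to it."""
    centroids = [list(c) for c in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(axis) / len(members)
                                for axis in zip(*members)]
    return centroids, clusters


# Hypothetical unlabeled 2-D points that form two visible groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = k_means(points, centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```

No labels are supplied anywhere: the algorithm groups the points purely by how close they are to one another.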
3) Reinforcement Learning
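In reinforcement learning, an agent learns by interacting with its environment,
receiving rewards for desirable actions and penalties for undesirable ones. A
robotic dog that automatically learns how to move its limbs is an example of
reinforcement learning.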
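The reward-driven, trial-and-error flavor of reinforcement learning can be sketched with an epsilon-greedy multi-armed bandit, one of the simplest setups in this family. This is far simpler than teaching a robot to walk. The reward probabilities, step count, and function name are all assumptions for illustration.

```python
import random


def run_bandit(reward_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: usually pick the action with the best estimated
    reward, but try a random action with probability epsilon."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    estimates = [0.0] * len(reward_probs)
    pulls = [0] * len(reward_probs)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(reward_probs))          # explore
        else:
            action = max(range(len(reward_probs)),
                         key=lambda a: estimates[a])           # exploit
        # The environment returns a reward of 1 with the action's probability.
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        pulls[action] += 1
        # Incremental average of the rewards observed for this action.
        estimates[action] += (reward - estimates[action]) / pulls[action]
    return estimates, pulls


# Hypothetical environment with three actions; the third pays off most often.
estimates, pulls = run_bandit([0.2, 0.5, 0.8])
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(estimates, pulls, best_action)
```

The agent is never told which action is best. It discovers this from the rewards it receives, which is the defining trait of reinforcement learning.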
Machine learning Life Cycle
Machine learning gives computer systems the ability to learn automatically
without being explicitly programmed. But how does a machine learning
system work? It can be described using the machine learning life cycle,
a cyclic process for building an efficient machine learning project.
The main purpose of the life cycle is to find a solution to the problem or
project.
The machine learning life cycle involves seven major steps:
1. Data gathering
2. Data preparation
3. Data wrangling
4. Data analysis
5. Train model
6. Test model
7. Deployment
The most important thing in the whole process is to understand the problem
and its purpose. Therefore, before starting the life cycle, we need to
understand the problem, because a good result depends on understanding
the problem well.
1. Data Gathering
Data gathering is the first step of the machine learning life cycle. The goal of this
step is to identify and obtain all the data relevant to the problem.
In this step, we need to identify the different data sources, as data can be
collected from various sources such as files, databases, the internet, or mobile
devices. It is one of the most important steps of the life cycle. The quantity and
quality of the collected data determine the quality of the output. In general, the
more data we have, the more accurate the prediction will be.
By performing the above task, we get a coherent set of data, also called
a dataset. It will be used in the following steps.
2. Data preparation
After collecting the data, we need to prepare it for the following steps. Data
preparation is the step where we put our data in a suitable place and prepare it
for use in machine learning training.
In this step, we first put all the data together and then randomize its order.
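Combining and shuffling might look like the following sketch. The two source lists and their contents are hypothetical. Shuffling is commonly done so that the order in which records were collected does not influence training.

```python
import random

# Hypothetical records gathered from two sources: (feature, label) pairs.
from_files = [(1.0, "a"), (2.0, "a"), (3.0, "b")]
from_database = [(4.0, "b"), (5.0, "a")]

dataset = from_files + from_database   # put all the data together
rng = random.Random(42)                # fixed seed so the shuffle is repeatable
rng.shuffle(dataset)                   # randomize the ordering in place
print(dataset)
```

Using a seeded random.Random instance rather than the module-level functions keeps the experiment reproducible without affecting other code that uses random numbers.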
Data exploration
Data exploration is used to understand the nature of the data we have to work
with. We need to understand the characteristics, format, and quality of the data.
A better understanding of the data leads to a more effective outcome. In this
step, we look for correlations, general trends, and outliers.
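Two of these checks can be sketched with the standard library: a Pearson correlation between two columns, and a simple outlier test that flags values far from the mean. The column names, the data, and the two-standard-deviation threshold are illustrative assumptions. Other outlier rules are equally valid.

```python
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def outliers(values, threshold=2.0):
    """Values more than `threshold` sample standard deviations from the mean."""
    m, s = mean(values), stdev(values)
    return [v for v in values if abs(v - m) > threshold * s]


# Hypothetical columns: house size in square metres and price in thousands.
size = [50, 60, 70, 80, 90, 100]
price = [150, 180, 210, 240, 270, 300]
print(pearson(size, price))   # close to 1: price rises in step with size

# Hypothetical sensor readings with one suspicious value.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
print(outliers(readings))
```

A correlation near 1 or -1 suggests a strong linear relationship between two columns, while flagged outliers are candidates for closer inspection rather than automatic removal.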
Data pre-processing
The next step is to preprocess the data for analysis.
3. Data Wrangling
Data wrangling is the process of cleaning raw data and converting it into a usable
format. It involves cleaning the data, selecting the variables to use, and
transforming the data into a proper format to make it more suitable for analysis in
the next step. It is one of the most important steps of the complete process.
Cleaning of data is required to address quality issues.
Not all of the data we have collected is necessarily useful to us. In real-world
applications, collected data may have various issues, including:
• Missing Values
• Duplicate data
• Invalid data
• Noise
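A minimal cleaning pass covering the first three issues might look like this. The record layout (name, age), the valid age range, and the choice to drop bad rows rather than repair them are all assumptions for the sketch. Handling noise usually needs domain-specific smoothing or filtering and is not shown.

```python
def clean(records):
    """Drop rows with missing or invalid values, and drop exact duplicates."""
    seen = set()
    cleaned = []
    for name, age in records:
        if name is None or age is None:   # missing value
            continue
        if not (0 <= age <= 120):         # invalid value (assumed valid range)
            continue
        if (name, age) in seen:           # duplicate row
            continue
        seen.add((name, age))
        cleaned.append((name, age))
    return cleaned


# Hypothetical raw records containing each kind of problem.
raw = [("Ana", 34), ("Ben", None), ("Ana", 34),
       ("Cy", -5), ("Dee", 41), (None, 29)]
cleaned = clean(raw)
print(cleaned)
```

Dropping rows is the simplest strategy. Depending on the problem, missing values may instead be filled in, for example with the column mean.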
4. Data Analysis
The cleaned and prepared data is now passed on to the analysis step. This step
involves:
• Selecting analytical techniques
• Building the model
• Reviewing the outcome
The aim of this step is to build a machine learning model to analyze the data using
various analytical techniques and to review the outcome. It starts with
determining the type of problem, where we select a machine learning
technique such as classification, regression, cluster analysis, or association.
We then build the model using the prepared data and evaluate it.
Hence, in this step, we take the data and use machine learning algorithms to build
the model.
5. Train Model
The next step is to train the model. In this step, we train our model to improve
its performance and get a better outcome for the problem.
We use datasets to train the model using various machine learning algorithms.
Training is required so that the model can learn the various patterns, rules,
and features in the data.
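As one hedged example of what training can mean in practice, the sketch below fits a straight line y = w * x + b by gradient descent on mean squared error. The data, learning rate, and number of epochs are assumptions chosen so that this small example converges. Real projects typically rely on an established library instead.

```python
def train_linear_model(xs, ys, learning_rate=0.01, epochs=5000):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error of the current model on every training point.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        # Step against the gradient to reduce the error.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = train_linear_model(xs, ys)
print(w, b)  # should end up close to 2 and 1
```

Each pass over the data nudges the parameters in the direction that lowers the error, which is how the model gradually picks up the pattern in the training set.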
6. Test Model
Once our machine learning model has been trained on a given dataset, we
test it. In this step, we check the accuracy of our model by providing it with a
test dataset.
Testing determines the percentage accuracy of the model against the
requirements of the project or problem.
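The idea of holding back a test dataset can be sketched as follows. The one-feature threshold classifier, the 80/20 split, and the data are all hypothetical, and the data is deliberately easy to separate so the example stays short.

```python
import random


def train_threshold(train):
    """'Train' a one-feature classifier: the threshold is placed halfway
    between the mean feature value of each class."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def accuracy(threshold, test):
    """Fraction of test examples whose predicted label matches the true one."""
    correct = sum(1 for x, y in test if (1 if x > threshold else 0) == y)
    return correct / len(test)


# Hypothetical data: small feature values are class 0, large ones class 1.
data = [(x, 0) for x in range(0, 10)] + [(x, 1) for x in range(20, 30)]
rng = random.Random(0)       # fixed seed so the split is repeatable
rng.shuffle(data)

split = int(0.8 * len(data))             # 80% for training, 20% for testing
train, test = data[:split], data[split:]

threshold = train_threshold(train)
test_accuracy = accuracy(threshold, test)
print(f"accuracy on unseen data: {test_accuracy:.0%}")
```

The key point is that the test examples are never shown to the model during training, so the accuracy reflects how well it generalizes rather than how well it memorized.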
7. Deployment
The last step of the machine learning life cycle is deployment, where we deploy
the model in a real-world system.
If the model prepared above produces accurate results that meet our
requirements at an acceptable speed, we deploy it in the real system.
Before deploying the project, however, we check whether the model is improving
its performance using the available data. The deployment phase is similar to
making the final report for a project.