A Preliminary Idea On Machine Learning
AVIJIT BOSE
DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING
MCKVIE, LILUAH, HOWRAH-711204
WHAT IS MACHINE LEARNING?
BACKGROUND
• In 1997, IBM's Deep Blue, a supercomputer built on AI techniques,
defeated the world chess champion Garry Kasparov.
• It was the first time that the name "AI" became widely popular
outside academic domains as well.
• Some concepts of AI were then combined with portions of inferential
statistics and mathematical modeling, which gave rise to Machine
Learning.
• A pioneer in the area of Machine Learning is Tom Mitchell.
• Some of the areas that can be covered with Machine Learning:
classification problems, clustering problems, prediction,
cognitive modeling.
• In the next slide we cover human learning.
Human Modeling/Learning- Inspiration
for Machine Learning
• Learning under Expert Guidance: directly from experts.
• Learning Guided by Knowledge Gained from Experts: we build our own
notion based on what we have learnt from some experts in the past.
• Learning by Self/Self-Learning: doing a certain job by ourselves,
possibly multiple times, before being successful.
Definition of Machine Learning
• Tom M. Mitchell, mentioned in the first slide, defines machine
learning as: "A computer program is said to learn from experience E
with respect to some class of tasks T and performance measure P if
its performance at tasks in T, as measured by P, improves with
experience E."
• How do Machines Learn?
(a) Data Input
(b) Data Abstraction
(c) Generalization
So whatever data is given as input must first be processed well; this
is where the concept of feature engineering comes in. The processed
data is then passed through the underlying algorithm, which is the
second phase (abstraction), and finally the abstracted representation
is generalized to form a framework for making decisions.
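The three phases above can be sketched with a tiny nearest-neighbour classifier (the K-Nearest Neighbor method listed later, with K = 1). The data points and labels here are invented purely for illustration:

```python
import math

# Phase (a) Data input: labelled examples, (feature1, feature2) -> class.
# The numbers and labels are hypothetical.
training_data = [
    ((1.0, 1.0), "cat"),
    ((1.2, 0.9), "cat"),
    ((5.0, 5.2), "dog"),
    ((4.8, 5.1), "dog"),
]

def distance(p, q):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def predict(x, data):
    """Phase (b) abstraction: for 1-NN the stored examples ARE the model.
    Phase (c) generalization: an unseen point takes the label of its
    nearest stored neighbour."""
    _, label = min(data, key=lambda pair: distance(pair[0], x))
    return label

print(predict((1.1, 1.0), training_data))  # prints "cat"
```

Even this minimal sketch shows the pattern: input data is collected, an internal representation is kept, and the representation is used to decide about objects never seen during training.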
Different Types of Learning
• Supervised Learning: a machine predicts the class of unknown objects
based on prior class-related information about similar objects
(predictive learning).
• Unsupervised Learning: a machine finds patterns in unknown objects
by grouping similar objects together (descriptive learning).
• Reinforcement Learning: a machine learns to act on its own to
achieve a given goal. (Largely still a research area; notable
successes include Google DeepMind's AlphaGo.)
Difference Between Classification and
Regression
• Classification: when we are trying to predict a categorical or
nominal variable, it is known as a classification problem.
• Regression: when we try to predict a real-valued variable, the
problem falls under the category of regression.
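The distinction shows up in both the output type and the evaluation metric. A minimal sketch with hypothetical housing data: the simplest classification baseline predicts the most common category and is scored by accuracy, while the simplest regression baseline predicts the mean and is scored by mean squared error:

```python
# Hypothetical targets over the same five houses: a categorical label
# (classification) and a real-valued price (regression).
house_types = ["flat", "flat", "villa", "flat", "villa"]   # categorical
house_prices = [120.0, 135.0, 400.0, 128.0, 390.0]         # real-valued

# Classification: predict a category; the mode is a natural baseline,
# scored by accuracy (fraction of exact matches).
mode = max(set(house_types), key=house_types.count)
accuracy = sum(t == mode for t in house_types) / len(house_types)

# Regression: predict a number; the mean is a natural baseline,
# scored by mean squared error (exact matches are not expected).
mean = sum(house_prices) / len(house_prices)
mse = sum((p - mean) ** 2 for p in house_prices) / len(house_prices)

print(mode, accuracy)       # prints: flat 0.6
print(round(mean, 1))       # prints: 234.6
```

Note that accuracy would make no sense for prices (two prices are almost never exactly equal), which is why the two problem types use different loss functions.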
When will supervised Learning fail?
Supervised learning will fail when the quantity or quality of the
data is not up to the mark.
Image Showing Supervised Learning
Supervised Learning
• Naïve Bayes
• Decision Tree
• K –Nearest Neighbor
• It has been claimed that ML can help save the lives of up to 52% of
patients suffering from breast cancer.
Linear Regression
• The objective is to predict numerical features such as real estate
value, stock price, or temperature.
• It follows the statistical least-squares method.
• A straight-line relationship is fitted between the predictor
variable and the target variable.
• In simple linear regression only one predictor variable is used,
whereas in multiple linear regression (MLR) several predictor
variables are used.
What does Linear Regression look like?
A typical Linear Regression
• Y = β + λX
• Y = target output
• X = predictor variable
• β = intercept, λ = slope coefficient
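The least-squares estimates for the slide's model Y = β + λX can be computed in closed form. A minimal sketch, using a synthetic data set generated so that the true intercept is 2 and the true slope is 3 (all numbers are hypothetical):

```python
# Fit Y = beta + lambda * X by least squares on synthetic data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 + 3.0 * x for x in xs]   # generated with beta = 2, lambda = 3

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares solution for one predictor:
#   lambda = cov(X, Y) / var(X),   beta = mean(Y) - lambda * mean(X)
lam = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
       / sum((x - mean_x) ** 2 for x in xs))
beta = mean_y - lam * mean_x

print(beta, lam)   # recovers ~2.0 and ~3.0
```

On noisy real data the recovered β and λ would only approximate the true relationship; here they match exactly because the data lie on the line.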
Hence, 57% of the students who like English also like Mathematics.
Applying Bayes Theorem
• P(Cause|Effects) = P(Effects|Cause) · P(Cause) / P(Effects)
• Question: what is the probability that a patient has the disease
meningitis, given a stiff neck?
• Given Data:
A doctor is aware that the disease meningitis causes a patient to have
a stiff neck 80% of the time. He is also aware of some more facts,
which are given as follows:
• The known probability that a patient has meningitis is 1/30,000.
• The known probability that a patient has a stiff neck is 2%.
Ans: Let a be the proposition that the patient has a stiff neck and b
the proposition that the patient has meningitis. Then we can calculate
the following:
• P(a|b) = 0.8
• P(b) = 1/30000
• P(a) = 0.02
P(b|a) = P(a|b) · P(b) / P(a) = 0.8 × (1/30000) / 0.02 ≈ 0.00133333
Hence, we can conclude that roughly 1 in 750 patients with a stiff
neck has meningitis.
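The meningitis calculation above can be written out directly from Bayes' theorem:

```python
# The slide's meningitis example: P(b|a) = P(a|b) * P(b) / P(a)
p_a_given_b = 0.8        # P(stiff neck | meningitis)
p_b = 1 / 30000          # P(meningitis)
p_a = 0.02               # P(stiff neck)

p_b_given_a = p_a_given_b * p_b / p_a
print(p_b_given_a)               # ~0.0013333, i.e. 1/750
print(round(1 / p_b_given_a))    # ~750, so about 1 patient in 750
```

The small result illustrates the classic base-rate effect: even though meningitis almost always causes a stiff neck, the disease itself is so rare that a stiff neck alone is weak evidence for it.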
Bayesian Network Graph
Joint Probability Distribution
• If we have variables x1, x2, x3, ..., xn, then the probability of
each combination of x1, x2, x3, ..., xn is known as the joint
probability distribution.
• P[x1, x2, x3, ..., xn] can be written in the following way in terms
of conditional probabilities:
• = P[x1 | x2, x3, ..., xn] P[x2, x3, ..., xn]
• = P[x1 | x2, x3, ..., xn] P[x2 | x3, ..., xn] ... P[xn-1 | xn] P[xn]
• In a Bayesian network, for each variable Xi this simplifies to:
• P(Xi | Xi-1, ..., X1) = P(Xi | Parents(Xi))
Types of Stochastic Process
• A real stochastic process is a collection of random variables
{Xt, t ∈ T} defined on a probability space.
• Four types of Stochastic Process:
• a) Discrete state & discrete time stochastic process [Markov Chain].
S = {0,1,2,...}, T = {0,1,2,3,...}. Ex: discount in motor insurance
depending on road accidents, with S = {0%, 10%, 20%} and T = {0,1,2,...}.
• b) Discrete state & continuous time stochastic process
[Continuous-Time Markov Chain]. S = {0,1,2,...}, T = {t : 0 ≤ t ≤ ∞}.
Ex: number of cars in a car park over the time interval (0, t), with
S = {0,1,2,...}.
• c) Continuous state & discrete time stochastic process.
T = {0,1,2,...} and S = {x : 0 ≤ x ≤ ∞}. Ex: share price of an asset
at the close of trading on each day.
• d) Continuous state & continuous time stochastic process.
T = {t : 0 ≤ t ≤ ∞} and S = {x : 0 ≤ x ≤ ∞}. Ex: share price of an
asset tracked continuously through the trading day.
If 90% of the drivers in the community are in the low-risk category
this year, what is the probability that a driver chosen at random from
the community will be in the low-risk category next year? The year
after next? (Answers: 0.96 and 0.972, computed from the transition
matrices.)
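The transition matrix itself is not shown here, so the one in this sketch is an assumption: it was reverse-engineered to reproduce the quoted answers (0.96 after one year, 0.972 after two) and the stationary matrix [0.975 0.025] discussed later. Rows and columns are ordered [low-risk, high-risk]:

```python
# ASSUMED transition matrix (not given on the slide), reconstructed to
# match the quoted answers. Rows/columns: [low-risk, high-risk].
P = [[0.98, 0.02],   # a low-risk driver stays low-risk with prob. 0.98
     [0.78, 0.22]]   # a high-risk driver becomes low-risk with prob. 0.78

def step(state, P):
    """One year of evolution: next_state = state * P (row vector times
    matrix)."""
    return [sum(state[i] * P[i][j] for i in range(len(P)))
            for j in range(len(P[0]))]

state = [0.90, 0.10]        # 90% low-risk this year
year1 = step(state, P)
year2 = step(year1, P)
print(year1[0], year2[0])   # low-risk fractions: 0.96, then 0.972
```

Any matrix giving the same two answers and the same limiting behaviour would do; the point is the mechanics of multiplying the state row vector by the transition matrix once per year.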
Stationary matrix
• When we computed later state matrices of the previous problem, we
saw that the numbers appeared to be approaching fixed values.
• If we calculated the 5th, 6th, ..., kth state matrices, we would
find that they approach a limiting matrix of [0.975 0.025]. This final
matrix is called a stationary matrix.
• The stationary matrix S for a Markov chain with transition matrix P
has the property that SP = S. To prove that [0.975 0.025] is the
stationary matrix, we need to show that SP = S.
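The property SP = S can be checked numerically. Since the slide does not show the transition matrix, the one below is an assumption, reconstructed to match the quoted numbers (0.96, 0.972, and the limit [0.975 0.025]):

```python
# Verify the stationary property S * P = S for S = [0.975, 0.025],
# using an ASSUMED transition matrix consistent with the slides.
P = [[0.98, 0.02],
     [0.78, 0.22]]
S = [0.975, 0.025]

# Row vector times matrix: SP[j] = sum_i S[i] * P[i][j]
SP = [sum(S[i] * P[i][j] for i in range(2)) for j in range(2)]
print(SP)   # equals S, so S is stationary for this P
```

Because SP reproduces S exactly, multiplying by P again changes nothing: once the chain reaches the stationary distribution, it stays there, which is exactly why the state matrices in the problem converge to it.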
Conclusion
• However, a stationary matrix is not always achievable. A few topics
could not be covered:
1. ANN – SLP, MLP, etc.
2. SVM, KNN
3. Deep Learning, CNN (research phase)
Machine Learning is as vast as an ocean, and I know only a small pond
of it. I hope that with my little knowledge I have at least given some
introduction to the topics.
THANK YOU