
Machine Learning

(15CS73)
Text Book
Tom M. Mitchell, Machine Learning,
India Edition 2013, McGraw Hill

Harshavardhana Doddamani, Assistant Professor, Dept. of CSE, SJCIT (December 29, 2021)
Machine Learning
[As per Choice Based Credit System (CBCS) scheme]
(Effective from the academic year 2016 – 2017)
SEMESTER - VII
Subject Code: 15CS73                IA Marks: 20
Number of Lecture Hours/Week: 03    Exam Marks: 80
Credits: 04
Course Objectives: This course will enable students to
• Define machine learning and problems relevant to machine learning.
• Differentiate supervised, unsupervised and reinforcement learning.
• Apply neural networks, the Bayes classifier and k-nearest neighbor to problems that arise in machine learning.
• Perform statistical analysis of machine learning techniques.

Syllabus
Module1 : Well posed learning problems, Designing a Learning system, Perspective
and Issues in Machine Learning. Concept Learning: Concept learning task, Concept
learning as search, Find-S algorithm, Version space, Candidate Elimination
algorithm, Inductive Bias.
Module2: Decision tree representation, Appropriate problems for decision tree
learning, Basic decision tree learning algorithm, hypothesis space search in
decision tree learning, Inductive bias in decision tree learning, Issues in decision
tree learning
Module3: Artificial Neural Networks: Introduction, Neural Network representation,
Appropriate problems, Perceptrons, Backpropagation algorithm.
Module4: Introduction, Bayes theorem, Bayes theorem and concept learning, ML and
LS error hypothesis, ML for predicting probabilities, MDL principle, Naive Bayes
classifier, Bayesian belief networks, EM algorithm
Module5: Motivation, Estimating hypothesis accuracy, Basics of sampling theory,
General approach for deriving confidence intervals, Difference in error of two
hypotheses, Comparing learning algorithms. Instance Based Learning: Introduction,
k-nearest neighbor learning, locally weighted regression, radial basis functions,
case-based reasoning. Reinforcement Learning: Introduction, Learning Task, Q Learning.

Module 1
Chapter1
Syllabus
• 1. Introduction
• 2. Well Posed Problems
• 3. Designing a Learning System
• 4. Perspective and Issues in Machine Learning
• 5. Summary

Learning & Adaptation
• ”Modification of a behavioral tendency by experience.”
(Webster 1984)
• ”A learning machine, broadly defined, is any device whose
actions are influenced by past experiences.” (Nilsson 1965)
• ”Any change in a system that allows it to perform better the
second time on repetition of the same task or on another
task drawn from the same population.” (Simon 1983)
• ”An improvement in information processing ability that
results from information processing activity.” (Tanimoto
1990)

1. Introduction
How to program computers to learn?

Learning: improving automatically with experience

Example: computers learning from medical records which treatments are most effective for new diseases

Added value: a better understanding of human learning abilities
What is Machine Learning ?
• Machine learning is an application of artificial intelligence (AI) that
provides systems the ability to automatically learn and improve from
experience without being explicitly programmed.
• Machine learning focuses on the development of computer programs
that can access data and use it to learn for themselves.
• The process of learning begins with observations or data, such as
examples, direct experience, or instruction, in order to look for patterns
in data and make better decisions in the future based on the examples
that we provide.
• The primary aim is to allow computers to learn automatically
without human intervention or assistance and adjust actions accordingly.

Types of Machine Learning Techniques

1. Shallow Learning
• Algorithms with few layers
• Better for less complex and smaller data sets
• E.g., logistic regression and support vector machines
2. Deep Learning
• A newer technique that uses many layers of neural networks
(a model based on the structure of the human brain)
• Useful when the target function is very complex and data sets are very large
Classification of Machine Learning Algorithms

1. Supervised Learning
• Both X and Y are given
• Given an observation X, what is the best label Y?
2. Unsupervised Learning
• Only X is given
• Given a set of X, cluster or summarize them
3. Semi-Supervised Learning
4. Reinforcement Learning
• Determine what to do based on rewards and punishments
Terminologies
• Labeled data: Data consisting of a set of training
examples, where each example is a pair consisting of
an input and a desired output value (also called the
supervisory signal or label)
• Classification: The goal is to predict discrete values,
e.g. {1,0}, {True, False}, {spam, not spam}.
• Regression: The goal is to predict continuous values,
e.g. home prices.

Table 1.1 Some successful applications of machine learning.

• Learning to recognize spoken words: SPHINX (Lee 1989)
• Learning to drive an autonomous vehicle: ALVINN (Pomerleau 1989)
• Learning to classify new astronomical structures (Fayyad et al. 1995)
• Learning to play world-class backgammon (Tesauro 1992, 1995)

Table 1.2 Some disciplines and their influence on machine learning
1. Artificial Intelligence

2. Bayesian methods

3. Computational complexity theory

4. Control theory

5. Information theory

6. Philosophy

7. Psychology & neurobiology

8. Statistics

1.1 Well – Posed Learning Problems
Definition
A computer program is said to learn from
experience E with respect to some class of
tasks T and performance measure P, if its
performance at tasks in T, as measured by P,
improves with experience E.

Example1 : A Checkers Learning Problem

• Task T: playing checkers
• Performance measure P: percentage of games won against opponents
• Training experience E: playing practice games against itself

Example2: A Handwriting Recognition Learning Problem

• Task T: recognizing and classifying handwritten words
• Performance measure P: percentage of words correctly classified
• Training experience E: a database of handwritten words with given classifications
Example 3: A Robot Driving Learning Problem

• Task T: driving on public four-lane highways using vision sensors
• Performance measure P: average distance traveled before an error (as judged by a human overseer)
• Training experience E: a sequence of images and steering commands recorded while observing a human driver
1.2 DESIGNING A LEARNING SYSTEM

Steps to design a learning system


Problem Description
1.2.1 Choosing the Training Experience
1.2.2 Choosing the Target Function
1.2.3 Choosing a Representation for the Target Function
1.2.4 Choosing a Function Approximation Algorithm
1.2.4.1 Estimating Training Values
1.2.4.2 Adjusting the Weights
1.2.5 The Final Design
Problem Description: A Checker Learning
Problem
• Task T: Playing Checkers
• Performance Measure P: Percent of games won
against opponents
• Training Experience E: Games Played against
itself

1.2.1 Choosing the Training Experience

• Will the training experience provide direct or indirect feedback?


• Direct Feedback: system learns from examples of individual
checkers board states and the correct move for each
• Indirect Feedback: Move sequences and final outcomes of various
games played
• Credit assignment problem: Value of early states must be
inferred from the outcome
• Degree to which the learner controls the sequence of training examples
• Teacher selects informative boards and gives correct move
• Learner proposes board states that it finds particularly confusing.
Teacher provides correct moves
• Learner controls board states and (indirect) training classifications

(continued…)

• How well the training experience represents the distribution of examples over which the final system performance P will be measured
• If training the checkers program consists only of
experiences played against itself, it may never
encounter crucial board states that are likely to be
played by the human checkers champion
• Most theory of machine learning rests on the
assumption that the distribution of training examples is
identical to the distribution of test examples
(continued…)

In order to complete the design of the learning system, we must now choose
• The exact type of knowledge to be learned
• A representation for this target knowledge
• A learning mechanism

1.2.2 Choosing the Target Function
• Assume that we can determine the legal moves
• The program needs to learn the best move from among the legal moves
• This defines a large search space known a priori
• Target function: ChooseMove : B → M
• ChooseMove is difficult to learn given indirect training
• Alternative target function
• An evaluation function that assigns a numerical score to any given board state

• V : B → ℝ (where ℝ is the set of real numbers)

• V(b) for an arbitrary board state b in B:
• if b is a final board state that is won, then V(b) = 100
• if b is a final board state that is lost, then V(b) = -100
• if b is a final board state that is drawn, then V(b) = 0
• if b is not a final state, then V(b) = V(b'), where b' is the best final board state that can be achieved starting from b and playing optimally until the end of the game

(continued…)

• V(b) gives a recursive definition for board state b
• It is not usable because it is not efficient to compute, except in the first three trivial cases: it is a nonoperational definition
• The goal of learning is to discover an operational description of V
• Learning the target function is often called function approximation
• The function actually learned by the program is referred to as V̂
1.2.3 Choosing a Representation for the Target
Function
• The choice of representation involves trade-offs
• Pick a very expressive representation to allow a close approximation
to the ideal target function V
• The more expressive the representation, the more training data required
to choose among alternative hypotheses

• Use linear combination of the following board features:


• x1: the number of black pieces on the board
• x2: the number of red pieces on the board
• x3: the number of black kings on the board
• x4: the number of red kings on the board
• x5: the number of black pieces threatened by red (i.e. which can be captured on red's next turn)
• x6: the number of red pieces threatened by black
V̂(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6
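This linear evaluation function is straightforward to compute; a minimal sketch (the weight values below are purely illustrative, since the real weights are what the learner will tune):

```python
# Linear checkers evaluation V_hat(b) = w0 + w1*x1 + ... + w6*x6.
# The weight values are illustrative only; the learning algorithm
# described later is what actually sets them.
def evaluate(features, weights):
    w0, *ws = weights
    return w0 + sum(w * x for w, x in zip(ws, features))

# x1..x6: 3 black pieces, 0 red pieces, 1 black king, 0 red kings,
# no pieces threatened on either side.
features = [3, 0, 1, 0, 0, 0]
weights = [0.0, 1.0, -1.0, 2.0, -2.0, -0.5, 0.5]
print(evaluate(features, weights))  # 3*1.0 + 1*2.0 = 5.0
```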

Partial Design of Checkers Learning Program

• A checkers learning problem:
• Task T: playing checkers
• Performance measure P: percent of games won in the world tournament
• Training experience E: games played against itself
• Target function: V : Board → ℝ
• Target function representation:
V̂(b) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6
1.2.4 Choosing a Function Approximation
Algorithm
To learn V̂ we require a set of training examples, each describing a board state b and the training value Vtrain(b)

• Ordered pair <b, Vtrain(b)>, for example:

<<x1 = 3, x2 = 0, x3 = 1, x4 = 0, x5 = 0, x6 = 0>, +100>

1.2.4.1 Estimating Training Values
• Need to assign specific scores to intermediate
board states
• Approximate the training value of an intermediate board state b using
the learner's current approximation V̂ applied to the next board state
following b:
Vtrain(b) ← V̂(Successor(b))    … (1.1)
• A simple and successful approach
• More accurate for states closer to end states
Adjusting the Weights
• Choose the weights wi to best fit the set of training examples
• Minimize the squared error E between the training values and the values
predicted by the hypothesis:
E ≡ Σ<b, Vtrain(b)> (Vtrain(b) - V̂(b))²

• Require an algorithm that


• will incrementally refine weights as new training examples become
available
• will be robust to errors in these estimated training values
• Least Mean Squares (LMS) is one such algorithm
LMS Weight Update Rule

For each training example <b, Vtrain(b)>:

• Use the current weights to calculate V̂(b)
• For each weight wi, update it as
wi ← wi + η (Vtrain(b) - V̂(b)) xi
where η is a small constant (e.g., 0.1)
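The update rule above can be sketched directly; a minimal version, with a hypothetical x0 = 1 feature prepended so that w0 is handled uniformly, and η = 0.1 as suggested:

```python
# One step of the LMS weight-update rule from the slide.
# The feature vector is prefixed with x0 = 1 so the bias weight w0
# is updated like every other weight; eta is the learning rate.
def lms_update(weights, features, v_train, eta=0.1):
    """w_i <- w_i + eta * (Vtrain(b) - Vhat(b)) * x_i"""
    x = [1.0] + list(features)                       # x0 = 1 handles w0
    v_hat = sum(w * xi for w, xi in zip(weights, x))
    error = v_train - v_hat
    return [w + eta * error * xi for w, xi in zip(weights, x)]

w = [0.0] * 7                                        # w0..w6 start at zero
w = lms_update(w, [3, 0, 1, 0, 0, 0], v_train=100)   # won final state, +100
# v_hat was 0, so the error is 100 and w moves toward the target:
# roughly [10, 30, 0, 10, 0, 0, 0]
```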

1.2.5 Final Design

Figure 1.1 Final design of the checkers learning program

Figure 1.2 Summary of choices in designing the checkers learning program
1.3 Perspectives in Machine Learning
• One useful perspective on machine learning is that it involves searching a
very large space of possible hypotheses to determine one that best fits the
observed data and any prior knowledge held by the learner.
• For example, consider the space of hypotheses that could in principle be
output by the above checkers learner. This hypothesis space consists of all
evaluation functions that can be represented by some choice of values for
the weights w0 through w6. The learner's task is thus to search through this
vast space to locate the hypothesis that is most consistent with
• The available training examples. The LMS algorithm for fitting weights
achieves this goal by iteratively tuning the weights, adding a correction to
each weight each time the hypothesized evaluation function predicts a
value that differs from the training value. This algorithm works well when
the hypothesis representation considered by the learner defines a
continuously parameterized space of potential hypotheses.

1.3.1 Issues in Machine Learning
• What algorithms exist for learning general target functions from specific training
examples? In what settings will particular algorithms converge to the desired function,
given sufficient training data? Which algorithms perform best for which types of
problems and representations?
• How much training data is sufficient? What general bounds can be found to relate the
confidence in learned hypotheses to the amount of training experience and the
character of the learner's hypothesis space?
• When and how can prior knowledge held by the learner guide the process of
generalizing from examples? Can prior knowledge be helpful even when it is only
approximately correct?
• What is the best strategy for choosing a useful next training experience, and how does
the choice of this strategy alter the complexity of the learning problem?
• What is the best way to reduce the learning task to one or more function approximation
problems? Put another way, what specific functions should the system attempt to
learn? Can this process itself be automated?
• How can the learner automatically alter its representation to improve its ability to
represent and learn the target function?

Tutorials
1. Define Machine Learning. Discuss with examples why Machine Learning
is important.
2. Discuss with examples some useful applications of machine learning.
3. Explain how some areas/disciplines have influenced the Machine learning.
4. Define Learning Program for a given Problem. Describe the following
problems with respect to Tasks, Performance and Experience:
1. Checkers Learning Problems
2. Handwritten Recognition Problem
3. Robot Driving Learning Problem
5. Describe in detail all the steps involved in designing a Learning System.
6. Discuss the Perspective and Issues in Machine Learning.

Module 1 Chapter 2

Syllabus
2.Concept Learning :
• Introduction
• Concept Learning Task
• Concept Learning as Search
• Find-S Algorithm
• Version Space
• Candidate Elimination Algorithm
• Inductive Bias
2.1 Introduction
A Concept is a subset of objects or events defined over a larger set
[Example: The concept of a bird is the subset of all objects (i.e., the set of
all things or all animals) that belong to the category of bird.]

Alternatively, a concept is a boolean-valued function defined over this larger set [Example: a function defined over all animals, whose value is true for birds and false for every other animal].

What is Concept-Learning?

• Given a set of examples labeled as members or non-members of a concept, concept learning consists of automatically inferring the general definition of this concept.
• In other words, concept learning consists of approximating a boolean-valued function from training examples of its input and output.

2.2 A CONCEPT LEARNING TASK

• Consider the example task of learning the target concept "days on which my friend Aldo enjoys his favorite water sport."
• The table describes a set of example days, each represented by a set of attributes.
• The attribute EnjoySport indicates whether or not Aldo enjoys his favorite water sport on this day.
• The task is to learn to predict the value of EnjoySport for an arbitrary day, based on the values of its other attributes.

Continued…

Table 2.1 Positive & negative training examples for the target concept EnjoySport.

• Chosen hypothesis representation:
• A conjunction of constraints on each attribute, where:
• "?" means "any value is acceptable"
• a single specific value (e.g., "Warm") means exactly that value is required
• "0" means "no value is acceptable"

Continued…

• Example of a hypothesis: <?, Cold, High, ?, ?, ?> (if the air temperature is cold and the humidity high, then it is a good day for water sports)
• Goal: to infer the "best" concept description from the set of all possible hypotheses ("best" means "the one that best generalizes to all (known or unknown) elements of the instance space"; concept learning is an ill-defined task)
• The most general hypothesis, that every day is a positive example (a good day for water sports), is represented by <?, ?, ?, ?, ?, ?>
• The most specific possible hypothesis, that no day is a positive example, is represented by <0, 0, 0, 0, 0, 0>

2.2.1 Terminology and Notation

• The set of items over which the concept is defined is called the set of instances (denoted by X).
• The concept to be learned is called the target concept (denoted by c: X → {0,1}).
• The set of training examples is a set of instances, x, along with their target concept values c(x).
• Members of the concept (instances for which c(x) = 1) are called positive examples.
• Nonmembers of the concept (instances for which c(x) = 0) are called negative examples.
• H represents the set of all possible hypotheses. H is determined by the human designer's choice of a hypothesis representation.
• The goal of concept learning is to find a hypothesis h: X → {0,1} such that h(x) = c(x) for all x in X.

Table 2.2 The EnjoySport concept learning task.

Example of a Concept Learning

• Concept: Good Days for Water Sports (values: Yes, No)
• Attributes/Features:
• Sky (values: Sunny, Cloudy, Rainy)
• AirTemp (values: Warm, Cold)
• Humidity (values: Normal, High)
• Wind (values: Strong, Weak)
• Water (values: Warm, Cool)
• Forecast (values: Same, Change)
• Example of a training point:
<Sunny, Warm, High, Strong, Warm, Same, Yes>

2.2.2 The Inductive Learning Hypothesis

The inductive learning hypothesis: any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples.

Number of Instances, Concepts, Hypotheses

• Sky: Sunny, Cloudy, Rainy
• AirTemp: Warm, Cold
• Humidity: Normal, High
• Wind: Strong, Weak
• Water: Warm, Cold
• Forecast: Same, Change

#distinct instances: 3*2*2*2*2*2 = 96
#distinct concepts: 2^96
#syntactically distinct hypotheses: 5*4*4*4*4*4 = 5120
#semantically distinct hypotheses: 1 + 4*3*3*3*3*3 = 973
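These counts can be checked mechanically; a small sketch of the arithmetic above:

```python
# Quick check of the instance and hypothesis counts quoted above.
values = [3, 2, 2, 2, 2, 2]    # number of values per attribute

instances = 1
syntactic = 1                   # per attribute: values + "?" + "0"
semantic = 1                    # per attribute: values + "?" (the single
for v in values:                # all-"0" hypothesis is added at the end)
    instances *= v
    syntactic *= v + 2
    semantic *= v + 1
semantic += 1

print(instances)   # 96
print(syntactic)   # 5120
print(semantic)    # 973
# distinct concepts: 2 ** instances, i.e. 2^96
```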

2.3 CONCEPT LEARNING AS SEARCH

• Concept learning can be viewed as the task of searching through a large space of hypotheses implicitly defined by the hypothesis representation.
• Selecting a hypothesis representation is an important step, since it restricts (or biases) the space that can be searched. [For example, the hypothesis "if the air temperature is cold or the humidity high, then it is a good day for water sports" cannot be expressed in our chosen representation.]

Definition: Let hj and hk be boolean-valued functions defined over X.
Then hj is more-general-than-or-equal-to hk iff for all x in X,
[(hk(x) = 1) → (hj(x) = 1)]

Example:
• h1 = <Sunny, ?, ?, Strong, ?, ?>
• h2 = <Sunny, ?, ?, ?, ?, ?>

Every instance that is classified as positive by h1 will also be classified as positive by h2 in our example data set. Therefore h2 is more general than h1.
• We also use the ideas of "strictly" more-general-than, and more-specific-than (illustration [Mitchell, p. 25]).
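The relation can be tested attribute by attribute; a sketch for this conjunctive representation (the helper name is ours, and the empty constraint "0" is not handled):

```python
# More-general-than-or-equal-to test for conjunctive hypotheses:
# "?" matches any value; a literal matches only itself.
# (The empty constraint "0" is not handled in this sketch.)
def more_general_or_equal(hj, hk):
    """True iff every instance satisfying hk also satisfies hj."""
    for a, b in zip(hj, hk):
        if a != "?" and a != b:   # hj requires a value hk does not force
            return False
    return True

h1 = ("Sunny", "?", "?", "Strong", "?", "?")
h2 = ("Sunny", "?", "?", "?", "?", "?")
print(more_general_or_equal(h2, h1))  # True: h2 is more general than h1
print(more_general_or_equal(h1, h2))  # False
```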

Instance, Hypotheses and "more general"
2.3 Find-S, a Maximally Specific Hypothesis Learning Algorithm
• Problem 1: Implement and demonstrate the FIND-S algorithm for finding the most specific hypothesis based on a given set of training data samples. Read the training data from a .CSV file.
Algorithm:
1. Initialize h to the most specific hypothesis in H
2. For each positive training instance x
   For each attribute constraint ai in h
     If the constraint ai in h is satisfied by x, then do nothing
     else replace ai in h by the next more general constraint that is satisfied by x
3. Output hypothesis h
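The algorithm above can be sketched as follows; the EnjoySport rows are the standard training data from the table, while the CSV file name and layout (attribute columns followed by a Yes/No label) are assumptions:

```python
# A minimal Find-S sketch for the EnjoySport data.
def find_s(examples):
    """Most specific conjunctive hypothesis consistent with the positives."""
    h = None
    for *attrs, label in examples:
        if label != "Yes":
            continue                  # Find-S ignores negative examples
        if h is None:
            h = list(attrs)           # first positive example: h = x
        else:                         # generalize only where h disagrees with x
            h = [a if a == b else "?" for a, b in zip(h, attrs)]
    return h

data = [
    ["Sunny", "Warm", "Normal", "Strong", "Warm", "Same", "Yes"],
    ["Sunny", "Warm", "High", "Strong", "Warm", "Same", "Yes"],
    ["Rainy", "Cold", "High", "Strong", "Warm", "Change", "No"],
    ["Sunny", "Warm", "High", "Strong", "Cool", "Change", "Yes"],
]
print(find_s(data))   # ['Sunny', 'Warm', '?', 'Strong', '?', '?']

# To read the training data from a .CSV file as the problem asks:
# import csv
# with open("enjoysport.csv") as f:
#     print(find_s(list(csv.reader(f))))
```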
Illustration:
Step 1: Find-S
Step 2: Find-S
Iteration 3 and Step 4: Find-S
Properties of Find-S

• The hypothesis space is described by conjunctions of attributes
• Find-S will output the most specific hypothesis within H that is consistent with the positive training examples
• The output hypothesis will also be consistent with the negative examples, provided the target concept is contained in H

Complaints about Find-S

• Can't tell whether the learner has converged to the target concept, in the sense that it is unable to determine whether it has found the only hypothesis consistent with the training examples.
• Can't tell when the training data is inconsistent, as it ignores negative training examples.
• Why prefer the most specific hypothesis?
• What if there are multiple maximally specific hypotheses?
2.4 Version Spaces

• A hypothesis h is consistent with a set of training examples D of target concept c if and only if h(x) = c(x) for each training example <x, c(x)> in D.
Consistent(h, D) ≡ (∀<x, c(x)> ∈ D) h(x) = c(x)
• The version space, VS_H,D, with respect to hypothesis space H and training set D, is the subset of hypotheses from H consistent with all training examples:
VS_H,D = {h ∈ H | Consistent(h, D)}

List-Then Eliminate Algorithm

1. VersionSpace ← a list containing every hypothesis in H
2. For each training example <x, c(x)>, remove from VersionSpace any hypothesis h that is inconsistent with the training example, i.e., h(x) ≠ c(x)
3. Output the list of hypotheses in VersionSpace
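Because the EnjoySport hypothesis space is small (under a thousand semantically distinct conjunctions), List-Then-Eliminate can be run by brute force; a sketch, with helper names of our choosing:

```python
# Brute-force List-Then-Eliminate over the EnjoySport conjunctive
# hypothesis space (attribute domains as in the slides).
from itertools import product

domains = [("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"),
           ("Normal", "High"), ("Strong", "Weak"),
           ("Warm", "Cool"), ("Same", "Change")]

def matches(h, x):
    return all(a == "?" or a == v for a, v in zip(h, x))

def list_then_eliminate(examples):
    # 1. Start from every (non-empty) conjunctive hypothesis in H.
    space = list(product(*[d + ("?",) for d in domains]))
    # 2. Eliminate any hypothesis inconsistent with a training example.
    for *x, label in examples:
        want = (label == "Yes")
        space = [h for h in space if matches(h, x) == want]
    # 3. What remains is the version space.
    return space

data = [
    ["Sunny", "Warm", "Normal", "Strong", "Warm", "Same", "Yes"],
    ["Sunny", "Warm", "High", "Strong", "Warm", "Same", "Yes"],
    ["Rainy", "Cold", "High", "Strong", "Warm", "Change", "No"],
    ["Sunny", "Warm", "High", "Strong", "Cool", "Change", "Yes"],
]
vs = list_then_eliminate(data)
print(len(vs))                                            # 6
print(("Sunny", "Warm", "?", "Strong", "?", "?") in vs)   # True
```

The six surviving hypotheses are exactly the version space that Candidate-Elimination derives for this data in the next section, without enumerating H.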

Example Version Space

Representing Version Spaces
• The general boundary, G, of version space VS_H,D is the set of its maximally general members.
• The specific boundary, S, of version space VS_H,D is the set of its maximally specific members.
• Every member of the version space lies between these boundaries:
VS_H,D = {h ∈ H | (∃s ∈ S)(∃g ∈ G)(g ≥ h ≥ s)}
where x ≥ y means x is more general than or equal to y

2.5 Candidate Elimination Algorithm
• The CANDIDATE-ELIMINATION algorithm computes the version space containing all hypotheses from H that are consistent with an observed sequence of training examples.
• It begins by initializing the version space to the set of all hypotheses in H; that is, by initializing the G boundary set to contain the most general hypothesis in H
G0 ← {<?, ?, ?, ?, ?, ?>}
• and initializing the S boundary set to contain the most specific (least general) hypothesis
S0 ← {<ø, ø, ø, ø, ø, ø>}
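A compact sketch of the algorithm for this conjunctive representation follows; the helper functions are ours, and boundary maintenance is simplified to what a singleton S boundary needs:

```python
# Candidate-Elimination sketch for the conjunctive EnjoySport
# representation ("?" = any value, "0" = no value).
def matches(h, x):
    return all(a == "?" or a == v for a, v in zip(h, x))

def more_general(h1, h2):
    # h1 >= h2, assuming neither side mixes "0" with literals
    return all(a == "?" or a == b for a, b in zip(h1, h2))

def candidate_elimination(examples, domains):
    n = len(domains)
    S = [["0"] * n]            # most specific boundary S0
    G = [["?"] * n]            # most general boundary G0
    for *x, label in examples:
        if label == "Yes":
            G = [g for g in G if matches(g, x)]       # drop inconsistent g
            s = S[0]
            # minimally generalize s so that it covers x
            S = [[v if a in ("0", v) else "?" for a, v in zip(s, x)]]
        else:
            S = [s for s in S if not matches(s, x)]   # drop inconsistent s
            new_G = []
            for g in G:
                if not matches(g, x):
                    new_G.append(g)
                    continue
                for i in range(n):                    # minimal specializations
                    if g[i] != "?":
                        continue
                    for v in domains[i]:
                        h = g[:i] + [v] + g[i + 1:]
                        if v != x[i] and any(more_general(h, s) for s in S):
                            new_G.append(h)
            G = new_G
    return S, G

domains = [("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"),
           ("Normal", "High"), ("Strong", "Weak"),
           ("Warm", "Cool"), ("Same", "Change")]
data = [
    ["Sunny", "Warm", "Normal", "Strong", "Warm", "Same", "Yes"],
    ["Sunny", "Warm", "High", "Strong", "Warm", "Same", "Yes"],
    ["Rainy", "Cold", "High", "Strong", "Warm", "Change", "No"],
    ["Sunny", "Warm", "High", "Strong", "Cool", "Change", "Yes"],
]
S, G = candidate_elimination(data, domains)
print(S)  # [['Sunny', 'Warm', '?', 'Strong', '?', '?']]
print(G)  # [['Sunny', '?', '?', '?', '?', '?'], ['?', 'Warm', '?', '?', '?', '?']]
```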

Candidate Elimination Algorithm

Trace - 1
Trace - 2
Trace - 3
Final Version Space
REMARKS ON VERSION SPACES
AND CANDIDATE-ELIMINATION

1. Will the CANDIDATE-ELIMINATION algorithm converge to the correct hypothesis?
2. What training example should the learner request next?
3. How can partially learned concepts be used?
2.6 Inductive Bias
Inductive Bias I: A Biased Hypothesis Space
Day Sky AirTemp Humidity Wind Water Forecast WaterSport
1 Sunny Warm Normal Strong Cool Change Yes
2 Cloudy Warm Normal Strong Cool Change Yes
3 Rainy Warm Normal Strong Cool Change No

Given our previous choice of the hypothesis-space representation, no hypothesis is consistent with the above database: we have BIASED the learner to consider only conjunctive hypotheses.

Inductive Bias II: An Unbiased
Learner
• Idea: choose H that expresses every teachable concept; that means H is the set of all possible subsets of X, called the power set P(X)
• |X| = 96, |P(X)| = 2^96 ≈ 10^28 distinct concepts
• H = disjunctions, conjunctions, negations
• e.g. <Sunny, Warm, Normal, ?, ?, ?> v <?, ?, ?, ?, ?, Change>
• H surely contains the target concept

Inductive Bias II: An Unbiased
Learner
• In order to solve the problem caused by the bias of the hypothesis space, we can remove this bias and allow the hypotheses to represent every possible subset of instances. The previous database could then be expressed as:
<Sunny, ?, ?, ?, ?, ?> v <Cloudy, ?, ?, ?, ?, ?>
• However, such an unbiased learner is not able to generalize beyond the observed examples! Every non-observed example will be classified positive by half the hypotheses of the version space and negative by the other half.

Inductive Bias II: An Unbiased
Learner
What are S and G in this case?
Assume positive examples (x1, x2, x3) and negative examples (x4, x5):
S: { (x1 ∨ x2 ∨ x3) }    G: { ¬(x4 ∨ x5) }
The only examples that can be classified are the training examples themselves. In other words, in order to learn the target concept one would have to present every single instance in X as a training example.
Each unobserved instance will be classified positive by precisely half the hypotheses in VS and negative by the other half.
Inductive Bias III: The Futility of
Bias-Free Learning
• Fundamental property of inductive learning: a learner that makes no a priori assumptions regarding the identity of the target concept has no rational basis for classifying any unseen instances.
• We constantly have recourse to inductive biases. Example: we all know that the sun will rise tomorrow. Although we cannot deduce that it will do so based on the fact that it rose today, yesterday, the day before, etc., we take this leap of faith, i.e., use this inductive bias, naturally!
Inductive Bias
Consider:
• Concept learning algorithm L
• Instances X, target concept c
• Training examples Dc = {<x, c(x)>}
• Let L(xi, Dc) denote the classification assigned to instance xi by L after training on Dc.
Definition:
• The inductive bias of L is any minimal set of assertions B such that for any target concept c and corresponding training data Dc
(∀xi ∈ X) [(B ∧ Dc ∧ xi) ⊢ L(xi, Dc)]
• where A ⊢ B means that A logically entails B.
• Inductive bias of the CANDIDATE-ELIMINATION algorithm: the target concept c is contained in the given hypothesis space H.

Inductive Systems and Equivalent
Deductive Systems

Ranking Inductive Learners according
to their Biases
• Rote-Learner: this system simply memorizes the training data and their classifications; no generalization is involved.
• Candidate-Elimination: new instances are classified only if all the hypotheses in the version space agree on the classification.
• Find-S: new instances are classified using the most specific hypothesis consistent with the training data.
