What Is Learning?: CS 391L: Machine Learning
Raymond J. Mooney
University of Texas at Austin
Why Study Machine Learning? Cognitive Science
• Computational studies of learning may help us understand learning in humans and other biological organisms.
– Hebbian neural learning
• “Neurons that fire together, wire together.”
– Humans’ relative difficulty of learning disjunctive concepts vs. conjunctive ones.
– Power law of practice
[Figure: power law of practice; log(perf. time) decreases linearly with log(# training trials).]

Why Study Machine Learning? The Time is Ripe
• Many basic effective and efficient algorithms available.
• Large amounts of on-line data available.
• Large amounts of computational resources available.
[Figure: architecture of a learning system; Environment/Experience provides input to the learner, which produces Knowledge used by the Performance Element.]
Training Experience
• Direct experience: Given sample input and output pairs for a useful target function.
– Checker boards labeled with the correct move, e.g. extracted from records of expert play.
• Indirect experience: Given feedback that is not direct I/O pairs for a useful target function.
– Potentially arbitrary sequences of game moves and their final game results.
• Credit/Blame Assignment Problem: How to assign credit or blame to individual moves given only indirect feedback?

Source of Training Data
• Provided random examples outside of the learner’s control.
– Negative examples available, or only positive ones?
• Good training examples selected by a “benevolent teacher.”
– “Near miss” examples
• Learner can query an oracle about the class of an unlabeled example in the environment.
• Learner can construct an arbitrary example and query an oracle for its label.
• Learner can design and run experiments directly in the environment without any human guidance.
Training vs. Test Distribution
• Generally assume that the training and test examples are independently drawn from the same overall distribution of data.
– IID: independently and identically distributed
• If examples are not independent, collective classification is required.
• If the test distribution is different, transfer learning is required.

Choosing a Target Function
• What function is to be learned, and how will it be used by the performance system?
• For checkers, assume we are given a function for generating the legal moves for a given board position and want to decide the best move.
– Could learn a function: ChooseMove(board, legal-moves) → best-move
– Or could learn an evaluation function, V(board) → R, that gives each board position a score for how favorable it is. V can be used to pick a move by applying each legal move, scoring the resulting board position, and choosing the move that results in the highest-scoring board position (see the sketch below).
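A minimal Python sketch of the evaluation-function approach, assuming hypothetical helpers legal_moves(board) and apply_move(board, move) and a learned scorer V(board); none of these names come from the slides:

def choose_move(board, legal_moves, apply_move, V):
    # Score the board that results from each legal move and pick the best one.
    return max(legal_moves(board),
               key=lambda move: V(apply_move(board, move)))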
Representing the Target Function
• The target function can be represented in many ways: lookup table, symbolic rules, numerical function, neural network.
• There is a trade-off between the expressiveness of a representation and the ease of learning.
• The more expressive a representation, the better it will be at approximating an arbitrary function; however, more examples will be needed to learn an accurate function.

Linear Function for Representing V(b)
• In checkers, use a linear approximation of the evaluation function (see the sketch after this list):
V̂(b) = w0 + w1·bp(b) + w2·rp(b) + w3·bk(b) + w4·rk(b) + w5·bt(b) + w6·rt(b)
– bp(b): number of black pieces on board b
– rp(b): number of red pieces on board b
– bk(b): number of black kings on board b
– rk(b): number of red kings on board b
– bt(b): number of black pieces threatened (i.e., which can be immediately taken by red on its next turn)
– rt(b): number of red pieces threatened
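A short Python sketch of this linear form, assuming a hypothetical features(board) helper that returns the six counts (bp, rp, bk, rk, bt, rt):

def v_hat(board, weights, features):
    # Linear evaluation: w0 plus the weighted sum of the six board features.
    f = features(board)  # (bp, rp, bk, rk, bt, rt)
    return weights[0] + sum(w * x for w, x in zip(weights[1:], f))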
Obtaining Training Values
• Direct supervision may be available for the target function.
– e.g., <<bp=3, rp=0, bk=1, rk=0, bt=0, rt=0>, 100> (win for black)
• With indirect feedback, training values can be estimated using temporal difference learning (used in reinforcement learning, where supervision is delayed reward).

Temporal Difference Learning
• Estimate training values for intermediate (non-terminal) board positions by the estimated value of their successor in an actual game trace:
Vtrain(b) = V̂(successor(b))
where successor(b) is the next board position where it is the program’s move in actual play.
• Values towards the end of the game are initially more accurate, and continued training slowly “backs up” accurate values to earlier board positions (see the sketch below).
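A sketch of estimating training values from a single game trace under these assumptions; trace is a hypothetical list of board positions at which it is the program’s move, and outcome is the final game value (e.g., 100 for a win for black):

def td_training_values(trace, outcome, v_hat, weights, features):
    # Each non-terminal position is assigned the estimated value of its
    # successor; the terminal position gets the true outcome.
    targets = []
    for i, board in enumerate(trace):
        if i + 1 < len(trace):
            targets.append(v_hat(trace[i + 1], weights, features))  # Vtrain(b) = V̂(successor(b))
        else:
            targets.append(outcome)
    return targets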
Learning Algorithm
• Use the training values to adjust the weights so as to minimize the mean squared error over the set of training examples B:
E = Σb∈B [Vtrain(b) − V̂(b)]² / |B|

Least Mean Squares (LMS) Weight Update
• For each training example b, compute error(b) = Vtrain(b) − V̂(b) and update each weight (a sketch follows):
wi ← wi + c · fi · error(b)
for some small constant (learning rate) c, where fi is the value of the i-th board feature.
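A sketch of one LMS pass over the training data, continuing the hypothetical helpers above; examples is a list of (board, training_value) pairs, c is the learning rate, and the pass would be repeated until the weights converge:

def lms_update(weights, examples, features, c=0.1):
    # One incremental pass: nudge each weight toward reducing the current error.
    for board, v_train in examples:
        f = (1,) + tuple(features(board))  # prepend 1 so weights[0] acts as w0
        error = v_train - sum(w * x for w, x in zip(weights, f))  # Vtrain(b) − V̂(b)
        weights = [w + c * x * error for w, x in zip(weights, f)]
    return weights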
LMS Discussion
• Intuitively, LMS executes the following rules:
– If the output for an example is correct, make no change.
– If the output is too high, lower the weights in proportion to the values of their corresponding features, so the overall output decreases.
– If the output is too low, increase the weights in proportion to the values of their corresponding features, so the overall output increases.
• Under the proper weak assumptions, LMS can be proven to eventually converge to a set of weights that minimizes the mean squared error.

Lessons Learned about Learning
• Learning can be viewed as using direct or indirect experience to approximate a chosen target function.
• Function approximation can be viewed as a search through a space of hypotheses (representations of functions) for one that best fits a set of training data.
• Different learning methods assume different hypothesis spaces (representation languages) and/or employ different search techniques.
History of Machine Learning (cont.)
• 1980s:
– Advanced decision tree and rule learning
– Explanation-based Learning (EBL)
– Learning and planning and problem solving
– Utility problem
– Analogy
– Cognitive architectures
– Resurgence of neural networks (connectionism, backpropagation)
– Valiant’s PAC Learning Theory
– Focus on experimental methodology
• 1990s:
– Data mining
– Adaptive software agents and web applications
– Text learning
– Reinforcement learning (RL)
– Inductive Logic Programming (ILP)
– Ensembles: Bagging, Boosting, and Stacking
– Bayes Net learning

History of Machine Learning (cont.)
• 2000s:
– Support vector machines
– Kernel methods
– Graphical models
– Statistical relational learning
– Transfer learning
– Sequence labeling
– Collective classification and structured outputs
– Computer Systems Applications
• Compilers
• Debugging
• Graphics
• Security (intrusion, virus, and worm detection)
– Email management
– Personalized assistants that learn
– Learning in robotics and vision