Machine Learning Lecture

History of Machine Learning

• 1960’s and 70’s: Models of human learning


– High-level symbolic descriptions of knowledge, e.g., logical expressions or graphs/networks (Karpinski & Michalski, 1966; Simon & Lea, 1974).
– Winston’s (1975) structural learning system learned logic-based structural descriptions from examples.

• Minsky & Papert (1969): Perceptrons, which showed the limitations of single-layer perceptrons.


• 1970’s: Genetic algorithms
– Developed by Holland (1975)
• 1970’s - present: Knowledge-intensive learning
– A tabula rasa approach typically fares poorly. “To acquire new
knowledge a system must already possess a great deal of initial
knowledge.” Lenat’s CYC project is a good example.
History of Machine Learning (cont’d)
• 1970’s - present: Alternative modes of learning (besides
examples)
– Learning from instruction, e.g., (Mostow, 1983) (Gordon &
Subramanian, 1993)
– Learning by analogy, e.g., (Veloso, 1990)
– Learning from cases, e.g., (Aha, 1991)
– Discovery (Lenat, 1977)
– 1991: The first of a series of workshops on Multistrategy Learning
(Michalski)
• 1970’s – present: Meta-learning
– Heuristics for focusing attention, e.g., (Gordon & Subramanian, 1996)
– Active selection of examples for learning, e.g., (Angluin, 1987),
(Gasarch & Smith, 1988), (Gordon, 1991)
– Learning how to learn, e.g., (Schmidhuber, 1996)
History of Machine Learning (cont’d)

• 1980 – The First Machine Learning Workshop was held at Carnegie-Mellon University in Pittsburgh.
• 1980 – Three consecutive issues of the International Journal of Policy
Analysis and Information Systems were specially devoted to machine
learning.
• 1981 – Hinton, Jordan, Sejnowski, Rumelhart, McClelland at UCSD
– The backpropagation algorithm; the PDP book.
• 1986 – The establishment of the Machine Learning journal.
• 1987 – The beginning of annual international conferences on machine learning (ICML); also the Snowbird ML conference.
• 1988 – The beginning of regular workshops on computational learning
theory (COLT).
• 1990’s – Explosive growth in the field of data mining, which involves the
application of machine learning techniques.
Bottom line from History

• 1960 – The Perceptron (Rosenblatt, 1958; its limitations shown by Minsky & Papert, 1969)

• 1960 – Bellman’s “curse of dimensionality”

• 1980 – Bounds on statistical estimators (C. Stone)

• 1990 – Beginning of high-dimensional data (hundreds of variables)

• 2000 – High-dimensional data (thousands of variables)


A Glimpse into the Future

• Today’s status:
– First-generation algorithms: neural nets, decision trees, etc.

• Future:
– Smart remote controls, phones, cars
– Data and communication networks, software
Types of models
• Supervised learning
– Given access to classified data
• Unsupervised learning
– Given access to data, but no classification
– Important for data reduction
• Control learning
– Selects actions and observes
consequences.
– Maximizes long-term cumulative return.
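
To make the distinction concrete, here is a minimal sketch contrasting supervised and unsupervised learning; it assumes scikit-learn, and the data is synthetic and purely illustrative (neither choice comes from the lecture itself).

```python
# A minimal sketch contrasting supervised and unsupervised learning.
# Assumes scikit-learn; the data is synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))               # 100 examples, 2 features
y = (X[:, 0] + X[:, 1] > 0).astype(int)     # class labels

# Supervised: the learner is given classified data (X, y).
clf = LogisticRegression().fit(X, y)
print("supervised training accuracy:", clf.score(X, y))

# Unsupervised: the learner sees only X; clustering compresses the
# data into two groups (useful for data reduction).
km = KMeans(n_clusters=2, n_init=10).fit(X)
print("cluster sizes:", np.bincount(km.labels_))
```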
Some Issues in Machine Learning

• What algorithms can approximate functions well, and when?
• How does number of training examples
influence accuracy?
• How does the complexity of the hypothesis representation impact accuracy?
• How does noisy data influence accuracy?
More Issues in Machine Learning

• What are the theoretical limits of learnability?


• How can prior knowledge of learner help?
• What clues can we get from biological learning
systems?

• How can systems alter their own representations?
Complexity vs. Generalization

• Hypothesis complexity versus observed error.


• More complex hypotheses have lower observed error on the training set, but might have higher true error (on the test set); see the sketch below.
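
The trade-off can be seen in a few lines of code: the sketch below (pure numpy, synthetic data, both illustrative assumptions) fits polynomials of increasing degree and prints training versus test error; training error keeps falling while test error eventually rises.

```python
# Sketch: more complex hypotheses drive observed (training) error down,
# while true (test) error eventually rises. Pure numpy; synthetic data.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 60)
y = np.sin(3 * x) + rng.normal(scale=0.3, size=60)   # noisy target
x_tr, y_tr, x_te, y_te = x[:30], y[:30], x[30:], y[30:]

for degree in (1, 3, 9, 15):                         # hypothesis complexity
    coeffs = np.polyfit(x_tr, y_tr, degree)          # least-squares fit
    err_tr = np.mean((np.polyval(coeffs, x_tr) - y_tr) ** 2)
    err_te = np.mean((np.polyval(coeffs, x_te) - y_te) ** 2)
    print(f"degree {degree:2d}: train MSE {err_tr:.3f}, test MSE {err_te:.3f}")
```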
Nearest Neighbor Methods

Classify using nearby examples.

Assume a “structured space” and a “metric”.

[Figure: a query point “?” surrounded by labeled + and − training examples.]
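
A minimal sketch of the idea, assuming a Euclidean metric and a majority vote over the k nearest examples (both standard choices, not specified in the slide):

```python
# Minimal nearest-neighbor sketch: classify a query point by a majority
# vote among its k nearest training examples under a Euclidean metric.
import numpy as np

def knn_predict(X_train, y_train, x_query, k=3):
    dists = np.linalg.norm(X_train - x_query, axis=1)   # the "metric"
    nearest = np.argsort(dists)[:k]                     # k closest examples
    return 1 if y_train[nearest].sum() > 0 else -1      # majority vote (±1 labels)

X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]])
y = np.array([-1, -1, 1, 1])
print(knn_predict(X, y, np.array([0.8, 0.9])))          # -> 1
```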
Separating Hyperplane

Perceptron: output = sign(Σᵢ xᵢwᵢ)

Goal: find the weights w₁ … wₙ.

Limited representation (only linearly separable concepts).

[Figure: a single unit computing sign(Σᵢ xᵢwᵢ), with weights w₁ … wₙ on inputs x₁ … xₙ.]
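
A sketch of the classic perceptron learning rule for finding w₁ … wₙ; the mistake-driven update and the folded-in bias term are standard choices, not details from the slide:

```python
# Sketch of the perceptron rule: predict sign(sum_i x_i * w_i) and nudge
# the weights on mistakes. The bias is folded in as a constant-1 feature.
import numpy as np

def perceptron(X, y, epochs=20):
    X = np.hstack([X, np.ones((len(X), 1))])   # constant-1 feature for bias
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if np.sign(xi @ w) != yi:          # mistake: move w toward yi * xi
                w += yi * xi
    return w

X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w = perceptron(X, y)
print(np.sign(np.hstack([X, np.ones((4, 1))]) @ w))    # should recover y
```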
Neural Networks

Sigmoidal gates: a = Σᵢ xᵢwᵢ and output = 1/(1 + e⁻ᵃ)

[Figure: a network of sigmoidal units over inputs x₁ … xₙ.]

Learning by “Back Propagation” of errors
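
A one-unit sketch of these formulas, with a single gradient step standing in for back propagation; the learning rate and squared-error loss are illustrative assumptions:

```python
# One sigmoidal gate: a = sum_i x_i * w_i, output = 1/(1 + e^-a), plus a
# single gradient step on squared error standing in for back propagation.
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
x = np.array([0.5, -1.0, 2.0])
target = 1.0
w = rng.normal(size=3)

out = sigmoid(x @ w)                            # forward pass
grad = (out - target) * out * (1 - out) * x     # chain rule for 0.5*(out-target)^2
w -= 0.5 * grad                                 # gradient-descent update (rate 0.5)
print("output before/after:", out, sigmoid(x @ w))
```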


Decision Trees

[Figure: an example decision tree. The root tests x₁ > 5; one branch is the leaf +1, the other tests x₆ > 2, whose branches are the leaves +1 and −1.]
Decision Trees

Top-down construction:

– Construct the tree greedily, using a local index function.
– Gini index: G(x) = x(1 − x); entropy H(x); etc. (sketched below)

Bottom-up model selection:

– Prune the decision tree while maintaining low observed error.
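
A sketch of the local index at work, assuming binary ±1 labels and a single numeric feature; the helper names are made up for illustration:

```python
# Sketch of the local index: Gini index G(p) = p(1 - p) on the fraction of
# positive labels, plus a greedy search for the best threshold split.
import numpy as np

def gini(labels):
    if len(labels) == 0:
        return 0.0
    p = np.mean(labels == 1)     # fraction of positive examples
    return p * (1 - p)           # G(p) = p(1 - p)

def best_split(x, y):
    best = (None, np.inf)
    for t in np.unique(x):       # try each observed value as a threshold
        left, right = y[x <= t], y[x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
        if score < best[1]:
            best = (t, score)
    return best                  # (threshold, weighted Gini after the split)

x = np.array([1, 2, 3, 6, 7, 8])
y = np.array([1, 1, 1, -1, -1, -1])
print(best_split(x, y))          # splits cleanly at x <= 3
```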
Decision Trees
• Limited Representation

• Highly interpretable

• Efficient training and retrieval algorithm

• Smart cost/complexity pruning

• Aim: Find a small decision tree with a low observed error.


Support Vector Machine

[Figure: data mapped from n dimensions to m dimensions, with m much larger than n.]
Support Vector Machine

Project the data to a high-dimensional space.

Use a hyperplane in the LARGE space.

Choose a hyperplane with a large MARGIN.
[Figure: a large-margin hyperplane separating + examples from − examples.]
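
A hedged illustration using scikit-learn’s SVC (the library choice and data are assumptions; the lecture names neither): the RBF kernel implicitly plays the role of the projection to a large space, and the classifier chooses a large-margin hyperplane there.

```python
# Sketch of the SVM idea with scikit-learn: an RBF kernel implicitly maps
# the data to a high-dimensional space, and SVC finds a large-margin
# hyperplane there. Synthetic data, not linearly separable in 2-D.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1).astype(int)   # circular class boundary

clf = SVC(kernel="rbf", C=1.0).fit(X, y)            # large margin, slack C
print("training accuracy:", clf.score(X, y))
print("support vectors per class:", clf.n_support_)
```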
Reinforcement Learning
• Main idea: Learning with a Delayed Reward

• Uses dynamic programming and supervised learning

• Addresses problems that cannot be addressed by regular supervised methods, e.g., control problems.

• Dynamic programming searches for optimal policies.
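
As a sketch of the dynamic-programming component, here is value iteration on a made-up 4-state chain with a single delayed reward at the goal; the states, reward, and discount factor are all illustrative assumptions.

```python
# Value iteration on a made-up 4-state chain with a delayed reward at the
# goal: dynamic programming searching for the optimal long-term return.
import numpy as np

n_states, gamma = 4, 0.9

def step(s, a):
    # actions: 0 = left, 1 = right; deterministic moves on the chain 0..3
    s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
    reward = 1.0 if s2 == n_states - 1 else 0.0   # reward only at the goal
    return s2, reward

V = np.zeros(n_states)
for _ in range(100):                              # Bellman backups
    V = np.array([max(r + gamma * V[s2]
                      for s2, r in (step(s, a) for a in (0, 1)))
                  for s in range(n_states)])
print(V)                                          # values grow toward the goal
```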


Genetic Programming

A search method. Example: searching over decision trees.

– Local mutation operations: change a node in a tree.
– Cross-over operations: replace a subtree by another tree.
– Keeps the “best” candidates: keep trees with low observed error (a toy sketch follows below).
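
A toy sketch of this loop over tiny threshold trees; the tree encoding, mutation, and crossover operators are simplified stand-ins for illustration, not the lecture’s definitions:

```python
# Toy genetic-programming sketch over tiny threshold trees: mutate a node,
# cross over subtrees, keep candidates with low observed error.
import random

random.seed(0)
X = [[1, 8], [2, 7], [7, 2], [8, 1]]
y = [1, 1, -1, -1]

def random_tree(depth=2):
    if depth == 0 or random.random() < 0.3:
        return random.choice([-1, 1])                        # leaf: a label
    return (random.randrange(2), random.uniform(0, 9),       # (feature, threshold,
            random_tree(depth - 1), random_tree(depth - 1))  #  left, right)

def predict(tree, x):
    while isinstance(tree, tuple):
        f, thr, left, right = tree
        tree = left if x[f] <= thr else right
    return tree

def error(tree):                       # observed error on the training set
    return sum(predict(tree, xi) != yi for xi, yi in zip(X, y))

def mutate(tree):                      # crude mutation: regrow a subtree
    return random_tree()

def crossover(a, b):                   # replace a subtree of a by one from b
    if not isinstance(a, tuple):
        return a
    f, thr, left, right = a
    donor = b[2] if isinstance(b, tuple) else b
    return (f, thr, donor, right) if random.random() < 0.5 else (f, thr, left, donor)

pop = [random_tree() for _ in range(30)]
for _ in range(25):
    pop.sort(key=error)
    best = pop[:10]                    # keep the "best" candidates
    pop = (best
           + [crossover(random.choice(best), random.choice(best)) for _ in range(10)]
           + [mutate(random.choice(best)) for _ in range(10)])
print("best observed error:", error(min(pop, key=error)))
```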
Unsupervised learning: Clustering
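
As a concrete example of clustering, here is a minimal sketch of k-means (Lloyd’s algorithm), a standard method, on synthetic two-cluster data; the lecture does not single out any particular algorithm.

```python
# Minimal k-means (Lloyd's algorithm) sketch on synthetic two-cluster data.
import numpy as np

rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 0.5, (50, 2)),   # cluster near (0, 0)
               rng.normal(4, 0.5, (50, 2))])  # cluster near (4, 4)

k = 2
centers = X[rng.choice(len(X), k, replace=False)]   # random initial centers
for _ in range(10):
    # assignment step: each point joins its nearest center
    labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)
    # update step: each center moves to the mean of its cluster
    centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
print(centers)   # should end near (0, 0) and (4, 4)
```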
Data Science vs. Machine Learning and AI

Artificial Intelligence:
– Includes Machine Learning.
– Combines large amounts of data through iterative processing and intelligent algorithms to help computers learn automatically.
– Popular tools: TensorFlow, Scikit Learn, Keras.
– Uses logic and decision trees.
– Popular applications: chatbots and voice assistants.

Machine Learning:
– A subset of Artificial Intelligence.
– Uses efficient programs that can use data without being explicitly told to do so.
– Popular tools: Amazon Lex, IBM Watson Studio, Microsoft Azure ML Studio.
– Uses statistical models.
– Popular examples: recommendation systems such as Spotify, and facial recognition.

Data Science:
– Includes various data operations.
– Works by sourcing, cleaning, and processing data to extract meaning from it for analytical purposes.
– Popular tools: SAS, Tableau, Apache Spark, MATLAB.
– Deals with structured and unstructured data.
– Popular examples: fraud detection and healthcare analysis.
