Lect3 Machine Learning

Machine learning involves programming computers to optimize performance using example data or past experience. There is no need for machines to "learn" simple tasks like calculating payroll, but learning is useful when human expertise does not exist, humans cannot explain their expertise, solutions change over time, or solutions need to be adapted to particular cases. Traditional programming involves writing specific instructions for a computer to follow, while machine learning involves providing example data for a computer to use to generate its own algorithms or programs to solve problems.

Uploaded by

Amrin Mulani
Copyright
© All Rights Reserved

Why “Learn”?

• Machine learning is programming computers to optimize a performance criterion using example data or past experience.
• There is no need to “learn” to calculate payroll.
• Learning is used when:
– Human expertise does not exist (navigating on Mars),
– Humans are unable to explain their expertise (speech
recognition)
– Solution changes in time (routing on a computer network)
– Solution needs to be adapted to particular cases (user
biometrics)

Traditional Programming

Data + Program → Computer → Output

Machine Learning

Data + Output → Computer → Program
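The contrast between traditional programming and machine learning can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the temperature readings, the labels, and the names `is_hot_handwritten` and `learn_threshold` are made up for the example and do not come from the lecture.

```python
# Traditional programming: a person writes the rule (the program) by hand.
def is_hot_handwritten(temp_c):
    return temp_c >= 30.0  # threshold chosen by the programmer


# Machine learning: data and desired outputs go in, a program comes out.
def learn_threshold(temps, labels):
    """Return a classifier whose threshold makes the fewest training mistakes."""
    best_t, best_errors = None, len(temps) + 1
    for t in sorted(temps):  # try each observed value as the threshold
        errors = sum((x >= t) != y for x, y in zip(temps, labels))
        if errors < best_errors:
            best_t, best_errors = t, errors
    return lambda temp_c: temp_c >= best_t


# Hypothetical example data: inputs and the outputs we want for them.
temps = [12.0, 18.5, 21.0, 27.5, 29.0, 33.0]
labels = [False, False, False, True, True, True]

is_hot_learned = learn_threshold(temps, labels)
print(is_hot_learned(28.0), is_hot_handwritten(28.0))
```

The learned rule and the hand-written rule can disagree (here on 28.0) because the learned threshold comes from the examples rather than from the programmer.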
Magic?
No, more like gardening

• Seeds = Algorithms
• Nutrients = Data
• Gardener = You
• Plants = Programs
What is machine learning?
• A branch of artificial intelligence, concerned with the
design and development of algorithms that allow
computers to evolve behaviors based on empirical data.

• As intelligence requires knowledge, it is necessary for the computers to acquire knowledge.
Learning system model

[Figure: block diagram with the labels Input Samples, Learning Method, System, Training, and Testing]
What is Machine Learning?
• Machine Learning
– Study of algorithms that
– improve their performance
– at some task
– with experience
• Optimize a performance criterion using example data or past
experience.
• Role of Statistics: Inference from a sample
• Role of Computer science: Efficient algorithms to
– Solve the optimization problem
– Represent and evaluate the model for inference

Machine learning…
• It is very hard to write programs that solve problems like
recognizing a face.
– We don’t know what program to write because we don’t
know how our brain does it.
– Even if we had a good idea about how to do it, the
program might be horrendously complicated.
• Instead of writing a program by hand, we collect lots of
examples that specify the correct output for a given input.
• A machine learning algorithm then takes these examples and
produces a program that does the job.
– The program produced by the learning algorithm may look
very different from a typical hand-written program. It may
contain millions of numbers.
– If we do it right, the program works for new cases as well
as the ones we trained it on.
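The idea of building a program from examples and then checking it on new cases can be sketched as follows. This is a minimal illustration, assuming a 1-nearest-neighbour rule as the learning algorithm (the lecture does not name one); the points, the labels, and the function name `nearest_neighbor_predict` are hypothetical.

```python
def nearest_neighbor_predict(train, x):
    """Label x with the label of its closest training example (1-NN)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    _, label = min(train, key=lambda example: sq_dist(example[0], x))
    return label


# Hypothetical (input, correct output) pairs.
examples = [
    ((1.0, 1.2), "A"), ((0.8, 0.9), "A"), ((1.3, 0.7), "A"), ((0.9, 1.1), "A"),
    ((3.0, 3.1), "B"), ((3.2, 2.8), "B"), ((2.9, 3.3), "B"), ((3.1, 3.0), "B"),
]
train = examples[0:3] + examples[4:7]  # examples used to build the "program"
held_out = [examples[3], examples[7]]  # new cases never seen during training

correct = sum(nearest_neighbor_predict(train, x) == y for x, y in held_out)
held_out_accuracy = correct / len(held_out)
print(held_out_accuracy)
```

Measuring accuracy only on the held-out cases is what tells us whether the learned program works beyond the examples it was trained on.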
A classic example of a task that requires machine
learning: It is very hard to say what makes a 2
Some more examples of tasks that are
best solved by using a learning algorithm

• Recognizing patterns:
– Facial identities or facial expressions
– Handwritten or spoken words
– Medical images
• Generating patterns:
– Generating images or motion sequences
• Recognizing anomalies:
– Unusual sequences of credit card transactions
– Unusual patterns of sensor readings in a nuclear
power plant or unusual sound in your car engine.
• Prediction:
– Future stock prices or currency exchange rates
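As one illustration of recognizing anomalies, such as unusual card transactions, a program can learn what typical values look like and flag anything far from them. The following is a rough sketch, assuming a simple z-score rule with a threshold of 3 standard deviations; the amounts, the threshold, and the function name `flag_anomalies` are assumptions made for the example.

```python
from statistics import mean, stdev


def flag_anomalies(history, new_values, threshold=3.0):
    """Return the new values lying more than `threshold` standard deviations
    from the mean of the historical (assumed normal) values."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs at least 2 values
    return [v for v in new_values if abs(v - mu) / sigma > threshold]


# Hypothetical past transaction amounts, assumed to be normal behaviour.
history = [12.5, 9.9, 14.2, 11.0, 13.3, 10.8, 12.1, 11.7]
new_amounts = [13.0, 250.0, 9.5]

print(flag_anomalies(history, new_amounts))
```

Real fraud detection uses far richer models; the point here is only that "unusual" is defined relative to what was learned from past data.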
Some web-based examples of machine
learning
• The web contains a lot of data. Tasks with very big
datasets often use machine learning
– especially if the data is noisy or non-stationary.
• Spam filtering, fraud detection:
– The enemy adapts so we must adapt too.
• Recommendation systems:
– Lots of noisy data. Million dollar prize!
• Information retrieval:
– Find documents or images with similar content.
• Data Visualization:
– Display a huge database in a revealing way
ML in a Nutshell
• Tens of thousands of machine learning
algorithms
• Hundreds of new ones every year
• Every machine learning algorithm has
three components:
– Representation
– Evaluation
– Optimization
Representation
• Decision trees
• Sets of rules / Logic programs
• Instances
• Graphical models (Bayes/Markov nets)
• Neural networks
• Support vector machines
• Model ensembles
• Etc.
Evaluation
• Accuracy
• Precision and recall
• Squared error
• Likelihood
• Posterior probability
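Several of these evaluation measures are straightforward to compute by hand. The sketch below, with hypothetical labels and predictions and illustrative function names (`evaluate`, `squared_error`), shows accuracy, precision, recall, and squared error for a binary task where 1 is the positive class.

```python
def evaluate(y_true, y_pred, positive=1):
    """Compute accuracy, precision, and recall for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    return {
        "accuracy": correct / len(pairs),
        # Precision: of the items predicted positive, how many really are.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        # Recall: of the truly positive items, how many were found.
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


def squared_error(y_true, y_pred):
    """Sum of squared differences, used for real-valued predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))


y_true = [1, 0, 1, 1, 0, 0, 1, 0]  # hypothetical correct labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]  # hypothetical model predictions
metrics = evaluate(y_true, y_pred)
print(metrics)
print(squared_error([3.0, 5.0], [2.5, 5.5]))
```

Note that the three classification measures differ on the same predictions, which is why the choice of evaluation criterion matters.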
Optimization
• Combinatorial optimization
– E.g.: Greedy search
• Convex optimization
– E.g.: Gradient descent
• Constrained optimization
– E.g.: Linear programming
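To see how the three components fit together, here is a small sketch that picks one option from each list: a straight line as the representation, mean squared error as the evaluation, and gradient descent as the optimization. The data, learning rate, step count, and function names are assumptions chosen for the example.

```python
def predict(w, b, x):
    # Representation: the model is a straight line y = w * x + b.
    return w * x + b


def mean_squared_error(w, b, xs, ys):
    # Evaluation: average squared difference between prediction and target.
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, lr=0.05, steps=2000):
    # Optimization: repeatedly step against the gradient of the error.
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = sum(2 * (predict(w, b, x) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (predict(w, b, x) - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # hypothetical data generated from y = 2x + 1

w, b = gradient_descent(xs, ys)
final_error = mean_squared_error(w, b, xs, ys)
print(round(w, 3), round(b, 3))
```

Swapping any one component (for example, a decision tree for the line, or greedy search for gradient descent) gives a different learning algorithm.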
Types of Learning
• Supervised (inductive) learning
– Training data includes desired outputs
• Unsupervised learning
– Training data does not include desired outputs
• Semi-supervised learning
– Training data includes a few desired outputs
• Reinforcement learning
– Rewards from sequence of actions
Supervised learning
Suppose you are given a basket filled with different kinds of fruit. The first step is to train the machine on each kind of fruit, one by one, like this:

• If the object is round with a depression at the top and is red, it is labelled Apple.
• If the object is a long, curving cylinder and is green-yellow, it is labelled Banana.

Now suppose the machine is given a new fruit from the basket, say a banana, and asked to identify it. Since the machine has already learnt from the previous data, it now has to apply what it learnt. It will first classify the fruit by its shape and color, confirm the fruit name as BANANA, and put it in the Banana category.
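The fruit example can be sketched as a tiny program that learns a mapping from (shape, color) to a label and then applies it to a new fruit. This is a deliberately simple illustration: the majority-vote lookup table, the feature strings, and the function names are assumptions for the example, not a method given in the lecture.

```python
from collections import Counter, defaultdict


def train_fruit_classifier(labelled_fruits):
    """Learn, for each (shape, color) pair, the label seen most often."""
    votes = defaultdict(Counter)
    for shape, color, label in labelled_fruits:
        votes[(shape, color)][label] += 1
    return {features: counts.most_common(1)[0][0]
            for features, counts in votes.items()}


def classify(model, shape, color):
    # Feature combinations never seen in training cannot be labelled.
    return model.get((shape, color), "unknown")


# Hypothetical labelled training data (the desired outputs are included).
training_basket = [
    ("round with depression at top", "red", "Apple"),
    ("round with depression at top", "red", "Apple"),
    ("long curving cylinder", "green-yellow", "Banana"),
    ("long curving cylinder", "green-yellow", "Banana"),
]

model = train_fruit_classifier(training_basket)
print(classify(model, "long curving cylinder", "green-yellow"))
```

A lookup table like this cannot generalize to unseen feature combinations, which is one reason practical classifiers use richer representations.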
Supervised learning classification
• Classification: A classification problem is when the output variable is a category, such as “red” or “blue”, or “disease” or “no disease”.
• Regression: A regression problem is when the output variable is a real value, such as “dollars” or “weight”.
It is called supervised learning because the process of learning from the training dataset can be thought of as a teacher supervising the entire learning process. The learning algorithm iteratively makes predictions on the training data and is corrected by the “teacher”, and learning stops when the algorithm achieves an acceptable level of performance (the desired accuracy).
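The teacher analogy can be made concrete with a perceptron, used here only as one illustration since the lecture does not name a specific algorithm. In this sketch the known labels play the role of the teacher: whenever a prediction is wrong, the weights are corrected, and training stops once a target accuracy is reached. The toy data, the target accuracy, and the function names are assumptions.

```python
def predict(w, b, x):
    """Return +1 or -1 depending on which side of the line x falls."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


def train_perceptron(samples, labels, target_accuracy=1.0, max_epochs=100):
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(max_epochs):
        correct = 0
        for x, y in zip(samples, labels):  # y is +1 or -1
            if predict(w, b, x) == y:
                correct += 1
            else:
                # The "teacher" (the known label) corrects the mistake.
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
        # Stop once an acceptable level of performance is reached.
        if correct / len(samples) >= target_accuracy:
            break
    return w, b


# Hypothetical, linearly separable training data.
samples = [(2.0, 1.0), (3.0, 2.5), (2.5, 2.0),
           (-1.0, -1.5), (-2.0, -0.5), (-1.5, -2.0)]
labels = [1, 1, 1, -1, -1, -1]

w, b = train_perceptron(samples, labels)
print([predict(w, b, x) for x in samples])
```

The perceptron only reaches perfect training accuracy when the classes are linearly separable; otherwise `max_epochs` ends the loop.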
Supervised Learning: Uses
A supervised learning algorithm learns from a known, labelled data set (the training data) in order to make predictions.
• Prediction of future cases: Use the rule to predict
the output for future inputs
• Knowledge extraction: The rule is easy to
understand
• Compression: The rule is simpler than the data it
explains
• Outlier detection: Exceptions that are not covered
by the rule, e.g., fraud

Supervised Learning: Regression and Classification
• Classification
• Regression
– Predict the price of a house
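A house-price prediction is a regression problem because the output is a real value. The sketch below fits a straight line by ordinary least squares, assuming a single feature (floor area); the areas, prices, and the function name `fit_line` are invented for illustration and are not real market data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


# Hypothetical training data: floor area in square metres, price in thousands.
areas_m2 = [50.0, 70.0, 90.0, 110.0]
prices_k = [150.0, 210.0, 270.0, 330.0]

slope, intercept = fit_line(areas_m2, prices_k)
predicted_price_k = slope * 100.0 + intercept  # price for a 100 m2 house
print(predicted_price_k)
```

A classification version of the same task would instead predict a category, for example "cheap" or "expensive", rather than a number.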
Unsupervised learning
• Unsupervised learning is the training of a machine using information that is neither classified nor labeled, allowing the algorithm to act on that information without guidance.
• The task of the machine is to group unsorted information according to similarities, patterns, and differences, without any prior training on the data.

For example, the machine has no idea about the features of dogs and cats, but it can still categorize them according to their similarities, patterns, and differences.
Unsupervised Learning
• Learning “what normally happens”
• No output
• Clustering: Grouping similar instances
• Other applications: Summarization,
Association Analysis
• Example applications
– Customer segmentation in CRM
– Image compression: Color quantization
– Bioinformatics: Learning motifs
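Clustering for color quantization can be sketched with k-means on one-dimensional gray levels. This is a simplified illustration: the pixel values, the evenly spaced starting centers, and the function name `kmeans_1d` are assumptions, and real image quantization would cluster three-channel colors.

```python
def kmeans_1d(values, k, iterations=20):
    """Group numbers into k clusters and return the cluster centers."""
    lo, hi = min(values), max(values)
    # Deterministic start: spread the centers evenly (assumes k >= 2).
    centers = [lo + i * (hi - lo) / (k - 1) for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            # Assignment step: each value joins its nearest center.
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: each center moves to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers


# Hypothetical gray levels of eight pixels; no labels are provided.
gray_levels = [12, 15, 10, 200, 210, 205, 14, 198]

centers = kmeans_1d(gray_levels, k=2)
# Quantization: replace every pixel by the center of its cluster.
quantized = [min(centers, key=lambda c: abs(v - c)) for v in gray_levels]
print(centers)
```

The image is compressed because each pixel now needs only the index of its cluster rather than a full gray level.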
Unsupervised learning: clustering
Supervised Learning vs. Unsupervised Learning
• Input data
– Supervised: uses known and labeled data as input
– Unsupervised: uses unknown data as input
• Computational complexity
– Supervised: very complex
– Unsupervised: less computational complexity
• Real time
– Supervised: uses off-line analysis
– Unsupervised: uses real-time analysis of data
• Number of classes
– Supervised: number of classes is known
– Unsupervised: number of classes is not known
• Accuracy of results
– Supervised: accurate and reliable results
– Unsupervised: moderately accurate and reliable results
ML in Practice
• Understanding domain, prior knowledge, and
goals
• Data integration, selection, cleaning,
pre-processing, etc.
• Learning models
• Interpreting results
• Consolidating and deploying discovered
knowledge
• Loop
Test Your Knowledge
Which of the following terms is appropriate for the figure below?

a) Large Data
b) Big Data
c) Dark Data
d) None of the mentioned

Point out the correct statement:


a) Machine learning focuses on prediction, based on known properties learned from
the training data
b) Data Cleaning focuses on prediction, based on known properties learned from the
training data.
c) Representing data in a form that mere mortals can both understand and draw valuable insights from is as much a science as it is an art
d) None of the Mentioned

Test Your Knowledge
• Which of the following is the most important thing in data science?
a) answer
b) question
c) data
d) none of the Mentioned
• Which of the following approaches should be used if you can’t fix the variable?
a) randomize it
b) non stratify it
c) generalize it
d) none of the Mentioned
• Which of the following statements will import pandas?
a) import pandas as pd
b) import panda as py
c) import pandaspy as pd
d) all of the Mentioned
• Which of the following objects do you get after reading a CSV file?
a) DataFrame
b) Character Vector
c) Panel
d) All of the Mentioned
• Which of the following libraries is similar to Pandas?
a) NumPy
b) RPy
c) OutPy
d) None of the Mentioned
• Data that summarize all observations in a category are called __________ data.
a) frequency
b) summarized
c) raw
d) None of the Mentioned
• Point out the correct statement:
a) Primary data is original source of data
b) Secondary data is original source of data
c) Questions are obtained after data processing steps
d) None of the Mentioned
Test Your Knowledge
• Which of the following characteristics of big data is of relatively greater concern to data science?
a) Velocity
b) Variety
c) Volume
d) None of the Mentioned
• Which of the following analytical capabilities is provided by an information management company?
a) Stream Computing
b) Content Management
c) Information Integration
d) All of the Mentioned
• Point out the wrong statement:
a) The big volume indeed represents Big Data
b) The data growth and social media explosion have changed how we look at the data
c) Big Data is just about lots of data
d) All of the Mentioned

• Which of the following languages should replace the question mark in the figure below?

a) Java
b) PHP
c) COBOL
d) None of the mentioned
