Lecture 17&18 - Introduction To Machine Learning
MACHINE LEARNING
What is Machine Learning?
Learn to improve algorithms from data.
Why?
• Human expertise does not exist (e.g., complex medical processes we don’t fully understand)
• Humans are unable to explain their expertise (e.g., speech recognition)
• Solutions change or adapt over time (e.g., routing on a computer network)
Example 1: Digit Recognition
Classical “Expert” Approach
Problems with Expert Rules
ML Approach: Learn from Data
Example 2: Face Detection
Machine Learning in Many Fields
• Retail: Market basket analysis, Customer relationship management
(CRM)
• Finance: Credit scoring, fraud detection
• Manufacturing: Control, robotics, troubleshooting
• Medicine: Medical diagnosis
• Telecommunications: Spam filters, intrusion detection
• Bioinformatics: Motifs, alignment
• Web mining: Search engines
• ...
Definition of Machine Learning
• Arthur Samuel (1959). Machine Learning: Field of study that gives
computers the ability to learn without being explicitly programmed.
[Figure: Savings vs. Income plot with Low risk and High risk classes]
Classification
Object recognition
Regression
[Figure: Happiness Score vs. Income]
Regression
Colorize B&W images automatically
Unsupervised Learning
Reinforcement Learning
Move:
…Bd4
Reinforcement Learning
Learning to play Breakout
What Is ML Doing Today?
• Autonomous driving
• Very difficult games
• Machine translation
Why Now?
• Machine learning is an old field
• Much of the pioneering statistical work dates to the 1950s
• So what is new now?
• Big Data:
• Massive storage. Large data centers
• Massive connectivity
• Sources of data from Internet and elsewhere
• Computational advances
• Distributed machines, clusters
• GPUs and hardware
Google Tensor Processing Unit (TPU)
The machine learning framework
• Apply a prediction function to a feature representation of the
image to get the desired output:
f([image]) = “apple”
f([image]) = “tomato”
f([image]) = “cow”
Slide credit: L. Lazebnik
The machine learning framework
y = f(x)
• y: output
• f: prediction function
• x: image feature
Testing: Test Image → Image Features → Learned model → Prediction
Slide credit: D. Hoiem and L. Lazebnik
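A minimal sketch of the y = f(x) framework: a model is learned from labelled feature vectors, then applied to the features of a test example. The nearest-centroid rule, the made-up 2-D feature vectors, and the helper names fit_centroids and predict are illustrative assumptions, not the method used in the lecture.

```python
import numpy as np

def fit_centroids(features, labels):
    """Training: learn one mean feature vector (centroid) per label."""
    return {
        label: features[labels == label].mean(axis=0)
        for label in np.unique(labels)
    }

def predict(model, x):
    """Testing: y = f(x), the label whose centroid is closest to x."""
    return min(model, key=lambda label: np.linalg.norm(x - model[label]))

# Made-up 2-D feature vectors standing in for image features.
train_x = np.array([[1.0, 1.0], [1.2, 0.8], [4.0, 4.2], [3.8, 4.0]])
train_y = np.array(["apple", "apple", "cow", "cow"])

model = fit_centroids(train_x, train_y)
prediction = predict(model, np.array([0.9, 1.1]))
print(prediction)
```

Any other learned function could take the place of the centroid rule; only the train-then-predict shape matters here.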
Supervised Learning
[Figure: data plotted on axes x1 and x2]
Unsupervised Learning
[Figure: data plotted on axes x1 and x2]
Linear Regression
Getting Data
• Data from UCI dataset library:
https://archive.ics.uci.edu/ml/index.php
Python Packages
• Python has many powerful packages
• This demo uses three key packages
• Pandas:
• Used for reading and writing data files
• Loads data into dataframes
• NumPy:
• Numerical operations including linear algebra
• Data is stored in ndarray structure
• We convert from dataframes to ndarray
• Matplotlib:
• MATLAB-like plotting and visualization
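A short sketch of how the packages fit together: data sits in a pandas dataframe and is converted to NumPy arrays for numerical work. The column names x and y and the values are made up for illustration; in the demo the dataframe would come from a data file.

```python
import numpy as np
import pandas as pd

# Hypothetical data; in the demo the dataframe would come from a data file.
df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0], "y": [2.1, 3.9, 6.2, 7.8]})

# Convert dataframe columns to NumPy ndarrays.
x = df["x"].to_numpy()
y = df["y"].to_numpy()

# Numerical operations then work on whole arrays at once.
print(type(x).__name__, x.shape)
print(np.mean(y), np.dot(x, y))
```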
Visualizing the Data
• When possible, look at data before
doing anything
• Python has MATLAB-like plotting
• Matplotlib module
Data
Linear Model
[Figure: regression line]
Least Squares Model Fitting
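A worked form of the least-squares fit for a single-variable linear model, given as a sketch. The symbols \beta_0, \beta_1, the data pairs (x_i, y_i) for i = 1, ..., N, and the name RSS (residual sum of squares) are notation assumed here rather than taken from the slides.

```latex
% Linear model and the squared-error loss it is fitted with
\hat{y}_i = \beta_0 + \beta_1 x_i, \qquad
\mathrm{RSS}(\beta_0, \beta_1) = \sum_{i=1}^{N} \left( y_i - \beta_0 - \beta_1 x_i \right)^2

% Setting both partial derivatives of RSS to zero gives the minimizer
\hat{\beta}_1 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}
                     {\sum_{i=1}^{N} (x_i - \bar{x})^2}, \qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
```

Here \bar{x} and \bar{y} are the sample means of the inputs and outputs.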
Finding Parameters via Optimization
A general ML recipe
General ML problem (illustrated with simple linear regression)
• Find a model with parameters
• Get data
• Pick a loss function
• Measures goodness of fit of the model to the data
• Function of the parameters
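The recipe can be sketched for simple linear regression as follows. The synthetic data, the mean-squared-error loss, the parameter names b0 and b1, and the use of plain gradient descent with a learning rate of 0.5 are all assumptions chosen for illustration; other losses and optimizers fit the same recipe.

```python
import numpy as np

# 1. Get data: synthetic points around the (assumed) line y = 2 + 3x.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 + 3.0 * x + 0.1 * rng.standard_normal(x.size)

# 2. Model with parameters: y_hat = b0 + b1 * x.
def model(b0, b1, x):
    return b0 + b1 * x

# 3. Loss: mean squared error, a function of the parameters.
def loss(b0, b1):
    return np.mean((y - model(b0, b1, x)) ** 2)

# 4. Optimization: plain gradient descent on the loss.
b0, b1 = 0.0, 0.0
learning_rate = 0.5
for _ in range(2000):
    residual = y - model(b0, b1, x)
    # The gradient of the loss is (-2*mean(residual), -2*mean(residual*x)).
    b0 += learning_rate * 2.0 * np.mean(residual)
    b1 += learning_rate * 2.0 * np.mean(residual * x)

print(b0, b1, loss(b0, b1))
```

The fitted b0 and b1 should agree with a direct least-squares fit such as np.polyfit(x, y, 1).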
Linear Regression
• Single-variable linear regression
• "Single variable" means there is only one independent variable (x)
• Multi-variable linear regression
• There are multiple independent variables (x1, x2, …)
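A minimal sketch of the multi-variable case with two independent variables, solved with NumPy's least-squares routine. The data is synthetic and the coefficients 1, 2, and -3 used to generate it are assumptions for illustration.

```python
import numpy as np

# Synthetic data generated exactly from y = 1 + 2*x1 - 3*x2 (assumed coefficients).
rng = np.random.default_rng(1)
X = rng.uniform(0.0, 10.0, size=(20, 2))   # columns are x1 and x2
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]

# Prepend a column of ones so the intercept is estimated with the slopes.
A = np.column_stack([np.ones(len(X)), X])
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
intercept, slope_x1, slope_x2 = coeffs
print(intercept, slope_x1, slope_x2)
```

With a single column in X, the same code reduces to single-variable regression.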
Code in python
• Problem:
predict the price of a house
from its area.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model
Load dataset
• read_csv is a pandas function; here it loads the dataset stored in the
file house_prices.csv into a dataframe.
df = pd.read_csv("house_prices.csv")
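Since house_prices.csv itself is not shown, the following sketch uses a few made-up rows to illustrate what read_csv returns. The column names Area and Price match those used in the later code; the values are invented.

```python
import io
import pandas as pd

# Made-up rows standing in for house_prices.csv; the real file is not shown.
csv_text = """Area,Price
2600,550000
3000,565000
3200,610000
3600,680000
4000,725000
"""

df = pd.read_csv(io.StringIO(csv_text))   # same call works with a file path
print(df.head())     # first rows of the dataframe
print(df.shape)      # (number of rows, number of columns)
print(df.dtypes)     # column types inferred from the file
```

Checking the shape and column types right after loading catches most file-format problems early.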
Plot
• Draw a scatter plot to see how house price varies with area in our
data.
plt.xlabel("Area(sq.ft.)")
plt.ylabel("Price($)")
plt.scatter(df.Area, df.Price)
Training model
• We fit the model with the fit method. Fitting means training the
linear regression model on the data available
in our dataset.
reg = linear_model.LinearRegression()
reg.fit(df[['Area']], df.Price)
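To see what fitting produces, the sketch below inspects the learned slope and intercept and checks that a prediction is just intercept + slope * area. The data is made up so that it lies exactly on an assumed line, Price = 100000 + 150 * Area.

```python
import pandas as pd
from sklearn import linear_model

# Made-up data that lies exactly on Price = 100000 + 150 * Area.
df = pd.DataFrame({"Area": [2000, 2500, 3000, 3500, 4000]})
df["Price"] = 100000 + 150 * df["Area"]

reg = linear_model.LinearRegression()
reg.fit(df[["Area"]], df["Price"])

slope = reg.coef_[0]          # price change per extra square foot
intercept = reg.intercept_    # predicted price at zero area
print(slope, intercept)

# A prediction is just intercept + slope * area.
area = 3400
manual = intercept + slope * area
predicted = reg.predict(pd.DataFrame({"Area": [area]}))[0]
print(manual, predicted)
```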
Prediction
• Once the code above has run, the linear regression model is
fitted to our data and ready to predict house prices for
new inputs.
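# Predict the price of a 3400 sq. ft. house, reusing the column name seen during fit
test_x = pd.DataFrame({"Area": [3400]})
test_y = reg.predict(test_x)
print(test_y)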
Draw best fit line
• Now let’s draw the linear regression line on our scatter plot. This
needs one additional plt.plot call after the scatter plot.
plt.xlabel("Area(sq.ft.)")
plt.ylabel("Price($)")
plt.scatter(df.Area, df.Price)
plt.plot(df.Area, reg.predict(df[['Area']]), color='blue')
Draw predicted output
• Draw the predicted value on the plot
• Predict the price of a house with an area of 3000 sq. ft., 3 bedrooms,
and an age of 40 years (this needs a multi-variable model with area, bedrooms, and age as inputs).
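A sketch of both tasks, under stated assumptions: the data is made up, the column names Bedrooms and Age are hypothetical (the source only names Area and Price), and the off-screen Agg backend is used so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from sklearn import linear_model

# Made-up data; the Bedrooms and Age column names are assumptions.
df = pd.DataFrame({
    "Area": [2600, 3000, 3200, 3600, 4000, 4100],
    "Bedrooms": [3, 4, 3, 3, 5, 6],
    "Age": [20, 15, 18, 30, 8, 8],
    "Price": [550000, 565000, 610000, 595000, 760000, 810000],
})

# Single-variable model: mark the predicted value on the scatter plot.
reg = linear_model.LinearRegression().fit(df[["Area"]], df["Price"])
test_x = pd.DataFrame({"Area": [3400]})
test_y = reg.predict(test_x)
plt.xlabel("Area(sq.ft.)")
plt.ylabel("Price($)")
plt.scatter(df["Area"], df["Price"])
plt.plot(df["Area"], reg.predict(df[["Area"]]), color="blue")
plt.scatter(test_x["Area"], test_y, color="red", marker="x")

# Multi-variable model: area, bedrooms, and age together.
features = ["Area", "Bedrooms", "Age"]
multi_reg = linear_model.LinearRegression().fit(df[features], df["Price"])
query = pd.DataFrame([[3000, 3, 40]], columns=features)
multi_price = multi_reg.predict(query)[0]
print(test_y[0], multi_price)
```

The red cross lies on the blue line because the predicted value comes from the same fitted model. In an interactive session, plt.show() displays the figure.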