Lecture 1 - Introduction

Mathematics required for the module

• Calculus (differential equations)
• Statistics
• Linear algebra
• Matrix operations
• Cartesian geometry

Programming languages: Octave (open-source alternative to Matlab), Python
Tools: Jupyter Notebook (Anaconda distribution), Octave, Azure ML Studio
Machine Learning – An Introduction
About me
• Dharshana Kasthurirathna
• Main building, 6th floor, Malabe SLIIT campus
• dharshana.k@sliit.lk
Agenda

• Why ML?
• Introduction to ML
• Types of machine learning
• Common issues and challenges
About

• A subfield of Artificial Intelligence (AI) and an application of optimization
• The name comes from the idea that it deals with the “construction and study of systems that can learn from data”
• Can be seen as the building blocks for making computers learn to behave more intelligently
• It is an applied field of study; there are various techniques with various implementations
• Supports other fields in CS (Vision, Data Mining, OCR, NLP, etc.)
Why now?

• Flood of available data (especially with the advent of the Internet)
• Increasing computational power (e.g. multi-core processors)
• Growing progress in algorithms and theory developed by researchers
• Increasing support from industry
• Cloud computing
In other words…

“A computer program is said to learn from experience (E) with respect to some class of tasks (T) and a performance measure (P), if its performance at tasks in T, as measured by P, improves with experience E.”
– Tom Mitchell
ML vs Traditional programming

Traditional programming:
  Data + Program → Computer → Output

Machine learning:
  Data + Output → Computer (learning algorithm) → Program
Motivating Example: Learning to Filter Spam

Spam is all email that the user does not want to receive and has not asked to receive.

T: identify spam emails
P: % of spam emails that were filtered;
   % of ham (non-spam) emails that were incorrectly filtered out
E: a database of emails that were labelled by users
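As a rough illustration (not from the slides), the two components of P can be computed from labelled predictions; the label lists below are made up, with 1 = spam and 0 = ham.

```python
# Minimal sketch of the performance measure P for the spam-filtering task.
y_true = [1, 1, 0, 0, 1, 0, 1, 0]   # labels provided by users (E)
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]   # what a hypothetical filter decided (T)

spam_total   = sum(1 for t in y_true if t == 1)
spam_caught  = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
ham_total    = sum(1 for t in y_true if t == 0)
ham_filtered = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

print("spam filtered:", spam_caught / spam_total)              # higher is better
print("ham incorrectly filtered:", ham_filtered / ham_total)   # lower is better
```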
Terminology

• Features
  – The distinct traits that can be used to describe each item in a quantitative manner.
• Samples
  – A sample is an item to process (e.g. classify). It can be a document, a picture, a sound, a video, a row in a database or CSV file, or anything else you can describe with a fixed set of quantitative traits.
• Feature vector
  – An n-dimensional vector of numerical features that represents some object.
• Feature extraction
  – Preparation of the feature vector; transforms the data from a high-dimensional space to a space of fewer dimensions.
• Training/Evaluation set
  – A set of data used to discover potentially predictive relationships.
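To make the terminology concrete, here is a small sketch (the field names and values are illustrative, not from the slides) that extracts a fixed-length feature vector from one email sample:

```python
# Hypothetical feature extraction: map one email (a sample) to a
# fixed-length numeric feature vector.
def extract_features(email):
    return [
        len(email["recipients"]),               # number of recipients
        len(email["body"]),                     # size of message (characters)
        len(email["attachments"]),              # number of attachments
        email["subject"].lower().count("re:"),  # "re's" in the subject line
    ]

email = {"recipients": ["a@x.com", "b@y.com"],
         "subject": "Re: Re: meeting",
         "body": "See you at 10.",
         "attachments": []}

feature_vector = extract_features(email)   # a 4-dimensional feature vector
print(feature_vector)                       # e.g. [2, 14, 0, 2]
```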
The Learning Process

Model learning → Model testing
The Learning Process in our Example

Model learning → Model testing

Features extracted from each email on the email server:
● Number of recipients
● Size of message
● Number of attachments
● Number of "re's" in the subject line
● …
Data Set

Each row is an instance. "Email Type" is the target attribute; the remaining columns are the input attributes.

Email Type | Customer Type | Country (IP) | Email Length (K) | Number of New Recipients
-----------+---------------+--------------+------------------+-------------------------
Ham        | Gold          | Germany      | 2                | 0
Ham        | Silver        | Germany      | 4                | 1
Spam       | Bronze        | Nigeria      | 2                | 5
Spam       | Bronze        | Russia       | 4                | 2
Ham        | Bronze        | Germany      | 4                | 3
Ham        | Silver        | USA          | 1                | 0
Spam       | Silver        | USA          | 2                | 4

Attribute types: ordinal (Customer Type), nominal (Country), numeric (Email Length, Number of New Recipients)
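The same data set can be loaded into a pandas DataFrame (a sketch using the values shown above; pandas is introduced later under commonly used Python libraries):

```python
import pandas as pd

# The data set above, with "Email Type" as the target attribute.
data = pd.DataFrame({
    "Email Type":       ["Ham", "Ham", "Spam", "Spam", "Ham", "Ham", "Spam"],
    "Customer Type":    ["Gold", "Silver", "Bronze", "Bronze", "Bronze", "Silver", "Silver"],
    "Country (IP)":     ["Germany", "Germany", "Nigeria", "Russia", "Germany", "USA", "USA"],
    "Email Length (K)": [2, 4, 2, 4, 4, 1, 2],
    "New Recipients":   [0, 1, 5, 2, 3, 0, 4],
})

X = data.drop(columns="Email Type")   # input attributes
y = data["Email Type"]                # target attribute
print(data)
```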


Model Learning

Training set (from the database) → Induction algorithm (learner / inducer) → Classification model (classifier)
Model Testing

The classification model (classifier) induced from the training set is then applied to further instances from the database to test its performance.
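A minimal scikit-learn sketch of these two steps, learning a classifier on a training set and then testing it on held-out data (the data here is randomly generated, standing in for the email database):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Synthetic data standing in for a database of labelled instances.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Model learning: the induction algorithm (learner) produces a classifier.
classifier = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Model testing: apply the classifier to unseen instances and measure performance.
print("test accuracy:", accuracy_score(y_test, classifier.predict(X_test)))
```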
Workflow

Categories

• Supervised learning
• Unsupervised learning
• Reinforcement learning
• Semi-supervised learning
• Self-supervised learning
• Bayesian learning
Use-Cases

• Spam email detection (classification)
• Image search (similarity/classification)
• Clustering (k-means): Amazon recommendations
• Autonomous driving/flying: reinforcement learning
Supervised Learning (Classification)

• The correct classes (labels) of the training data are known

Credit: http://us.hudson.com/legal/blog/postid/513/predictive-analytics-artificial-intelligence-science-fiction-e-discovery-truth
Supervised learning examples

• A bank may have borrower details from the past (age, income, gender, etc.) as features
• It may also have details of the borrowers who defaulted in the past as labels
• Based on these, it can train a classifier to learn the patterns of borrowers who are likely to default on their payments
Supervised learning

• Used when the dataset has classes/labels
• Includes a ‘training’ phase with the dataset and a ‘testing’ phase to validate the accuracy of the classifier
• Algorithms: regression, Support Vector Machines, neural networks, Convolutional Neural Networks, decision trees, logistic regression, Random Forest, Naïve Bayes, etc.
Supervised learning

• Regression: predict continuous variables (salary, rent)
• Binary classification (facial recognition, whether a tumor is benign or malignant)
• Multi-class classification (the type of a vehicle, the stage of progression of a cancer: level 1, 2, 3)
Linear regression

[Plot: target y versus feature x with a fitted line as the “predictor”; evaluating the line at a given x returns the prediction r]

• Define the form of the function f(x) explicitly
• Find a good f(x) within that family
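A sketch of this idea with NumPy: choose the family f(x) = r0 + r1·x and find good parameters by least squares (the data points below are made up):

```python
import numpy as np

# Made-up (x, y) points roughly following a line.
x = np.array([1, 3, 5, 8, 10, 14, 18])
y = np.array([4, 7, 12, 15, 21, 27, 36])

# Fit f(x) = r0 + r1 * x by least squares (degree-1 polynomial).
r1, r0 = np.polyfit(x, y, deg=1)

def predictor(x_new):
    """Evaluate the learned line and return the prediction r."""
    return r0 + r1 * x_new

print("learned line: f(x) = %.2f + %.2f x" % (r0, r1))
print("prediction at x = 12:", predictor(12))
```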
Unsupervised learning

• Used when the dataset does not have labels (classes)
• Used to group/cluster the data into clusters, which may then be used for decision making, making recommendations, classification, etc.
• Algorithms: k-means, Self-Organizing Maps, Deep Belief Networks, etc.
Unsupervised Learning/Clustering

• The correct classes of the training data are not known

Credit: http://us.hudson.com/legal/blog/postid/513/predictive-analytics-artificial-intelligence-science-fiction-e-discovery-truth
Unsupervised learning examples

• A supermarket may store each buyer’s basket content details (features)
• There are no groupings (labels)
• The buyers need to be grouped based on their buying patterns in order to make the best use of the shelf space (recommendation)
Unsupervised learning/Clustering

• k-means clustering
• Self-Organizing Maps
• Deep Belief Networks
K-means clustering example
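Since the example figure does not reproduce here, the sketch below shows k-means grouping made-up 2-D points into three clusters with scikit-learn:

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up 2-D points forming three loose groups (no labels are given).
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(loc=[0, 0], scale=0.5, size=(30, 2)),
    rng.normal(loc=[5, 5], scale=0.5, size=(30, 2)),
    rng.normal(loc=[0, 5], scale=0.5, size=(30, 2)),
])

# k-means assigns each point to one of 3 clusters and learns the centres.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(points)
print("cluster assignments:", kmeans.labels_[:10])
print("cluster centres:\n", kmeans.cluster_centers_)
```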
Reinforcement Learning

• Allows the machine or software agent to learn its behavior based on feedback from the environment
• This behavior can be learnt once and for all, or it can keep adapting as time goes by

Credit: http://us.hudson.com/legal/blog/postid/513/predictive-analytics-artificial-intelligence-science-fiction-e-discovery-truth
Reinforcement learning

• Can be used when there is no data available
• A reward function is used to measure the reward for a given action
• Based on the reward values, a probability distribution can be obtained over a given set of actions
• This can be continued over time, and it can be deployed in both single- and multi-agent systems
• Algorithms: Actor-Critic learning, Q-learning, Monte Carlo methods, etc.
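A minimal sketch of tabular Q-learning (one of the algorithms listed above) on a toy 5-state corridor where the agent is rewarded for reaching the rightmost state; all names and numbers are illustrative, not from the slides:

```python
import random

n_states, actions = 5, [-1, +1]          # move left / move right
Q = {(s, a): 0.0 for s in range(n_states) for a in actions}
alpha, gamma, epsilon = 0.5, 0.9, 0.1    # learning rate, discount, exploration

for episode in range(500):
    s = 0
    while s != n_states - 1:             # episode ends at the goal state
        if random.random() < epsilon:
            a = random.choice(actions)                    # explore
        else:
            a = max(actions, key=lambda act: Q[(s, act)]) # exploit best known action
        s_next = min(max(s + a, 0), n_states - 1)
        reward = 1.0 if s_next == n_states - 1 else 0.0
        # Q-learning update: move Q(s, a) towards reward + gamma * max_a' Q(s', a')
        Q[(s, a)] += alpha * (reward + gamma * max(Q[(s_next, b)] for b in actions) - Q[(s, a)])
        s = s_next

# The learned greedy policy should be "move right" (+1) in every non-goal state.
print({s: max(actions, key=lambda act: Q[(s, act)]) for s in range(n_states - 1)})
```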
Reinforcement learning examples

• A group of robots has been deployed in an unknown territory
• The objective is for them to collaboratively find the navigation path to reach a particular destination/goal
• Reinforcement learning can be used, where achieving the goal or getting closer to it gives a positive reward, and a negative reward is given otherwise
• Information can be shared among the robots (a multi-agent system)
Semi-Supervised Learning

• A mix of supervised and unsupervised learning

Credit: http://us.hudson.com/legal/blog/postid/513/predictive-analytics-artificial-intelligence-science-fiction-e-discovery-truth
Semi-supervised learning

• Labelled data is expensive/difficult to get
• Unlabelled data is cheap/easier to get
• The idea is to use a smaller amount of labelled data together with a larger amount of unlabelled data to create the training/testing datasets
• Algorithms: self-training, generative models, semi-supervised Support Vector Machines, etc.
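A sketch of the self-training idea using scikit-learn's SelfTrainingClassifier (available in recent scikit-learn versions); the data is synthetic, and unlabelled samples are marked with -1 as that API expects:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.semi_supervised import SelfTrainingClassifier

X, y = make_classification(n_samples=300, random_state=0)

# Pretend labels are expensive: keep only about 10% of them, mark the rest as -1.
rng = np.random.default_rng(0)
y_partial = y.copy()
y_partial[rng.random(len(y)) < 0.9] = -1

# The base classifier is trained on the labelled samples, then its confident
# predictions on unlabelled samples are added as pseudo-labels and it is retrained.
model = SelfTrainingClassifier(LogisticRegression(max_iter=1000)).fit(X, y_partial)
print("accuracy against the true labels:", accuracy_score(y, model.predict(X)))
```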
Semi-supervised learning applications

• Web page classification
• Speech-to-text conversion
• Video/image generation
Bayesian learning

• Naïve Bayes, multinomial Naïve Bayes, Bayesian networks, Hidden Markov Models
• Applications: sentiment analysis, medical diagnosis
• Needs some initial (prior) knowledge
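A small sketch of Naïve Bayes sentiment analysis with scikit-learn; the tiny training sentences are made up for illustration:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up labelled sentences: 1 = positive sentiment, 0 = negative.
texts  = ["great product, loved it", "terrible, waste of money",
          "really happy with this", "awful experience, very bad",
          "excellent value, would buy again", "poor quality, disappointed"]
labels = [1, 0, 1, 0, 1, 0]

# Bag-of-words features feeding a multinomial Naïve Bayes classifier.
model = make_pipeline(CountVectorizer(), MultinomialNB()).fit(texts, labels)
print(model.predict(["bad product, not happy", "loved the excellent quality"]))
```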
Dimensionality Reduction

• Too many attributes!
• Which attributes should be chosen for the feature vector?
• Analogy: a 3D scene → a 2D image
• Dimensionality reduction
• Principal Component Analysis (PCA) is a commonly used technique
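A sketch of PCA with scikit-learn, projecting made-up 5-dimensional samples down to 2 dimensions:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))     # 100 samples, 5 features each (made up)

pca = PCA(n_components=2)         # keep the 2 directions of largest variance
X_2d = pca.fit_transform(X)

print("reduced shape:", X_2d.shape)                      # (100, 2)
print("variance explained:", pca.explained_variance_ratio_)
```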
Deep Learning

• ML vs DL: deep learning is a subfield of ML that uses multi-layer (deep) neural networks
Dimensionality Reduction - Challenges

While dimensionality reduction is an important tool in machine learning/data mining, we must always be aware that it can distort the data in misleading ways.

The slide's photograph is a two-dimensional projection of an intrinsically three-dimensional world (original photographer unknown).

See also www.cs.gmu.edu/~jessica/DimReducDanger.htm (c) Eamonn Keogh
Ensemble Learning

• Often, multiple classifiers need to be combined to solve a real-world problem
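A sketch of combining multiple classifiers with scikit-learn's VotingClassifier (synthetic data, illustrative only):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Three different classifiers vote on each prediction.
ensemble = VotingClassifier([
    ("lr", LogisticRegression(max_iter=1000)),
    ("nb", GaussianNB()),
    ("dt", DecisionTreeClassifier(random_state=0)),
])
print("ensemble accuracy:", ensemble.fit(X_train, y_train).score(X_test, y_test))
```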
Machine learning on Big Data and GPGPU computing

• Use large unstructured data sets for learning (call records, social media data, etc.)
• Two main approaches:
  • Use a Big Data platform (e.g. Apache Hadoop, Apache Spark)
  • Use a cloud-based Big Data analytics platform (Amazon AWS services, Microsoft Azure ML)
• GPUs are used to speed up learning (particularly in deep learning)
Things to consider when selecting an ML algorithm

• If there is an algorithmic way to solve the problem instead of ML, use it! (ML is messy)
• Refer to the literature!
• Try different ML algorithms (no single algorithm is the best)
• Check the dataset against the usage/strengths of each algorithm (e.g. RNNs and ARIMA are good for time-series predictions)
• Be mindful of ‘external factors’ (e.g. seasonal effects; reinforcement learning if you don’t have data; clustering if you have unlabelled data; etc.)
• Test your algorithm(s) on test data and select the best-performing one for production (include the test results in your thesis/publications)
• No algorithm will be perfect! (There will be an error; the objective is to keep the error at an acceptable rate)
Popular Frameworks/Tools

• Scikit-learn: Python (Anaconda Python distribution)
• R (RStudio)
• Matlab/Octave (can export DLLs)
• Weka (Java based)
• Java OpenNLP / Python NLTK (natural language processing + ML)
• Apache Spark (Big Data platform; integrates with the Hadoop ecosystem)
• Google TensorFlow (Python library for deep neural networks)
• Keras (Python library for neural networks)
• Theano (Python library for fast numerical computation, used for DNNs)
• Amazon AWS services / Microsoft Azure ML (cloud-based ML)
Commonly used Python libraries

• NumPy: matrix algebra
• Pandas: data frames, series
• Matplotlib: visualization
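A tiny sketch touching all three libraries (the values are made up):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

values = np.array([2.0, 4.1, 5.9, 8.2])                  # NumPy: array/matrix algebra
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": values})      # Pandas: data frame
print(df.describe())

df.plot(x="x", y="y", marker="o")                        # Matplotlib (via pandas)
plt.title("Example plot")
plt.show()
```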
GPT

• https://dida.do/blog/chatgpt-reinforcement-learning
• https://arxiv.org/pdf/2203.02155.pdf
Resources

• Coursera: Machine Learning (Andrew Ng)
• Udacity: Intro to Machine Learning, Reinforcement Learning
• Python Machine Learning – Sebastian Raschka
• Advanced Machine Learning with Python – John Hearty
• Machine Learning – Tom Mitchell
• Many more!
Resources

• Python
  • https://www.youtube.com/watch?v=LHBE6Q9XlzI&t=81s
  • https://www.youtube.com/watch?v=r-uOLxNrNk8

• Mathematics
  • https://www.youtube.com/watch?v=0z6AhrOSrRs
Textbooks

• Pattern Recognition and Machine Learning – Bishop
• Machine Learning: A Probabilistic Perspective – Murphy
• Deep Learning – Ian Goodfellow
• Elements of Statistical Learning – Hastie
• Reinforcement Learning: An Introduction – Sutton
  http://incompleteideas.net/book/RLbook2020.pdf
Questions?