Week 1 Introduction To ML

This document provides an overview of a machine learning course, including:
- The course will take a case study approach and cover topics like linear regression, logistic regression, decision trees, and model evaluation over 5 weeks.
- Python will be the primary programming language used, along with libraries like Pandas, NumPy, SciPy, and Scikit-Learn.
- Interactive tools like Spyder and Jupyter Notebook will also be utilized.
- The introduction covers what machine learning and data science are, different machine learning tasks like supervised/unsupervised/reinforcement learning, and popular machine learning algorithms and applications.

Uploaded by

Jaurel Kouam

Machine Learning Course

Ing 4 SI, 2022

Pr. Khadija SLIMANI

03/01/2022
What is this course about

 Learn about Data Science


 Learn about machine learning and its applications

 How to build machine learning systems

 How the algorithms behind them work

 How to use those algorithms

Course planning

 A Case study approach:


 Course

 Practical work (case study)

Week 1 Week 2 Week 3 Week 4 Week 5

Assignment

Course overview …..

1. Week 1 : Introduction to Data Science and Machine Learning

2. Week 2: Univariate & Multivariate Linear Regression

3. Week 3: Logistic Regression (Classification)

4. Week 4: Decision Trees (Regression & Classification)

5. Week 5: Model evaluation (overfitting, bias-variance, cross-validation, ...)

Course overview

1. Week 1 : Introduction to Data Science and Machine Learning

1. Introduction to Data Science

2. Introduction to Machine Learning

3. Machine Learning Tools


2. ….

1.1 Introduction to Data Science

The Era of Big Data
 90% of the information ever generated was generated in the last two years?

 This growing torrent of data + growing storage and computation capacity (cloud) ⇒ Big Data Era
What is Data Science ?
 The term goes back a little further than 2004, which is where the Google search term history begins

 Data Science is not limited to tech companies

 Almost every company is turning to data science to better understand how to build products, serve customers and leverage new opportunities

 Data Science is used in multiple disciplines: computer science, behavioural sciences, law & business, etc.

 All of these actors need data-driven methodologies to aid in their discovery:
 From statistical analysis, machine learning, & text mining to information visualization

What is Data Science ?
 Data Science is an umbrella term and it's basically the marriage of many different fields.

What is Data Science ?
 Definition of Data Science according to Drew Conway

1.2 Introduction to Machine Learning

What is Machine Learning ?
 Artificial Intelligence (AI) and Machine Learning (ML) are closely related parts of computer science.

 Both are among the most popular technologies used for creating intelligent systems.

What is Machine Learning ?
 Researchers interested in artificial intelligence wanted to see if computers could learn from data.

 ML is not a new science: many machine learning algorithms have been around for a long time

What is Machine Learning ?
 BUT, it is a science that’s gaining fresh momentum: the ability to automatically apply complex mathematical calculations to big data, over and over, faster and faster, is a recent development

What is Machine Learning ?
 Google trends for the term “Machine Learning”
Definition of Machine Learning
 Machine learning is the subfield of computer science that "gives computers the ability to learn without being explicitly programmed" (Arthur Samuel, 1959)

 A more modern definition by Tom Mitchell: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."

 Example: playing checkers.


 E = the experience of playing many games of checkers
 T = the task of playing checkers.
 P = the probability that the program will win the next game
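
To make Mitchell's definition concrete, here is a minimal sketch on a much simpler task than checkers. Everything in it is an illustrative assumption rather than course material: T is deciding whether an integer from 0 to 99 lies at or above a hidden threshold, E is a growing list of labeled examples, and P is the accuracy over all 100 integers. The function names and the hand-picked example order are hypothetical.

```python
# Hypothetical toy task (an assumption, not taken from the slides):
#   T = decide whether an integer in 0..99 is >= a hidden threshold
#   E = the labeled examples (x, label) seen so far
#   P = accuracy over all integers 0..99

HIDDEN_THRESHOLD = 50  # used only to label data and to score the learner


def true_label(x):
    """Ground truth: 1 if x is at or above the hidden threshold, else 0."""
    return 1 if x >= HIDDEN_THRESHOLD else 0


def learn_threshold(examples):
    """Estimate the threshold as the midpoint between the largest
    negative example and the smallest positive example seen so far."""
    largest_negative = max(x for x, label in examples if label == 0)
    smallest_positive = min(x for x, label in examples if label == 1)
    return (largest_negative + smallest_positive) / 2


def accuracy(threshold_estimate):
    """Performance measure P: fraction of 0..99 classified correctly."""
    correct = sum(
        1
        for x in range(100)
        if (1 if x >= threshold_estimate else 0) == true_label(x)
    )
    return correct / 100


# Experience E arrives in this fixed, hand-picked order.
xs = [0, 70, 20, 90, 40, 56, 49, 50]
experience = [(x, true_label(x)) for x in xs]

# Measure P after 2, 4, 6 and 8 examples.
scores = [accuracy(learn_threshold(experience[:n])) for n in (2, 4, 6, 8)]
print(scores)  # [0.85, 0.95, 0.98, 1.0]
```

With this particular sequence the performance P rises as the experience E grows. A different example order would not necessarily improve at every step, which is why the definition speaks of improvement with experience in general rather than after each single example.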
Old View of Machine Learning
Machine Learning in Intelligent Applications
The pipeline of Machine Learning
Types of Machine Learning
 Machine learning tasks are typically classified into three broad categories, depending on the nature of the learning "signal" or "feedback" available to a learning system
Supervised Learning
 The program is given a data set for which the correct output is already known
 The assumption is that there is a relationship between the input and the output

 The goal is to learn a general rule that maps inputs to outputs.
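
As a minimal sketch of learning a rule that maps inputs to outputs, the snippet below fits a straight line to labeled pairs using the closed-form least-squares solution for one input feature. The data set and the helper name `fit_line` are assumptions made for illustration; in practice a library such as Scikit-Learn would typically be used instead.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = w * x + b for a single input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b


# Hypothetical training set: inputs with their known correct outputs.
inputs = [1.0, 2.0, 3.0, 4.0]
outputs = [3.0, 5.0, 7.0, 9.0]

w, b = fit_line(inputs, outputs)

# The learned rule can now be applied to an input it has never seen.
prediction = w * 5.0 + b
print(w, b, prediction)
```

The outputs here were generated from y = 2x + 1, so the fitted rule recovers a slope close to 2 and an intercept close to 1.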


Unsupervised Learning
 No labels are given to the learning algorithm, leaving it on its own to find structure in its input.

 Unsupervised learning can be a goal in itself (discovering hidden patterns in data) or a means towards an end (feature learning)
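
One way to picture "finding structure without labels" is clustering. The sketch below applies the k-means update rule to one-dimensional points with two clusters. The data, the function name `two_means_1d`, and the choice to start the centers at the minimum and maximum are all assumptions for illustration.

```python
def two_means_1d(points, iterations=10):
    """Group 1-D points into two clusters with the k-means update rule."""
    # Simple initialisation (an assumption): start at the extremes.
    centers = [min(points), max(points)]
    clusters = [[], []]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[], []]
        for p in points:
            nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster,
        # keeping the old center if the cluster is empty.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabeled data with two visible groups.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.0]
centers, clusters = two_means_1d(points)
print(centers)
print(clusters)  # [[1.0, 1.2, 0.8], [9.0, 9.5, 10.0]]
```

No labels were supplied, yet the algorithm separates the small values from the large ones, with centers near 1.0 and 9.5.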
Reinforcement Learning
 A computer program interacts with a dynamic environment in which it must achieve a certain goal, without a teacher explicitly telling it whether it has come close to its goal.
 Learning to drive a car (Google Car)
 Learning to play a game by playing against an opponent (AlphaGo)
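
The driving and game-playing examples are far too large to show here, so the sketch below uses a toy environment instead: a corridor of five cells where the agent starts in cell 0 and is rewarded only when it reaches cell 4. It uses tabular Q-learning with purely random exploration. The environment, the constants, and the function names are assumptions chosen for illustration.

```python
import random

N_STATES = 5      # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = (0, 1)  # 0 = step left, 1 = step right
ALPHA = 0.5       # learning rate
GAMMA = 0.9       # discount factor


def step(state, action):
    """Environment: move one cell; reward 1 only on reaching the goal."""
    if action == 1:
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def train(episodes=200, seed=0):
    """Learn a Q-table from random interaction with the environment."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            action = rng.choice(ACTIONS)  # explore at random
            next_state, reward = step(state, action)
            # Q-learning update: move the estimate towards
            # reward + discounted best value of the next state.
            target = reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


q_table = train()

# Greedy policy: in each non-goal cell, pick the action with the higher value.
policy = [
    max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)
]
print(policy)
```

Nobody tells the agent which direction is correct. It only receives a reward at the goal, and the learned greedy policy still ends up choosing "right" (action 1) in every cell.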
Machine Learning Algorithms
Machine Learning Applications
Machine Learning in this course

 Regression
 Classification (Logistic Regression)
 Classification (Decision Trees)

Machine Learning in this course
Machine Learning
 Supervised learning
 Regression
 Classification: Logistic regression, SVM, Random forest, k-NN
 Unsupervised learning
 Clustering: K-means, Hierarchical, Fuzzy means
 Dimensionality reduction: SVD, PCA, ICA
 Reinforcement learning
 Q-learning
1.3 Machine Learning Tools

Machine Learning Tools
Python
 Python is a high level language

 It is optimized for reading by people instead of machines

 Python is also an interpreted language, which means it is not compiled ahead of time into machine code

 It is commonly used in an interactive fashion

 Java & C: write code, compile and run, and then watch the output

 Python: write and run line by line with the interpreter


Python
 This is very useful for tasks that require a lot of investigation (such as data cleaning), as opposed to those that require a lot of design

 Unlike C++ and Java, Python is a dynamically typed language (like JavaScript): you assign a value to a variable directly, without declaring its type
 This makes it quick to set a variable's type and content
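
A short sketch of what dynamic typing looks like in practice. The variable names are arbitrary and chosen only for illustration.

```python
value = 42                          # no type declaration: value is an int
first_type = type(value).__name__

value = "forty-two"                 # the same name can be rebound to a str
second_type = type(value).__name__

value = [4, 2]                      # and later to a list
third_type = type(value).__name__

print(first_type, second_type, third_type)  # int str list
```

The type belongs to the value rather than to the variable name, which is why the same name can hold an integer, then a string, then a list.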
Why Python for Machine Learning ?
 Python is easy to learn
 Now the language of choice for 8 of 10 top US computer science programs (Philip Guo, CACM)

 Full featured
 Not just a statistics language, but has full capabilities for data acquisition, cleaning, databases, high performance computing, and more

 Strong Data Science Libraries
 The SciPy Ecosystem
Tools to be used in this Course
 Programming language to be used in this course: Python

 Libraries:
 Pandas
 NumPy
 SciPy
 Scikit-Learn

 Interactive tools:
 Spyder: IDE for Python
 Jupyter Notebook: a web application for creating and sharing documents that contain live code, equations, visualizations and explanatory text. Uses include data cleaning and transformation, numerical simulation, statistical modeling, machine learning and much more.
Pandas

 Created in 2008 by Wes McKinney

 Open source (New BSD license)

 100 different contributors

 https://pandas.pydata.org/pandas-docs/stable/
Pandas Series
Pandas DataFrame
Pandas DataFrame
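
The two Pandas structures named above can be sketched as follows. The names, ages and cities are made-up sample data, not taken from the course.

```python
import pandas as pd

# A Series is a one-dimensional labeled array.
ages = pd.Series([25, 32, 47], index=["ana", "ben", "cy"], name="age")

# A DataFrame is a table of labeled columns that share one index.
people = pd.DataFrame(
    {"age": [25, 32, 47], "city": ["Rabat", "Paris", "Rabat"]},
    index=["ana", "ben", "cy"],
)

oldest = ages.idxmax()                          # index label of the largest value
mean_age = people["age"].mean()                 # mean of one column
by_city = people.groupby("city")["age"].mean()  # mean age per city

print(oldest, mean_age)
print(by_city)
```

Selecting a single column of a DataFrame, as in `people["age"]`, gives back a Series, so the two structures are used together constantly.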
Thank you for your attention

Practical work

LAB1: Back in 15min!
