Choosing an
ML technique
to solve your need
RALUCA APOSTOL
Nestor
Co-founder & Chief Product Officer
Passion for data - science, analysis & interpretation
Our chat today :)
• What is Machine Learning?
• What are some popular approaches?
• How to think?
• How to build your model?
• What are the resources?
• Let’s get to action.
Artificial Intelligence
• Anything that is not natural and is created by humans is artificial.
• Intelligence means the ability to understand, reason, plan, adapt, etc.
• So any code, technology or algorithm that enables a machine to mimic,
develop or demonstrate human cognition or behavior is AI.
Applications for AI
AI vs. ML
Connecting the dots …
So what is Machine Learning?
• Machine learning is the field of study
that gives computers the ability to
learn without being explicitly
programmed.
• In simple terms, Machine Learning
means making predictions based on
data.
The ML Mindset
“ML changes the way you think about a problem. The focus shifts
from a mathematical science to a natural science, running
experiments and using statistics, not logic, to analyze its results.”
– Peter Norvig
Google Research Director
F(a,b) = a + b
a = 1; b = 2;
3
F(a,b) = x*a + y*b
a = 1; b = 2;
3
x = ?; y = ?
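The shift in mindset can be sketched in code: instead of writing the rule F(a,b) = a + b by hand, we estimate x and y from observed examples. This is only a minimal sketch. The function name `fit_two_weights` and the observations are made up for illustration, and a least-squares fit is just one of several ways to estimate the weights. Note that the single example on the slide (a = 1, b = 2, result 3) is not enough to pin down x and y; at least two non-proportional inputs are needed.

```python
# Hypothetical illustration: learn x and y in F(a, b) = x*a + y*b from examples.

def fit_two_weights(samples):
    """Least-squares estimate of (x, y) for F(a, b) = x*a + y*b.

    `samples` is a list of (a, b, result) tuples. The 2x2 normal
    equations are solved directly with Cramer's rule.
    """
    saa = sum(a * a for a, b, r in samples)
    sab = sum(a * b for a, b, r in samples)
    sbb = sum(b * b for a, b, r in samples)
    sar = sum(a * r for a, b, r in samples)
    sbr = sum(b * r for a, b, r in samples)
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("inputs are proportional; x and y cannot be identified")
    x = (sar * sbb - sab * sbr) / det
    y = (saa * sbr - sab * sar) / det
    return x, y


# Made-up observations produced by the "unknown" rule a + b.
observations = [(1, 2, 3), (2, 1, 3), (3, 5, 8), (4, 0, 4)]
x, y = fit_two_weights(observations)
print(x, y)  # both weights come out as 1 (up to floating-point rounding)
```

With noisy observations the same function would return the weights that minimize the squared error rather than an exact match.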
The steps
• Machine learning is part art and part science. :)
• There is no one solution or one approach that fits all.
• There are several factors that can affect your decision to choose a
machine learning algorithm.
Know your data
Before you start looking at different ML algorithms, you need to have a
clear picture of your data, your problem and your constraints.
1. Look at summary statistics
• Percentiles can help identify the range for most of the data
• Averages and medians can describe central tendency
• Correlations can indicate strong relationships
2. Visualize the data
• Box plots can identify outliers
• Density plots and histograms show the spread of data
• Scatter plots can describe bivariate relationships
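The summary statistics listed above can be computed with Python's standard library alone. The sketch below is illustrative only: the `ages` and `incomes` values are made-up data, and `summarize` and `pearson` are hypothetical helper names rather than part of any library.

```python
import statistics


def summarize(values):
    """Return central tendency and quartiles for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "q1": q1,
        "q3": q3,
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Made-up data: customer ages and incomes (in thousands).
ages = [22, 25, 27, 30, 31, 35, 41, 48]
incomes = [21, 24, 30, 33, 32, 40, 45, 52]

summary = summarize(ages)
correlation = pearson(ages, incomes)
print(summary)
print(round(correlation, 3))
```

A correlation close to 1 or -1 suggests a strong linear relationship; a scatter plot is still worth drawing, because correlation alone can hide non-linear patterns.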
Since the collected data may be in an undesired format, unorganized,
or extremely large, further steps are needed to enhance its quality. The
three common steps for preprocessing data are formatting, cleaning,
and sampling.
Clean your data
1. How do I deal with missing values?
• Missing data affects some models more than others.
• Even models that handle missing data can be sensitive to it (missing data
for certain variables can result in poor predictions).
2. Does the data need to be aggregated?
3. What do I do with outliers?
• Outliers can be very common in multidimensional data.
• Some models are less sensitive to outliers than others.
Tree models are usually less sensitive to the presence of
outliers. However, regression models, or any model that
fits an equation to the data, can be strongly affected by
outliers.
• Outliers can be the result of bad data collection, or they can
be legitimate extreme values.
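Two of the cleaning questions above can be illustrated with a short sketch. It assumes missing values are represented as `None`, fills them with the median (one common choice among many), and flags outliers with the 1.5 × IQR rule. The helper names and the `readings` data are hypothetical.

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up sensor readings with two missing values and one extreme value.
readings = [10, 12, None, 11, 13, 12, 95, None, 11]

filled = impute_median(readings)
flagged = iqr_outliers(filled)
print(filled)   # missing entries replaced by the median, 12
print(flagged)  # [95]
```

Flagging is deliberately separate from removing: whether 95 is a collection error or a legitimate extreme value has to be decided with knowledge of the data.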
Sampling
Augment your data
1. Feature engineering is the process of going from raw
data to data that is ready for modeling. It can serve
multiple purposes:
• Make the models easier to interpret (e.g. binning)
• Capture more complex relationships (e.g. NNs)
• Reduce data redundancy and dimensionality (e.g. PCA)
• Rescale variables (e.g. standardizing or normalizing)
2. Different models may have different feature engineering
requirements.
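Rescaling and binning, two of the purposes listed above, can be sketched as follows. The bin boundaries in `bin_age` and the sample `ages` are assumptions chosen purely for illustration.

```python
import statistics


def standardize(values):
    """Rescale to zero mean and unit variance (z-scores).

    Assumes the values are not all identical, otherwise the
    standard deviation is zero.
    """
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def min_max_normalize(values):
    """Rescale linearly to the range [0, 1]."""
    lowest, highest = min(values), max(values)
    return [(v - lowest) / (highest - lowest) for v in values]


def bin_age(age):
    """Map an age to a coarse category (hypothetical boundaries)."""
    if age < 18:
        return "minor"
    if age < 65:
        return "adult"
    return "senior"


ages = [15, 22, 37, 64, 70]
z_scores = standardize(ages)
scaled = min_max_normalize(ages)
categories = [bin_age(a) for a in ages]
print(categories)  # ['minor', 'adult', 'adult', 'adult', 'senior']
```

Distance-based and gradient-based models usually benefit from rescaling, while tree models are largely indifferent to it, which is one reason feature engineering requirements differ between models.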
Categorize the problem
1. Categorize by input:
• If you have labelled data, it’s a supervised learning problem.
• If you have unlabelled data and want to find structure, it’s an unsupervised
learning problem.
• If you want to optimize an objective function by interacting with an environment,
it’s a reinforcement learning problem.
2. Categorize by output
• If the output of your model is a number, it’s a regression problem.
• If the output of your model is a class, it’s a classification problem.
• If the output of your model is a set of input groups, it’s a clustering problem.
• Do you want to detect an anomaly? That’s anomaly detection.
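The two categorization rules above amount to a small decision procedure. The sketch below simply encodes them; the function names and the string keys (`"number"`, `"class"`, `"groups"`, `"anomaly"`) are hypothetical labels invented for this example.

```python
def categorize_by_input(has_labels, interacts_with_environment=False):
    """Return the learning paradigm suggested by the kind of input available."""
    if interacts_with_environment:
        return "reinforcement learning"
    return "supervised learning" if has_labels else "unsupervised learning"


def categorize_by_output(output_kind):
    """Return the task type suggested by the desired model output."""
    tasks = {
        "number": "regression",
        "class": "classification",
        "groups": "clustering",
        "anomaly": "anomaly detection",
    }
    return tasks[output_kind]


# Example: labelled historical sales, and we want next month's sales figure.
print(categorize_by_input(has_labels=True))  # supervised learning
print(categorize_by_output("number"))        # regression
```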
Understand your constraints
• What is your data storage capacity? Depending on the storage capacity of your
system, you might not be able to store gigabytes of classification/regression
models or gigabytes of data to cluster. This is the case, for instance, for
embedded systems.
• Does the prediction have to be fast? In real time applications, it is obviously very
important to have a prediction as fast as possible. For instance, in autonomous
driving, it’s important that the classification of road signs be as fast as possible to
avoid accidents.
• Does the learning have to be fast? In some circumstances, training models quickly
is necessary: sometimes, you need to rapidly update, on the fly, your model with a
different dataset.
Choose the algorithm
Identify the algorithms that are applicable and practical to implement using the tools
at your disposal.
Some of the factors affecting the choice of a model are:
• Whether the model meets the business goals
• How much preprocessing the model needs
• How accurate the model is
• How explainable the model is
• How fast the model is: how long it takes to build the model, and how long
the model takes to make predictions
• How scalable the model is
Watch out for complexity
Making the same algorithm more complex increases the chance of overfitting.
A model is more complex when:
• It relies on more features to learn and predict (e.g. using two features vs ten
features to predict a target)
• It relies on more complex feature engineering (e.g. using polynomial terms,
interactions, or principal components)
• It has more computational overhead (e.g. a single decision tree vs. a random
forest of 100 trees).
ML Algorithms
Linear Regression
Regression algorithms can be used, for example, when you
want to predict a continuous value, as opposed to
classification, where the output is categorical.
• Time to travel from one location to another
• Predicting sales of a particular product next month
• Impact of blood alcohol content on coordination
• Predicting monthly gift card sales and improving yearly
revenue projections
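As a minimal sketch of the first use case, the code below fits a straight line to made-up (distance, travel time) pairs using the closed-form ordinary least squares solution for a single feature. The data and the `fit_line` helper are hypothetical; in practice a library such as scikit-learn would usually be used instead.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: distance in km and travel time in minutes.
distances = [1, 2, 3, 4, 5]
minutes = [7, 9, 11, 13, 15]

slope, intercept = fit_line(distances, minutes)
predicted = slope * 6 + intercept  # estimated time for a 6 km trip
print(slope, intercept, predicted)  # 2.0 5.0 17.0
```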
Logistic Regression
Logistic regression performs binary classification, so the
label outputs are binary. It takes a linear combination of
features and applies a non-linear function (the sigmoid) to it, so
it can be seen as a very small neural network.
• Predicting customer churn
• Credit scoring & fraud detection
• Measuring the effectiveness of marketing
campaigns
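The "linear combination plus sigmoid" description can be made concrete with a small sketch trained by plain batch gradient descent. Everything here is an assumption made for illustration: the single feature (months of inactivity), the churn labels, the learning rate and the number of epochs.

```python
import math


def sigmoid(z):
    """Squash any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features):
    """Probability of the positive class for one row of features."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return sigmoid(z)


def train_logistic(rows, labels, learning_rate=0.3, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    m = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for features, label in zip(rows, labels):
            error = predict_proba(weights, bias, features) - label
            for j, f in enumerate(features):
                grad_w[j] += error * f
            grad_b += error
        weights = [w - learning_rate * g / m for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / m
    return weights, bias


# Made-up data: months of inactivity -> churned (1) or stayed (0).
rows = [[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]]
labels = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic(rows, labels)
print(predict_proba(weights, bias, [0.5]))  # below 0.5: predicted to stay
print(predict_proba(weights, bias, [4.0]))  # above 0.5: predicted to churn
```

The model outputs a probability; turning it into a class requires a threshold, commonly 0.5, which can be moved when false positives and false negatives have different costs.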
Decision trees
Single trees are used very rarely, but in composition with
many others they build very efficient algorithms such as
Random Forest or Gradient Tree Boosting.
• Investment decisions
• Customer churn
• Bank loan defaulters
• Build vs Buy decisions
• Sales lead qualifications
K-means
Sometimes you don’t have any labels and your goal is to
assign labels according to the features of the objects. This is
called a clustering task.
If your problem statement contains questions such as how
something is organized, how to group items, or how to
focus on particular groups, then you should go with
clustering.
• When there is a large group of users and you want to
divide them into particular groups based on some
common attributes.
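A compact sketch of the k-means procedure (Lloyd's algorithm) is shown below. To keep the result reproducible it assumes the first k points are used as the initial centroids, whereas real implementations normally use random or k-means++ initialization. The `users` data is made up.

```python
def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    labels = []
    for p in points:
        distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        labels.append(distances.index(min(distances)))
    return labels


def kmeans(points, k, max_iterations=100):
    """Alternate between assigning points and recomputing centroids."""
    centroids = [tuple(p) for p in points[:k]]  # simplistic deterministic init
    labels = assign(points, centroids)
    for _ in range(max_iterations):
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        centroids = new_centroids
        new_labels = assign(points, centroids)
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return centroids, labels


# Made-up users described by (visits per week, purchases per month).
users = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, labels = kmeans(users, k=2)
print(labels)  # [0, 1, 0, 0, 1, 1]
```

The number of clusters k has to be chosen in advance, and the result can depend on the initial centroids, so it is common to run the algorithm several times with different starting points.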
Principal component analysis
Principal component analysis provides dimensionality
reduction. Sometimes you have a wide range of features,
probably highly correlated with each other, and
models can easily overfit on such high-dimensional data. Then,
you can apply PCA.
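To show what the reduction does, the sketch below finds only the first principal component of two correlated features, using power iteration on the covariance matrix. This is a simplification for illustration: library implementations typically use an SVD and return several components. The data and function name are made up.

```python
def first_principal_component(rows, iterations=200):
    """Return (direction, projections) for the first principal component.

    Assumes the data is not constant, so the covariance matrix is non-zero.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered data.
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    direction = [1.0] * d
    for _ in range(iterations):
        product = [sum(cov[i][j] * direction[j] for j in range(d)) for i in range(d)]
        norm = sum(v * v for v in product) ** 0.5
        direction = [v / norm for v in product]
    projections = [sum(c[j] * direction[j] for j in range(d)) for c in centered]
    return direction, projections


# Made-up data: the second feature is roughly twice the first.
rows = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8), (5.0, 10.1)]
direction, projections = first_principal_component(rows)
print(direction)    # roughly proportional to (1, 2)
print(projections)  # one number per row instead of two
```

Each row is now described by a single coordinate along the direction of greatest variance, which is the sense in which PCA reduces dimensionality.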
Support Vector Machines
Support Vector Machine (SVM) is a supervised
machine learning technique that is widely used in
pattern recognition and classification problems,
most directly when your data has exactly two classes
(multi-class problems are typically handled by combining
several binary SVMs).
• Detecting persons with common diseases
such as diabetes
• Handwritten character recognition
• Text categorization (news articles by topic)
• Stock market price prediction
Neural networks
Neural networks learn the weights of the connections
between neurons.
Extremely complex models can be trained, and they
can be used as a kind of black box, without carrying
out elaborate feature engineering
before training the model.
Object recognition has recently been enormously
improved by deep neural networks. Applied to
unsupervised learning tasks, such as feature
extraction, deep learning also extracts features from
raw images or speech with much less human
intervention.
Training & Evaluation
Parameter Tuning
Prediction
What do you need to know?
It is mandatory to learn a programming language, preferably Python, along with the
required analytical and mathematical knowledge. You can touch on:
• Linear algebra for data analysis: Scalars, Vectors, Matrices, and Tensors
• Mathematical Analysis: Derivatives and Gradients
• Probability theory and statistics
• Multivariate Calculus
• Algorithms and Complex Optimizations
Programming Languages
Python is hands down the best programming language for
Machine Learning applications.
Other programming languages that can be used for Machine
Learning Applications are R, C++, JavaScript, Java, C#, Julia,
Shell, TypeScript, and Scala.
Cloud Support
Test
Interactive:
• http://www.r2d3.us/visual-intro-to-machine-learning-part-1/
Test with Amazon SageMaker steps:
1. Create an AWS account
2. Launch a SageMaker Instance
3. Create an S3 bucket
4. Launch the Jupyter Notebook
5. Explore examples & tutorials
Libraries
• https://towardsdatascience.com/best-python-libraries-for-machine-learning-and-deep-learning-b0bd40c7e8c (Python)
• https://blog.bitsrc.io/11-javascript-machine-learning-libraries-to-use-in-your-app-c49772cca46c (JavaScript)
• https://www.baeldung.com/java-ai (Java)
• https://www.geeksforgeeks.org/machine-learning-in-c/ (C++)
Datasets
• https://towardsai.net/p/machine-learning/best-free-datasets-for-machine-learning-and-data-science/stanfordai/3451/
• https://towardsdatascience.com/free-data-sets-for-machine-learning-73e74554cc21
• https://www.ubuntupit.com/best-machine-learning-datasets-for-practicing-applied-ml/
Nestor
Co-founder & Chief Product Officer
Raluca Apostol
raluca@nestorup.com

Choosing a Machine Learning technique to solve your need

  • 1. Choosing a ML technique to solve your need RALUCA APOSTOL
  • 2. Nestor Co-founder & Chief Product Officer
  • 3. Passion for data - science, analysis & interpretation
  • 4. Our chat today :) • What is Machine Learning? • What are some popular approaches? • How to think? • How to build your model? • What are the resources? • Let’s get to action.
  • 5. Artificial Intelligence • Anything which is not natural and is created by humans is artificial. • Intelligence means the ability to understand, reason, plan, adapt, etc. • So any code, tech or algorithm that enables a machine to mimic, develop or demonstrate human cognition or behavior is AI.
  • 11. So what is Machine Learning? • Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed. • In simple terms, Machine Learning means making predictions based on data.
  • 12. The ML Mindset “ML changes the way you think about a problem. The focus shifts from a mathematical science to a natural science, running experiments and using statistics, not logic, to analyze its results.” – Peter Norvig, Google Research Director
  • 14. The ML Mindset • F(a,b) = a + b; a = 1; b = 2; result: 3
  • 15. The ML Mindset • F(a,b) = a + b; a = 1; b = 2; result: 3 • F(a,b) = x*a + y*b; a = 1; b = 2; result: 3; x = ?; y = ?
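The shift on slide 15, from writing F by hand to finding x and y from observed results, can be sketched in a few lines of Python. This is only an illustrative sketch: the training examples, the learning rate and the number of epochs are assumptions made for the example, not values from the talk, and plain gradient descent is just one of several ways to find x and y.

```python
# Hypothetical training examples ((a, b), result), all produced by F(a, b) = a + b.
examples = [((1.0, 2.0), 3.0), ((2.0, 1.0), 3.0), ((3.0, 5.0), 8.0), ((4.0, 0.0), 4.0)]


def predict(x, y, a, b):
    """The parametrised function from the slide: F(a, b) = x*a + y*b."""
    return x * a + y * b


def fit(examples, learning_rate=0.01, epochs=5000):
    """Find x and y by gradient descent on the mean squared error."""
    x, y = 0.0, 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_x = grad_y = 0.0
        for (a, b), target in examples:
            error = predict(x, y, a, b) - target
            grad_x += 2 * error * a / n
            grad_y += 2 * error * b / n
        x -= learning_rate * grad_x
        y -= learning_rate * grad_y
    return x, y


x, y = fit(examples)
print(f"x = {x:.3f}, y = {y:.3f}")
```

With these examples the learned values approach x = 1 and y = 1, which recovers the original F(a,b) = a + b from data alone.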
  • 16. The steps • Machine learning is part art and part science. :) • There is no one solution or one approach that fits all. • There are several factors that can affect your decision to choose a machine learning algorithm.
  • 18. Know your data Before you start looking at different ML algorithms, you need to have a clear picture of your data, your problem and your constraints.
  • 20. Know your data 1. Look at summary statistics • Percentiles can help identify the range for most of the data • Averages and medians can describe central tendency • Correlations can indicate strong relationships 2. Visualize the data • Box plots can identify outliers • Density plots and histograms show the spread of data • Scatter plots can describe bivariate relationships
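As a rough illustration of the summary statistics on slide 20, the sketch below computes quartiles, mean, median and a Pearson correlation with Python's standard library only. The two columns (house sizes and prices) are made-up numbers used purely for the example, and `pearson` is a helper written here, not a library function.

```python
import statistics

# Hypothetical sample: house sizes (m^2) and prices (thousand EUR).
sizes = [50, 60, 70, 80, 90, 100, 110, 120]
prices = [100, 118, 142, 160, 178, 205, 219, 240]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long columns."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


summary = {
    "mean": statistics.fmean(prices),                # central tendency
    "median": statistics.median(prices),             # central tendency, robust to outliers
    "quartiles": statistics.quantiles(prices, n=4),  # 25th, 50th and 75th percentiles
    "correlation": pearson(sizes, prices),           # strength of the linear relationship
}
print(summary)
```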
  • 21. Clean your data Since the collected data may be in an undesired format, unorganized, or extremely large, further steps are needed to enhance its quality. The three common steps for preprocessing data are formatting, cleaning, and sampling.
  • 22. Clean your data 1. How do I deal with missing values? • Missing data affects some models more than others. • Even models that handle missing data can be sensitive to it (missing data for certain variables can result in poor predictions). 2. Does the data need to be aggregated?
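One common answer to the missing-value question on slide 22 is imputation. The sketch below fills gaps with the median of the observed values. It is a minimal example under assumptions: the `ages` column is invented, `impute_median` is a hypothetical helper, and median imputation is only one option (dropping rows or using a model that handles missing values are others).

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = statistics.median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical column with two missing entries.
ages = [34, None, 29, 41, None, 38]
filled = impute_median(ages)
print(filled)
```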
  • 25. Clean your data 3. What do I do with outliers? • Outliers can be very common in multidimensional data. • Some models are less sensitive to outliers than others. Tree models are usually less sensitive to the presence of outliers, whereas regression models, or any model that fits an equation to the data, can be strongly affected by them. • Outliers can be the result of bad data collection, or they can be legitimate extreme values.
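A simple way to spot candidate outliers before deciding what to do with them is the interquartile-range (IQR) rule, the same rule a box plot uses for its whiskers. The sketch below is an assumption-laden example: the `readings` are invented, `iqr_outliers` is a hypothetical helper, and the factor 1.5 is a convention rather than something prescribed by the talk.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    # statistics.quantiles uses the "exclusive" method by default.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical sensor readings with one suspicious value.
readings = [10, 12, 11, 13, 12, 11, 10, 95]
outliers = iqr_outliers(readings)
print(outliers)
```

Whether a flagged value is bad data or a legitimate extreme still has to be decided by looking at how it was collected.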
  • 27. Augment your data 1. Feature engineering is the process of going from raw data to data that is ready for modeling. It can serve multiple purposes: • Make the models easier to interpret (e.g. binning) • Capture more complex relationships (e.g. NNs) • Reduce data redundancy and dimensionality (e.g. PCA) • Rescale variables (e.g. standardizing or normalizing) 2. Different models may have different feature engineering requirements.
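The rescaling item on slide 27 can be illustrated with the two transformations it names. This is a minimal sketch with an invented `incomes` column. Both helper functions are hypothetical and assume the column is not constant (a constant column would cause a division by zero).

```python
import statistics


def standardize(values):
    """Rescale to zero mean and unit variance (z-scores)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumes the column is not constant
    return [(v - mean) / std for v in values]


def min_max_normalize(values):
    """Rescale linearly to the [0, 1] range."""
    low, high = min(values), max(values)  # assumes low != high
    return [(v - low) / (high - low) for v in values]


# Hypothetical yearly incomes.
incomes = [20_000, 35_000, 50_000, 65_000, 80_000]
z_scores = standardize(incomes)
scaled = min_max_normalize(incomes)
print(z_scores)
print(scaled)
```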
  • 28. Categorize the problem 1. Categorize by input: • If you have labelled data, it’s a supervised learning problem. • If you have unlabelled data and want to find structure, it’s an unsupervised learning problem. • If you want to optimize an objective function by interacting with an environment, it’s a reinforcement learning problem. 2. Categorize by output: • If the output of your model is a number, it’s a regression problem. • If the output of your model is a class, it’s a classification problem. • If the output of your model is a set of input groups, it’s a clustering problem. • Do you want to detect an anomaly? That’s anomaly detection.
  • 30. Understand your constraints • What is your data storage capacity? Depending on the storage capacity of your system, you might not be able to store gigabytes of classification/regression models or gigabytes of data to cluster. This is the case, for instance, for embedded systems. • Does the prediction have to be fast? In real-time applications, it is very important to have a prediction as fast as possible. For instance, in autonomous driving, road signs must be classified as fast as possible to avoid accidents. • Does the learning have to be fast? In some circumstances, training models quickly is necessary: sometimes you need to update your model rapidly, on the fly, with a different dataset.
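The two questions on slide 28 amount to a small lookup, which can be written down as code. The sketch below is only a restatement of the slide: the function names and the string keys ("number", "class", "groups", "anomaly") are hypothetical labels chosen for the example.

```python
# Mapping from the kind of output you want to the task name (from the slide).
TASK_BY_OUTPUT = {
    "number": "regression",
    "class": "classification",
    "groups": "clustering",
    "anomaly": "anomaly detection",
}


def learning_type(has_labels, interacts_with_environment=False):
    """Categorize by input: labelled, unlabelled, or learning from an environment."""
    if interacts_with_environment:
        return "reinforcement learning"
    return "supervised learning" if has_labels else "unsupervised learning"


def categorize(has_labels, output_kind, interacts_with_environment=False):
    """Return (learning type, task) for a problem description."""
    return learning_type(has_labels, interacts_with_environment), TASK_BY_OUTPUT[output_kind]


print(categorize(has_labels=True, output_kind="number"))
print(categorize(has_labels=False, output_kind="groups"))
```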
  • 31. Choose the algorithm Identify the algorithms that are applicable and practical to implement using the tools at your disposal. Some of the factors affecting the choice of a model are: • Whether the model meets the business goals • How much preprocessing the model needs • How accurate the model is • How explainable the model is • How fast the model is: How long does it take to build a model, and how long does the model take to make predictions. • How scalable the model is
  • 32. Take care with complexity Making the same algorithm more complex increases the chance of overfitting. A model is more complex when: • It relies on more features to learn and predict (e.g. using two features vs ten features to predict a target) • It relies on more complex feature engineering (e.g. using polynomial terms, interactions, or principal components) • It has more computational overhead (e.g. a single decision tree vs. a random forest of 100 trees).
  • 35. Linear Regression Regression algorithms can be used, for example, when you want to compute a continuous value, as opposed to classification, where the output is categorical. • Time to go from one location to another • Predicting sales of a particular product next month • Impact of blood alcohol content on coordination • Predicting monthly gift card sales and improving yearly revenue projections
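For a single feature, linear regression has a closed-form least-squares solution, which the sketch below implements for the travel-time use case from the slide. The distances and times are invented and deliberately lie exactly on a line. `fit_line` is a hypothetical helper, not a library call.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: distance (km) and travel time (minutes).
distances = [1, 2, 3, 4, 5]
minutes = [7, 9, 11, 13, 15]

slope, intercept = fit_line(distances, minutes)
predicted = slope * 10 + intercept  # estimated time for a 10 km trip
print(slope, intercept, predicted)
```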
  • 36. Logistic Regression Logistic regression performs binary classification, so the label outputs are binary. It takes a linear combination of features and applies a non-linear function (sigmoid) to it, so it can be seen as a very small neural network. • Predicting customer churn • Credit scoring & fraud detection • Measuring the effectiveness of marketing campaigns
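The description on slide 36 (a linear combination of features passed through a sigmoid) translates almost directly into code. The sketch below trains such a model with plain gradient descent on an invented churn-style dataset. The feature meanings, learning rate and epoch count are assumptions made for the example, and real projects would normally use a library implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, row):
    """Probability of the positive class: sigmoid of a linear combination."""
    return sigmoid(sum(w * x for w, x in zip(weights, row)) + bias)


def train(features, labels, learning_rate=0.1, epochs=5000):
    """Fit weights and bias by gradient descent on the mean log loss."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for row, label in zip(features, labels):
            error = predict_proba(weights, bias, row) - label
            for j, x in enumerate(row):
                grad_w[j] += error * x / n
            grad_b += error / n
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


# Hypothetical customers: [months inactive, support tickets]; label 1 = churned.
features = [[0, 0], [1, 0], [0, 1], [1, 1], [4, 3], [5, 2], [3, 4], [5, 5]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = train(features, labels)
print(predict_proba(weights, bias, [5, 4]))  # high churn probability expected
print(predict_proba(weights, bias, [0, 1]))  # low churn probability expected
```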
  • 37. Decision trees Single trees are used very rarely, but in composition with many others they build very efficient algorithms such as Random Forest or Gradient Tree Boosting. • Investment decisions • Customer churn • Bank loan defaulters • Build vs. buy decisions • Sales lead qualification
  • 38. K-means Sometimes you don’t know any labels and your goal is to assign labels according to the features of objects. This is called a clustering task. If your problem statement asks how something is organized, how to group items, or how to focus on particular groups, then clustering is a good fit. • When there is a large group of users and you want to divide them into particular groups based on some common attributes.
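K-means alternates two steps: assign every point to its nearest centroid, then move each centroid to the mean of its points. The sketch below is a minimal version of that loop. The points and the choice of initial centroids are assumptions for the example, and production implementations pick the starting centroids more carefully.

```python
import math


def k_means(points, centroids, iterations=10):
    """Basic k-means: returns the final centroids and the points in each cluster."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical users described by two attributes (e.g. visits per week, items bought).
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
centroids, clusters = k_means(points, centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```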
  • 39. Principal component analysis Principal component analysis provides dimensionality reduction. Sometimes you have a wide range of features, probably highly correlated with each other, and models can easily overfit on such a large number of features. In that case, you can apply PCA.
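To make the idea concrete, the sketch below reduces two highly correlated features to one by projecting onto the first principal component. It is a simplified sketch under assumptions: it handles exactly two features, uses power iteration on the 2x2 covariance matrix instead of a full eigendecomposition, and the data points are invented.

```python
def first_principal_component(points, iterations=100):
    """Return the first principal direction of 2-D data and the 1-D projections."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 sample covariance matrix.
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)

    # Power iteration: repeatedly apply the matrix and renormalize.
    vx, vy = 1.0, 0.0
    for _ in range(iterations):
        nx, ny = sxx * vx + sxy * vy, sxy * vx + syy * vy
        norm = (nx * nx + ny * ny) ** 0.5
        vx, vy = nx / norm, ny / norm

    projected = [x * vx + y * vy for x, y in centered]
    return (vx, vy), projected


# Hypothetical data: the second feature is roughly twice the first.
points = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]
direction, projected = first_principal_component(points)
print(direction)
print(projected)  # one number per point instead of two
```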
  • 40. Support Vector Machines Support Vector Machine (SVM) is a supervised machine learning technique that is widely used in pattern recognition and classification problems. It is most naturally applied when your data has exactly two classes; problems with more classes are usually handled by combining several binary classifiers. • Detecting persons with common diseases such as diabetes • Hand-written character recognition • Text categorization (news articles by topic) • Stock market price prediction
  • 41. Neural networks Neural networks learn the weights of the connections between neurons. Extremely complex models can be trained and used as a kind of black box, without performing complex feature engineering before training the model. Object recognition has recently been greatly improved using deep neural networks. Applied to unsupervised learning tasks, such as feature extraction, deep learning also extracts features from raw images or speech with much less human intervention.
  • 46. What do you need to know? It is mandatory to learn a programming language, preferably Python, along with the required analytical and mathematical knowledge. You can touch on: • Linear algebra for data analysis: Scalars, Vectors, Matrices, and Tensors • Mathematical Analysis: Derivatives and Gradients • Probability theory and statistics • Multivariate Calculus • Algorithms and Complex Optimizations
  • 47. Programming Languages Python is hands down the best programming language for Machine Learning applications. Other programming languages that can be used for Machine Learning applications are R, C++, JavaScript, Java, C#, Julia, Shell, TypeScript, and Scala.
  • 49. Test Interactive: • http://www.r2d3.us/visual-intro-to-machine-learning-part-1/ Test with Amazon SageMaker steps: 1. Create an AWS account 2. Launch a SageMaker instance 3. Create an S3 bucket 4. Launch the Jupyter Notebook 5. Explore examples & tutorials
  • 50. Libraries • https://towardsdatascience.com/best-python-libraries-for-machine-learning-and-deep-learning-b0bd40c7e8c (Python) • https://blog.bitsrc.io/11-javascript-machine-learning-libraries-to-use-in-your-app-c49772cca46c (JavaScript) • https://www.baeldung.com/java-ai (Java) • https://www.geeksforgeeks.org/machine-learning-in-c/ (C++)
  • 52. Nestor Co-founder & Chief Product Officer Raluca Apostol raluca@nestorup.com