Machine Learning Notes

This document provides an overview of machine learning lifecycles and basic terminology. It discusses the 11 steps of a typical machine learning lifecycle including problem definition, data selection, modeling, evaluation and deployment. It also defines common terms like features, datasets, dependent and independent variables. Additionally, it covers topics like data preprocessing, transformation, univariate and multivariate analysis, and model selection.

Uploaded by

Nikhita Nair

Machine Learning Lifecycle:
1. Problem Definition: Define the project and business requirements, along with the data requirements and modules.

2. Data Selection: Collect and prepare all of the relevant data from the dataset used in machine learning.

3. Descriptive Statistics: Descriptive statistics describe or summarize the characteristics of a sample or dataset.

4. Exploratory Data Analysis: Analyze the data to find hidden patterns in the dataset.

5. Data Preprocessing: Data cleaning, imputing (filling in missing data), and getting more useful and relevant data.

6. Data Transformation: Transform the relevant data into an appropriate form using encoding techniques (e.g. one-hot encoding), scaling, etc.

7. Feature Selection: Select the useful and informative features (attributes) and eliminate irrelevant ones, so that only the best subset of features is used.

8. Model Selection: Select the model based on the variables, i.e. choose the right algorithm.

9. Model Training: Apply the 80-20 rule (80% training data, 20% test data) and work on maximizing accuracy during the training stage.

10. Model Evaluation: Model evaluation aims to estimate the generalization accuracy of a model on future (unseen/out-of-sample) data.

11. Model Deployment: Deployment is the process of taking a trained ML model and making its predictions available to users or other systems.
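The 80-20 rule from step 9 can be sketched in a few lines. This is only an illustration using the Python standard library; the function name `train_test_split`, its parameters, and the sample rows are assumptions made for this example, not something the notes prescribe.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(10))           # hypothetical dataset of 10 data points
train, test = train_test_split(rows)
print(len(train), len(test))     # 8 training rows, 2 test rows
```

Shuffling before splitting avoids a test set that only contains the last rows of an ordered dataset.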

Basic Terminologies:
Feature matrix/Data matrix:

Matrix of all features

Features/Attributes:

Columns in a dataset

Data points (N-dimensional arrays):

Rows in a dataset

Dataset:

Set of data used for training a model

Dependent/Output (y-axis) variable:

Variable that is output or predicted by a model

Independent/Input (x-axis) variable:

Variable that is used as input to a model

Target:

The variable to be predicted (the dependent variable)

Types of Data:
Continuous variables: always numeric, can take any of infinitely many values within a range, e.g. height, score

Discrete variables: numeric or categorical, countable and finite, e.g. number of fruits, gender, pincode, etc.

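VLOOKUP() in Excel:
VLOOKUP() merges tables together by fetching matching data from another table.

VLOOKUP(search criterion; array; index; sort)

search criterion: the value to look up in the first column of the array
array: the range that holds the lookup table
index: the number of the column in the array whose value is returned
sort: TRUE if the first column is sorted in ascending order and an approximate match is acceptable, FALSE for an exact match

eg: VLOOKUP(State_ID; userState.A2:An; index; sort)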
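Outside a spreadsheet, the same exact-match lookup can be sketched with a dictionary. The tables, names, and the `vlookup` helper below are hypothetical and only illustrate the idea of fetching a column from a second table by key.

```python
# Hypothetical tables: a user list and a State_ID -> state name lookup table.
users = [
    {"name": "Asha", "State_ID": 1},
    {"name": "Ravi", "State_ID": 2},
    {"name": "Meera", "State_ID": 3},
]
state_names = {1: "Kerala", 2: "Goa"}

def vlookup(key, table, default=None):
    """Exact-match lookup: return table[key], or default when the key is missing."""
    return table.get(key, default)

# Merge the two tables by adding the looked-up state to every user row.
merged = [{**u, "state": vlookup(u["State_ID"], state_names)} for u in users]
for row in merged:
    print(row)
```

A key with no match (State_ID 3 here) yields `None`, which plays the role of the #N/A result in a spreadsheet.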

Types of Data Analysis:

UNIVARIATE ANALYSIS:

only using one feature

BIVARIATE ANALYSIS:

numeric vs numeric

categoric vs categoric

numeric vs categoric

MULTIVARIATE ANALYSIS:

using multiple features for doing analysis

min(): returns the minimum value in a particular dataset

An outlier is any data point that falls outside the expected range of your dataset. Anything below the lower limit or above the upper limit is an outlier.

Upper limit = Q3 + 1.5 × IQR
Lower limit = Q1 - 1.5 × IQR
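As a rough sketch, the two limits can be computed with the standard library. The sample data and the helper name `iqr_limits` are made up for illustration, and the "inclusive" quantile method is an assumption chosen because it interpolates the way a spreadsheet QUARTILE() function typically does.

```python
import statistics

def iqr_limits(values):
    """Return (lower, upper) outlier limits: Q1 - 1.5*IQR and Q3 + 1.5*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [10, 12, 13, 14, 15, 16, 18, 60]   # hypothetical column with one extreme value
lower, upper = iqr_limits(data)
outliers = [x for x in data if x < lower or x > upper]
print(lower, upper, outliers)
```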

avg() is used to calculate the mean

median() is used to calculate the median

Coefficient of dispersion based on range: (max - min)/(max + min)

Coefficient of dispersion based on mean deviation: mean deviation/mean

Coefficient of dispersion based on quartile deviation: (Q3 - Q1)/(Q3 + Q1)
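The three coefficients can be worked through on a small sample. This is a sketch only: the function name and data are hypothetical, and the quartiles use the "inclusive" method as an assumption.

```python
import statistics

def dispersion_coefficients(values):
    """Return the range-, mean-deviation- and quartile-based coefficients."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    mean = statistics.fmean(values)
    # Mean deviation: average absolute distance from the mean.
    mean_dev = statistics.fmean([abs(x - mean) for x in values])
    return {
        "range": (max(values) - min(values)) / (max(values) + min(values)),
        "mean_deviation": mean_dev / mean,
        "quartile": (q3 - q1) / (q3 + q1),
    }

coeffs = dispersion_coefficients([2, 4, 6, 8, 10])
print(coeffs)
```

For this sample the mean is 6 and the mean deviation is 2.4, so the mean-deviation coefficient is 0.4.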

Quartiles divide the data into 4 parts:
Q2 = median

Q1 = 25%, Q2 = 50%, Q3 = 75%, Q4 = 100%

QUARTILE(): returns the requested quartile of the data
IQR (interquartile range)

IQR = Q3 - Q1

Quartile deviation = IQR/2

Frequency table
- Divide the data into particular ranges (classes)

- FREQUENCY(data; classes)

- Returns an array of counts
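A frequency count of this kind can be sketched as follows. The helper assumes, as spreadsheet FREQUENCY() functions generally do, that each class value is an upper bound and that one extra bucket collects everything above the last bound; the data is hypothetical.

```python
from bisect import bisect_left

def frequency(data, classes):
    """Count values per class; classes are ascending, inclusive upper bounds."""
    counts = [0] * (len(classes) + 1)   # last slot holds values above every bound
    for x in data:
        # bisect_left finds the first bound that is >= x.
        counts[bisect_left(classes, x)] += 1
    return counts

marks = [5, 12, 20, 25, 31, 40, 47]     # hypothetical scores
counts = frequency(marks, [10, 20, 30, 40])
print(counts)
```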

Pivot table: used for univariate analysis of categorical data

Pie chart: used when the categories together make up 100% of the data

Bivariate Numeric vs Numeric

Correlation measures how two variables are related

Correlation ranges from -1 to 1

1 = the two variables are highly positively correlated

-1 = highly negatively correlated (inversely)

0 = no correlation
R-square is the square of the correlation

The trendline is the line of best fit

f(x) is the equation of that line (y = mx + c) in the graph
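To make these quantities concrete, here is a small sketch that computes the Pearson correlation, R-square, and a least-squares trendline by hand. The function names and the sample points are assumptions made for this example.

```python
import statistics

def pearson_r(xs, ys):
    """Pearson correlation coefficient, between -1 and 1."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def trendline(xs, ys):
    """Least-squares line of best fit; returns slope m and intercept c."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    m = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return m, my - m * mx

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]        # hypothetical points lying exactly on y = 2x + 1
r = pearson_r(xs, ys)
m, c = trendline(xs, ys)
print(r, r ** 2, m, c)
```

Because the sample points lie exactly on a rising line, the correlation and R-square both come out as 1.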

Bivariate categorical vs categorical

eg: gender and state

Bivariate numeric vs categorical

eg: weight and gender

Multivariate: analysis on multiple variables

eg: the average height and weight for each state and each gender
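The multivariate example can be sketched as a group-by over two categorical columns, much like a pivot table. The records and the helper `group_means` are hypothetical.

```python
from collections import defaultdict
from statistics import fmean

# Hypothetical records with two categorical and two numeric features.
people = [
    {"state": "Kerala", "gender": "F", "height": 160, "weight": 55},
    {"state": "Kerala", "gender": "F", "height": 164, "weight": 59},
    {"state": "Kerala", "gender": "M", "height": 172, "weight": 70},
    {"state": "Goa", "gender": "M", "height": 168, "weight": 66},
]

def group_means(rows, keys, value):
    """Average one numeric column for every combination of the key columns."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[k] for k in keys)].append(row[value])
    return {group: fmean(vals) for group, vals in groups.items()}

avg_height = group_means(people, ["state", "gender"], "height")
avg_weight = group_means(people, ["state", "gender"], "weight")
print(avg_height)
print(avg_weight)
```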

CONCATENATE(col1;" ";col2;...;coln): concatenates columns, e.g. names made up of more than one word

Removing inconsistencies from tables: PROPER(TRIM()) converts text to proper case and removes extra spaces

UPPER() converts text to uppercase and LOWER() to lowercase

Combine TRIM() with other functions to remove extra spaces

Removing duplicates: use advanced filters with the no-duplication option checked
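The same cleaning steps can be sketched in Python. The sample names and the helper `clean_name` are made up for illustration; `str.title()` is used here as a stand-in for PROPER().

```python
def clean_name(text):
    """TRIM then PROPER: collapse extra spaces, then convert to title case."""
    return " ".join(text.split()).title()

raw = ["  asha   NAIR ", "Asha Nair", "ravi  kumar"]   # hypothetical messy column
cleaned = [clean_name(name) for name in raw]

# Remove duplicates while keeping the original order.
unique = list(dict.fromkeys(cleaned))

# CONCATENATE two columns with a space in between.
full_name = " ".join(["Asha", "Nair"])

print(cleaned, unique, full_name)
```

Cleaning before removing duplicates matters: the first two entries only become identical once spacing and case are made consistent.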

Imputation: filling in missing data using the average (mean), median, or mode of the column;
if 70 to 80% of a column is NA, filling it in is unreliable, so do not use that column for the model
Outliers: anything below the lower limit or above the upper limit; UL = Q3 + 1.5 × IQR
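A minimal sketch of imputation, assuming missing values are represented as `None`. The helper names and the sample column are hypothetical; `missing_fraction` only reports how much of a column is missing and leaves the decision about dropping it to the reader.

```python
import statistics

def impute(column, strategy="mean"):
    """Replace None with the mean, median or mode of the present values."""
    present = [x for x in column if x is not None]
    fill_functions = {
        "mean": statistics.fmean,
        "median": statistics.median,
        "mode": statistics.mode,
    }
    fill = fill_functions[strategy](present)
    return [fill if x is None else x for x in column]

def missing_fraction(column):
    """Share of the column that is missing, between 0 and 1."""
    return sum(x is None for x in column) / len(column)

ages = [20, None, 30, 40, None]          # hypothetical column with gaps
print(missing_fraction(ages))            # 0.4
print(impute(ages, "median"))
```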

Normalization: rescaling the data to a common range of 0 to 1

(X - min)/(max - min)

X: value to be normalized
min: minimum of X's column

max: maximum of X's column

max - min ≥ X - min, so the result never exceeds 1
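Min-max normalization divides each value's distance from the column minimum by the column range (max - min). A short sketch, with a hypothetical helper name and data; returning zeros for a constant column is an assumption made here to avoid dividing by zero.

```python
def min_max_normalize(values):
    """Rescale values to the range 0 to 1 using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: the range is zero, so map everything to 0.0.
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]

scaled = min_max_normalize([10, 20, 30, 50])
print(scaled)
```

The smallest value maps to 0 and the largest to 1, with everything else in between.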

Standardization:
Regression, linear regression, correlation

📈 Machine learning using scikit learn
📈 Machine Learning Axioms
📈 Deep Learning-Chorale Prelude + Ingression to DL
📈 Neural Networks and Deep Learning
📈 Convolutional Neural Network
📈 Machine Learning -Exploring the model
📈 Understanding Conversational Systems