How To Learn Machine Learning Algorithms For Interviews
Theoretical Understanding:
1. Tutorial 48: https://www.youtube.com/watch?v=jS1CKhALUBQ
2. Tutorial 49: https://www.youtube.com/watch?v=temQ8mHpe3k
2. Advantages
1. Works very well with a large number of features
2. Works well with large training datasets
3. It converges faster when training the model
4. It also performs well with categorical features
3. Disadvantages
1. Correlated features affect performance
No
6. Impact of outliers?
Theoretical Understanding:
1. https://www.youtube.com/watch?v=1-OGRohmH2s&list=PLZoTAELRMXVPBTrWtJkn3wWQxZkmTXGwe&index=29
2. https://www.youtube.com/watch?v=5rvnlZWzox8&list=PLZoTAELRMXVPBTrWtJkn3wWQxZkmTXGwe&index=34
3. https://www.youtube.com/watch?v=NAPhUDjgG_s&list=PLZoTAELRMXVPBTrWtJkn3wWQxZkmTXGwe&index=32
4. https://www.youtube.com/watch?v=WuuyD3Yr-js&list=PLZoTAELRMXVPBTrWtJkn3wWQxZkmTXGwe&index=35
5. https://www.youtube.com/watch?v=BqzgUnrNhFM&list=PLZoTAELRMXVPBTrWtJkn3wWQxZkmTXGwe&index=33
2. Advantages
1. Linear regression performs exceptionally well for linearly separable data
2. Easy to implement and train the model
3. It can handle overfitting using dimensionality reduction techniques, cross-validation, and regularization
3. Disadvantages
1. Sometimes a lot of feature engineering is required
2. If the independent features are correlated, it may affect performance
3. It is often quite prone to noise and overfitting
Yes
Linear regression requires the relationship between the independent and dependent variables to be linear. It is also important to check for outliers, since linear regression is sensitive to their effects.
Homework?
Practical Implementation
1. https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html
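A minimal sketch of the linked `LinearRegression` API on hand-made toy data (the data and variable names here are illustrative, not from the source):

```python
# Minimal sketch: fitting scikit-learn's LinearRegression on toy data.
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data following y = 2*x + 1 exactly, so the fit should be exact.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])

model = LinearRegression()
model.fit(X, y)

slope = model.coef_[0]        # learned weight, ~2.0
intercept = model.intercept_  # learned bias, ~1.0
pred = model.predict([[4.0]])[0]  # extrapolated prediction, ~9.0
```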
SVM
Theoretical Understanding:
1. https://www.youtube.com/watch?v=H9yACitf-KM
2. https://www.youtube.com/watch?v=Js3GLb1xPhc
2. Advantages
1. SVM is more effective in high dimensional spaces.
2. SVM is relatively memory efficient.
3. SVMs are very good when we have little prior knowledge of the data.
4. Works well even with unstructured and semi-structured data like text, images, and trees.
5. The kernel trick is the real strength of SVM. With an appropriate kernel function, we can solve many complex problems.
6. SVM models generalize well in practice; the risk of over-fitting is lower in SVM.
3. Disadvantages
1. More training time is required for larger datasets
2. It is difficult to choose a good kernel function https://www.youtube.com/watch?v=mTyT-oHoivA
3. The main SVM hyperparameters are the cost (C) and gamma. It is not easy to fine-tune these hyperparameters, and it is hard to visualize their impact
Yes
Although SVMs are an attractive option when constructing a classifier, SVMs do not easily accommodate missing covariate information. As with other prediction and classification methods, inattention to missing data when constructing an SVM can impact the accuracy and utility of the resulting classifier.
6. Impact of outliers?
It is usually sensitive to outliers: https://arxiv.org/abs/1409.0934
Types of Problems it can solve (Supervised)
1. Classification
2. Regression
In SVM, to avoid overfitting, we choose a soft margin instead of a hard one, i.e. we intentionally let some data points enter our margin (but still penalize them) so that our classifier doesn't overfit the training sample.
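The soft-margin idea above maps directly onto `SVC`'s `C` parameter. A small sketch on synthetic clusters (the data and the specific `C` values are illustrative assumptions):

```python
# Sketch of the soft-margin trade-off: C controls how strongly margin
# violations are penalized. Small C -> wider margin, more tolerated
# violations (less overfitting); large C -> nearly hard-margin behaviour.
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
X = np.vstack([rng.randn(20, 2) - 2, rng.randn(20, 2) + 2])  # two clusters
y = np.array([0] * 20 + [1] * 20)

soft = SVC(kernel="linear", C=0.01).fit(X, y)    # soft margin
hard = SVC(kernel="linear", C=1000.0).fit(X, y)  # nearly hard margin

# A softer margin typically leaves more points inside the margin,
# so more of them become support vectors.
n_soft = soft.n_support_.sum()
n_hard = hard.n_support_.sum()
```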
Practical Implementation
1. https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
2. https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVR.html
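A hedged usage sketch of the two estimators linked above, `SVC` for classification and `SVR` for regression, on toy data (data values are illustrative):

```python
# Minimal sketch of the two scikit-learn SVM entry points.
from sklearn.svm import SVC, SVR

X = [[0.0], [1.0], [2.0], [3.0]]
y_cls = [0, 0, 1, 1]          # binary labels
y_reg = [0.0, 1.0, 2.0, 3.0]  # continuous target

clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, y_cls)
reg = SVR(kernel="rbf", C=1.0, epsilon=0.1).fit(X, y_reg)

cls_pred = clf.predict([[2.5]])[0]  # falls in the class-1 region
reg_pred = reg.predict([[1.5]])[0]  # value within the target range
```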
Performance Metrics
Classification
1. Confusion Matrix
2. Precision,Recall, F1 score
Regression
1. R2,Adjusted R2
2. MSE,RMSE,MAE
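The metrics listed above are all available in `sklearn.metrics` (Adjusted R2 is not, but it follows from R2 by a standard formula). A sketch on tiny hand-written predictions, not a real model:

```python
# Computing the listed classification and regression metrics.
import numpy as np
from sklearn.metrics import (confusion_matrix, precision_score,
                             recall_score, f1_score, r2_score,
                             mean_squared_error, mean_absolute_error)

# Classification: true vs predicted labels.
y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]
cm = confusion_matrix(y_true, y_pred)   # 2x2 matrix of counts
prec = precision_score(y_true, y_pred)  # TP / (TP + FP)
rec = recall_score(y_true, y_pred)      # TP / (TP + FN)
f1 = f1_score(y_true, y_pred)           # harmonic mean of the two

# Regression: true vs predicted values.
yt = [3.0, -0.5, 2.0, 7.0]
yp = [2.5, 0.0, 2.0, 8.0]
r2 = r2_score(yt, yp)
mse = mean_squared_error(yt, yp)
rmse = np.sqrt(mse)
mae = mean_absolute_error(yt, yp)

# Adjusted R2 must be computed by hand: n samples, p features.
n, p = 4, 1  # illustrative values
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
```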
Decision Tree Classifier And Regressor
Interview Questions:
1. Decision Tree
2. Entropy, Information Gain, Gini Impurity
3. Decision Tree Working For Categorical and Numerical Features
4. What are the scenarios where Decision Tree works well
5. Decision Tree Low Bias And High Variance- Overfitting
6. Hyperparameter Techniques
7. Library used for constructing decision tree
8. Impact of Outliers Of Decision Tree
9. Impact of missing values on Decision Tree
10. Does Decision Tree require Feature Scaling
Theoretical Understanding:
1. Tutorial 37: Entropy In Decision Tree https://www.youtube.com/watch?v=1IQOtJ4NI_0
2. Tutorial 38: Information Gain https://www.youtube.com/watch?v=FuTRucXB9rA
3. Tutorial 39: Gini Impurity https://www.youtube.com/watch?v=5aIFgrrTqOw
4. Tutorial 40: Decision Tree For Numerical Features: https://www.youtube.com/watch?v=5O8HvA9pMew
5. How To Visualize DT: https://www.youtube.com/watch?v=ot75kOmpYjI
2. Advantages
3. Disadvantages
No
6. Impact of outliers?
It is not sensitive to outliers. Since extreme values or outliers never cause much reduction in RSS, they are never involved in a split. Hence, tree-based methods are insensitive to outliers.
How to avoid overfitting
https://www.youtube.com/watch?v=SLOyyFHbiqo
Practical Implementation
1. https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html
2. https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html
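A minimal sketch of the two linked estimators on toy data. Note that no feature scaling is applied: tree splits are threshold comparisons, so scaling is not required (question 10 above). Data and hyperparameter values are illustrative:

```python
# Toy usage of DecisionTreeClassifier and DecisionTreeRegressor.
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

X = [[0.0], [1.0], [2.0], [3.0]]
y_cls = [0, 0, 1, 1]
y_reg = [0.0, 1.0, 2.0, 3.0]

# criterion="gini" uses Gini impurity (tutorial 39); "entropy" is the
# information-gain alternative (tutorials 37-38).
clf = DecisionTreeClassifier(criterion="gini", max_depth=3).fit(X, y_cls)
reg = DecisionTreeRegressor(max_depth=3).fit(X, y_reg)

cls_pred = clf.predict([[2.5]])[0]  # lands in the class-1 region
reg_pred = reg.predict([[2.9]])[0]  # mean of the matching leaf
```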
Performance Metrics
Classification
1. Confusion Matrix
2. Precision,Recall, F1 score
Regression
1. R2,Adjusted R2
2. MSE,RMSE,MAE
Logistic Regression
Theoretical Understanding:
1. Tutorial 35: Logistic Regression Part 1 https://www.youtube.com/watch?v=L_xBe7MbPwk
2. Tutorial 36: Logistic Regression Part 2 https://www.youtube.com/watch?v=uFfsSgQgerw
3. Tutorial 39: Logistic Regression Part 3 https://www.youtube.com/watch?v=V8fS0T_ktn4
4. Tutorial 42: How To Find Optimal Threshold for Binary Classification: https://www.youtube.com/watch?v=_AjhdXuXEDE
5. Interview question: https://www.youtube.com/watch?v=tcaruVHXZwE&t=122s
2. Advantages
3. Disadvantages
1. Sometimes a lot of feature engineering is required
2. If the independent features are correlated, it may affect performance
3. It is often quite prone to noise and overfitting
4. If the number of observations is less than the number of features, Logistic Regression should not be used; otherwise it may lead to overfitting.
5. Non-linear problems can't be solved with logistic regression because it has a linear decision surface. Linearly separable data is rarely found in real-world scenarios.
6. It is tough to capture complex relationships using logistic regression. More powerful and compact algorithms such as neural networks can easily outperform it.
7. In linear regression the independent and dependent variables are related linearly. Logistic Regression, however, requires the independent variables to be linearly related to the log odds, log(p/(1-p)).
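The log-odds point above can be verified directly: a fitted `LogisticRegression` models log(p/(1-p)) as a linear function of the inputs, which is exactly what `decision_function` returns. A sketch on toy data (data values are illustrative):

```python
# Checking that logistic regression's log-odds are linear in x.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

model = LogisticRegression().fit(X, y)
p = model.predict_proba(X)[:, 1]      # P(y=1 | x)
log_odds = np.log(p / (1 - p))        # log(p / (1-p))
linear = model.decision_function(X)   # w·x + b, linear in x

# The two agree up to floating-point error.
```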
Yes
5. Missing Values
6. Impact of outliers?
Like linear regression, logistic regression estimates are sensitive to unusual observations: outliers, high-leverage points, and influential observations. Numerical examples in the literature demonstrate recent outlier diagnostic methods using data sets from the medical domain.
Practical Implementation
1. http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
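A sketch tying the linked `LogisticRegression` API to the threshold tutorial above: `predict()` uses a fixed 0.5 cutoff, but a custom threshold can be applied to `predict_proba` (the 0.3 cutoff below is an illustrative choice, not a recommendation):

```python
# Default vs custom decision threshold for logistic regression.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression().fit(X, y)
proba = model.predict_proba(X)[:, 1]      # P(y=1) for each row

default_pred = model.predict(X)           # implicit threshold of 0.5
custom_pred = (proba >= 0.3).astype(int)  # lower cutoff favours recall
```

Lowering the threshold can only turn negatives into positives, which trades precision for recall.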
Performance Metrics
Classification
1. Confusion Matrix
2. Precision,Recall, F1 score
3. Part 1 https://www.youtube.com/watch?v=aWAnNHXIKww
4. Part 2 https://www.youtube.com/watch?v=A_ZKMsZ3f3o
Random Forest Classifier And Regressor
Theoretical Understanding:
1. Ensemble technique (Bagging): https://www.youtube.com/watch?v=KIOeZ5cFZ50
2. Random Forest Classifier And Regressor https://www.youtube.com/watch?v=nxFG5xdpDto
3. Construct Decision Tree And working in Random Forest: https://www.youtube.com/watch?v=WQ0iJSbnnZA&t=406s
2. Advantages
3. Disadvantages
No
6. Impact of outliers?
Random Forest is robust to outliers, since each tree splits on thresholds rather than distances.
Practical Implementation
1. https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html
2. https://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeRegressor.html
3. https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html
4. https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestRegressor.html
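A minimal sketch of the linked ensemble estimators: a bagged collection of decision trees, here with 100 trees each (data and hyperparameter values are illustrative):

```python
# Toy usage of the random forest classifier and regressor.
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y_cls = [0, 0, 0, 1, 1, 1]
y_reg = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y_cls)
reg = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y_reg)

cls_pred = clf.predict([[4.5]])[0]  # majority vote over the trees
reg_pred = reg.predict([[4.5]])[0]  # average over the trees
n_trees = len(clf.estimators_)      # the 100 fitted trees
```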
Performance Metrics
Classification
1. Confusion Matrix
2. Precision,Recall, F1 score
Regression
1. R2,Adjusted R2
2. MSE,RMSE,MAE
AdaBoost, Gradient Boosting And Xgboost
Theoretical Understanding:
1. Ensemble technique (Bagging): https://www.youtube.com/watch?v=KIOeZ5cFZ50
2. Adaboost (Boosting Technique): https://www.youtube.com/watch?v=NLRO1-jp5F8
3. Gradient Boosting In Depth Intuition Part 1: https://www.youtube.com/watch?v=Nol1hVtLOSg
4. Gradient Boosting In Depth Intuition Part 2: https://www.youtube.com/watch?v=Oo9q6YtGzvc
5. Xgboost Classifier Indepth Intuition: https://www.youtube.com/watch?v=gPciUPwWJQQ
6. Xgboost Regression Indepth Intuition: https://www.youtube.com/watch?v=w-_vmVfpssg
7. Implementation of Xgboost: https://youtu.be/9HomdnM12o4
2. Advantages
Advantages of AdaBoost
1. Relatively resistant to overfitting in practice
2. It has few hyperparameters to tune
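The "few hyperparameters" point can be seen in scikit-learn's `AdaBoostClassifier`: mainly `n_estimators` and `learning_rate`. A sketch on toy data (data and parameter values are illustrative):

```python
# Toy usage of AdaBoost: sequentially fitted weak learners (depth-1
# decision stumps by default), later ones re-weighted toward the
# points earlier ones misclassified.
from sklearn.ensemble import AdaBoostClassifier

X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [0, 0, 0, 1, 1, 1]

model = AdaBoostClassifier(n_estimators=50, learning_rate=1.0,
                           random_state=0).fit(X, y)
acc = model.score(X, y)  # training accuracy on this easy toy set
```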
3. Disadvantages
No
6. Impact of outliers?
AdaBoost is sensitive to outliers, since misclassified points receive exponentially increasing weights at each boosting round. Gradient boosting with a robust loss (e.g. Huber) is less affected.
Performance Metrics
Classification
1. Confusion Matrix
2. Precision,Recall, F1 score
Regression
1. R2,Adjusted R2
2. MSE,RMSE,MAE