Outliers and Influential Points
Contents
1. Definition of Outliers
3. Leverage Points
4. Influential Points
5. Necessity and Situation of Detecting Outliers
Data points that diverge substantially from the overall pattern are called outliers.
There are four ways that a data point might be considered an outlier:
[Figure: four numbered panels (1 to 4); surviving panel labels: "Extreme X and Y values", "Distant data point"]
Leverage Points
Outliers that lie far from the center of the cloud in the horizontal (x) direction but do not influence the slope of the regression line are called leverage points.
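How far a point sits from the center in the x direction is commonly quantified by its hat value (leverage). The sketch below is illustrative only: it assumes a simple regression with one predictor, the data are made up, and the 2p/n cut-off is a common rule of thumb rather than something taken from the slides.

```python
def leverages(x):
    """Hat values h_i for a simple linear regression on a single predictor x."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    # h_i = 1/n + (x_i - x_bar)^2 / Sxx
    return [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]


def high_leverage_indices(x, n_params=2):
    """Indices whose leverage exceeds 2p/n (rule-of-thumb cut-off, an assumption)."""
    cutoff = 2 * n_params / len(x)
    return [i for i, h in enumerate(leverages(x)) if h > cutoff]


# Made-up data: nine clustered x-values and one far to the right.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
h = leverages(x)
high = high_leverage_indices(x)
print(high)  # only the last point (index 9) exceeds the cut-off
```

The leverages always sum to the number of fitted parameters (here 2), which is a quick sanity check on the calculation.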
Influential Points
Outliers that influence the slope of the regression line are called influential points.
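One simple way to see the difference between a leverage point and an influential point is to refit the line without the point and compare slopes. The following is a minimal sketch under that assumption; the function names and the data are hypothetical and chosen only for illustration.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x (simple linear regression)."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def slope_change_without(x, y, i):
    """Slope after deleting point i minus the slope fitted to all points."""
    x_rest = x[:i] + x[i + 1:]
    y_rest = y[:i] + y[i + 1:]
    return ols_slope(x_rest, y_rest) - ols_slope(x, y)


x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]

# Case A: the far-right point lies on the line y = 2x (leverage, no influence).
y_on_line = [2 * xi for xi in x]
# Case B: the far-right point lies well below the line (influential).
y_off_line = y_on_line[:-1] + [10]

change_on_line = slope_change_without(x, y_on_line, 9)
change_off_line = slope_change_without(x, y_off_line, 9)
print(change_on_line, change_off_line)
```

In case A the slope is unchanged when the extreme point is removed; in case B removing it moves the slope from roughly 0.19 back to 2.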
[Figure: two scatter plots, each with X on the horizontal axis and Y on the vertical axis]
1. Univariate
2. Bivariate
3. Multivariate
Methods of Detecting Outliers (Influential Points)
1. Univariate
Theoretically: Points that lie more than three standard deviations from the mean are treated as outliers. This can be checked simply with the Z-score.
Decision: Values with a Z-score less than -3 or greater than +3 are considered outliers.
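The Z-score rule can be sketched in a few lines. This example assumes the sample standard deviation is used and works on made-up data; the function names are hypothetical. Note that with very small samples no point can reach |z| > 3, so the rule needs a reasonable number of observations.

```python
import statistics


def z_scores(values):
    """Z-score of each value: (value - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (an assumption)
    return [(v - mean) / sd for v in values]


def z_score_outliers(values, threshold=3.0):
    """Values whose absolute Z-score exceeds the threshold."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]


# Made-up data: twenty values near 10 and one extreme value.
values = [10, 12, 11, 13, 9, 10, 11, 12, 10, 11,
          9, 13, 12, 10, 11, 10, 12, 11, 9, 10, 50]
outliers = z_score_outliers(values)
print(outliers)  # the value 50 has a Z-score of about 4.3
```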
2. Bivariate
A scatter plot is a simple and useful way to check for the presence of outliers.
3. Multivariate
1. Cook’s D Bar Plot
2. Cook’s D Chart
3. DFBETAs Panel
4. DFFITs Plot
5. Studentized Residual Plot
6. Standardized Residual Chart
7. Studentized Residuals vs Leverage Plot
8. Deleted Studentized Residual vs Fitted Values Plot
9. Hadi Plot
10. Potential Residual Plot
Cook’s D Bar Plot
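The quantity shown in such a bar plot is Cook's distance for each observation. The sketch below computes it for a simple linear regression and prints a rough text bar per point. It is an illustration on made-up data, not the tool used in the presentation, and the 4/n threshold is a common rule of thumb assumed here.

```python
def cooks_distance(x, y):
    """Cook's D for each point in a simple linear regression of y on x."""
    n, p = len(x), 2  # p = number of fitted parameters (intercept and slope)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e * e for e in resid) / (n - p)  # residual variance estimate
    lev = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # D_i = e_i^2 * h_i / (p * s^2 * (1 - h_i)^2)
    return [e * e * h / (p * s2 * (1 - h) ** 2) for e, h in zip(resid, lev)]


x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
y = [2, 4, 6, 8, 10, 12, 14, 16, 18, 10]

d = cooks_distance(x, y)
threshold = 4 / len(x)  # rule-of-thumb cut-off (an assumption)
flagged = [i for i, di in enumerate(d) if di > threshold]

# Rough text version of a bar plot; bars are capped at 40 characters.
for i, di in enumerate(d):
    bar = "#" * min(40, round(di * 10))
    print(f"{i:2d} {di:8.3f} {bar}")
```

Here only the last observation exceeds the threshold, which matches the intuition that a point far out in x and far from the trend of the rest dominates the fit.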
Thank You!
Any Questions?