Normal Distribution and Regression Notes
—Karl Pearson—
Normal Distributions
• Frequency Distributions
• Standard Normal Distribution
Linear Regression and Correlation
• Linear Regression
• Linear Correlation Coefficient
01 Normal Distributions
Normal Distributions and the Empirical Rule
Let us recall…
• Statistics consists of a body of methods for
collecting and analyzing data.
• You can use statistical methods to determine what
kind and how much sample data you need to gather,
how you should organize and summarize these data,
and how you can analyze them and make
conclusions from them.
• There are three important components for the
success of any statistical research study – design,
description, and inference.
1. Design – the researcher must know the
appropriate statistical methods to carry out a
plan, implement rules, and evaluate experiments
properly.
2. Description – the researcher must know how to
guide readers in understanding the methods of a
study and in analyzing its results.
3. Inference – the researcher must use the results
of data analysis to make good predictions and
correct decisions.
The Normal Distribution
• The normal distribution is perhaps the most
commonly used continuous probability distribution in
the entire field of statistics.
• It provides a good model for many continuous
populations.
• It has a bell-shaped curve also known as the
normal curve.
Empirical Rules for Normal Distribution
We take note:
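For a normal distribution, the empirical rule says that about 68% of the values lie within one standard deviation of the mean, about 95% within two, and about 99.7% within three. As a quick check, here is a minimal Python sketch that uses only the standard library. The helper names `normal_cdf` and `within_k_sd` are illustrative and do not come from these notes.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def within_k_sd(k: float) -> float:
    """Probability that a normal value lies within k standard deviations of its mean."""
    return normal_cdf(k) - normal_cdf(-k)


# Illustrative check of the 68-95-99.7 figures.
for k in (1, 2, 3):
    print(f"within {k} SD: {within_k_sd(k):.4f}")
```

The printed values should come out close to 0.68, 0.95, and 0.997, which is where the rounded percentages in the rule come from.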
2. Find the area of the standard normal distribution to the right of:
a) 𝑧 = 2.5
b) 𝑧 = 0.24
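One way to check answers to this exercise without a printed z-table is sketched below. It assumes only Python's standard library, and the function name `right_tail_area` is an illustrative choice. The area to the right of z is P(Z > z) = 1 − Φ(z), which can be written with the complementary error function.

```python
import math


def right_tail_area(z: float) -> float:
    """Area under the standard normal curve to the right of z, i.e. P(Z > z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


for z in (2.5, 0.24):
    print(f"z = {z}: area to the right = {right_tail_area(z):.4f}")
```

A z-table lookup gives the same result: read the area to the left of z from the table and subtract it from 1.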
The Standard Normal Distribution
Values of r Interpretation
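$$ r = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}} $$
where
$$ SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}, \qquad SS_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}, \qquad SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n}, $$
and $n$ is the sample size and “SS” stands for sum of squares.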
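The computation of r from the three sums of squares can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function names and the five-point data set are made up for the example, and only the standard library is assumed.

```python
import math


def sums_of_squares(xs, ys):
    """Return (SS_xx, SS_yy, SS_xy) using the shortcut formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    return ss_xx, ss_yy, ss_xy


def correlation_r(xs, ys):
    """Linear correlation coefficient r = SS_xy / sqrt(SS_xx * SS_yy)."""
    ss_xx, ss_yy, ss_xy = sums_of_squares(xs, ys)
    return ss_xy / math.sqrt(ss_xx * ss_yy)


# Hypothetical data, chosen only so the arithmetic is easy to follow by hand:
# SS_xx = 10, SS_yy = 6, SS_xy = 6, so r = 6 / sqrt(60).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(round(correlation_r(xs, ys), 4))  # 0.7746
```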
Linear Correlation Coefficient
We note that:
$$ y = mx + b $$
where
$$ m\ (\text{slope}) = \frac{SS_{xy}}{SS_{xx}}, \qquad b\ (\text{intercept}) = \bar{y} - m\bar{x}, $$
$\bar{x}$ = mean of $x$
$\bar{y}$ = mean of $y$
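For the line y = mx + b, the least-squares slope is SS_xy / SS_xx and the intercept is the mean of y minus the slope times the mean of x. A minimal Python sketch of that calculation follows. The function name `fit_line` and the small data set are hypothetical, used only for illustration.

```python
def fit_line(xs, ys):
    """Return (m, b) for the least-squares line y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    m = ss_xy / ss_xx          # slope
    b = mean_y - m * mean_x    # intercept
    return m, b


# Hypothetical data: SS_xy = 6 and SS_xx = 10, so m = 0.6 and b = 4 - 0.6 * 3 = 2.2.
m, b = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"y = {m:.2f}x + {b:.2f}")  # y = 0.60x + 2.20
```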
Linear Correlation Coefficient
In your work with applications that involve the linear correlation coefficient 𝑟, it is important to
remember the following properties of 𝑟.
Using Excel – Example
The grades of 10 senior high school students on a midterm report
𝑥 and on the final examination 𝑦 are as follows:
x 78 72 50 99 68 94 72 81 96 68
y 86 56 65 99 70 84 80 55 99 70
Using Excel, we
obtain the following:
Solution
1. Select the data set.
2. In the line chart, edit the chart title.
3. Right-click on the data and choose “Add Trendline”.
4. Choose Linear.
5. Set the Dash Type.
6. Click the “+” symbol to edit Chart Elements.