Mathematical Statistics: Instructor: Dr. Deshi Ye
Dot diagram
In Minitab: Stat -> Dotplots -> Simple
This diagram visually summarizes the
information that the lathe is generally
running fast.
Data001.
80 measurements of sulfur oxide emissions
(in tons) from an industrial plant
15.8 26.4 17.3 11.2 23.9 24.8 18.7 13.9 9.0 13.2 22.7 9.8
6.2 14.7 17.5 26.1 12.8 28.6 17.6 23.7 26.8
22.7 18.0 20.5 11.0 20.9 15.5 19.4 16.7 10.7 19.1 15.2
22.9 26.6 20.4 21.4 19.2 21.6 16.9 19.0 18.5 23.0
24.6 20.1 16.2 18.0 7.7 13.5 23.5 14.5 14.4 29.6 19.4
17.0 20.8 24.3 22.5 24.6 18.4 18.1 8.3 21.9 12.3
22.3 13.3 11.8 19.3 20.0 25.7 31.8 25.9 10.5 15.9 27.5
18.1 17.9 9.4 24.1 20.1 28.5
Frequency distributions
A frequency distribution is a tabular
arrangement of data in which the data are
grouped into different intervals, and then
the number of observations that belong to
each interval is determined.
Data presented in this manner are
known as grouped data.
Class limits & frequency
Class limits     Frequency
 5.0 -  8.9          3
 9.0 - 12.9         10
13.0 - 16.9         14
17.0 - 20.9         25
21.0 - 24.9         17
25.0 - 28.9          9
29.0 - 32.9          2
Total               80
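The table above can be checked with a short script. This is a minimal sketch, assuming the 80 values from the Data001 listing and treating both class limits as inclusive; the names `emissions`, `class_limits` and `frequency_distribution` are illustrative and not from the slides.

```python
# Emission data (in tons) copied from the Data001 listing.
emissions = [
    15.8, 26.4, 17.3, 11.2, 23.9, 24.8, 18.7, 13.9, 9.0, 13.2, 22.7, 9.8,
    6.2, 14.7, 17.5, 26.1, 12.8, 28.6, 17.6, 23.7, 26.8,
    22.7, 18.0, 20.5, 11.0, 20.9, 15.5, 19.4, 16.7, 10.7, 19.1, 15.2,
    22.9, 26.6, 20.4, 21.4, 19.2, 21.6, 16.9, 19.0, 18.5, 23.0,
    24.6, 20.1, 16.2, 18.0, 7.7, 13.5, 23.5, 14.5, 14.4, 29.6, 19.4,
    17.0, 20.8, 24.3, 22.5, 24.6, 18.4, 18.1, 8.3, 21.9, 12.3,
    22.3, 13.3, 11.8, 19.3, 20.0, 25.7, 31.8, 25.9, 10.5, 15.9, 27.5,
    18.1, 17.9, 9.4, 24.1, 20.1, 28.5,
]

# Class limits taken from the table; both limits are treated as inclusive.
class_limits = [(5.0, 8.9), (9.0, 12.9), (13.0, 16.9), (17.0, 20.9),
                (21.0, 24.9), (25.0, 28.9), (29.0, 32.9)]


def frequency_distribution(data, limits):
    """Count how many observations fall inside each (lower, upper) class."""
    return [sum(1 for x in data if lower <= x <= upper)
            for lower, upper in limits]


frequencies = frequency_distribution(emissions, class_limits)

for (lower, upper), count in zip(class_limits, frequencies):
    print(f"{lower:4.1f} - {upper:4.1f}  {count:3d}")
print(f"Total        {sum(frequencies):3d}")
```

Because the data are recorded to one decimal place, no observation can fall in the gap between one upper limit (such as 8.9) and the next lower limit (9.0), so every value lands in exactly one class.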
Class limit and width
Lower class limit: the smallest value that can belong to
a given interval.
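Stem-and-leaf display (leaf unit = 1.0)
Columns: cumulative count from the nearer end, stem, leaves.
The count in parentheses is for the row that contains the median.
   2   0  67
   6   0  8999
  11   1  00111
  17   1  223333
  24   1  4445555
  32   1  66677777
 (13)  1  8888888999999
  35   2  0000000111
  25   2  222223333
  16   2  4444455
   9   2  66667
   4   2  889
   1   3  1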
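The stem and leaf columns of the display can be rebuilt from the raw data. The sketch below assumes a leaf unit of 1.0 (decimals are truncated) and two leaf digits per row, which matches the rows shown; the function name `stem_and_leaf` and its parameter are illustrative, and the cumulative-count column is not reproduced.

```python
from collections import defaultdict

# Emission data (in tons) copied from the Data001 listing.
emissions = [
    15.8, 26.4, 17.3, 11.2, 23.9, 24.8, 18.7, 13.9, 9.0, 13.2, 22.7, 9.8,
    6.2, 14.7, 17.5, 26.1, 12.8, 28.6, 17.6, 23.7, 26.8,
    22.7, 18.0, 20.5, 11.0, 20.9, 15.5, 19.4, 16.7, 10.7, 19.1, 15.2,
    22.9, 26.6, 20.4, 21.4, 19.2, 21.6, 16.9, 19.0, 18.5, 23.0,
    24.6, 20.1, 16.2, 18.0, 7.7, 13.5, 23.5, 14.5, 14.4, 29.6, 19.4,
    17.0, 20.8, 24.3, 22.5, 24.6, 18.4, 18.1, 8.3, 21.9, 12.3,
    22.3, 13.3, 11.8, 19.3, 20.0, 25.7, 31.8, 25.9, 10.5, 15.9, 27.5,
    18.1, 17.9, 9.4, 24.1, 20.1, 28.5,
]


def stem_and_leaf(data, digits_per_row=2):
    """Return (stem, leaves) rows; leaf unit 1.0, decimals truncated."""
    rows = defaultdict(list)
    for x in sorted(data):
        stem, leaf = divmod(int(x), 10)   # 17.3 -> stem 1, leaf 7
        # Leaves 0-1, 2-3, 4-5, 6-7, 8-9 share a row when digits_per_row=2.
        rows[(stem, leaf // digits_per_row)].append(str(leaf))
    return [(stem, "".join(leaves))
            for (stem, _), leaves in sorted(rows.items())]


display = stem_and_leaf(emissions)
for stem, leaves in display:
    print(f"{stem}  {leaves}")
```

Rows with no observations (for example stem 0 with leaves 0 to 5) are simply omitted, as in the display above.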
Ch2.5: Descriptive measures
Mean: the sum of the observations divided by the
sample size.
\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \]
Median: the center, or location, of a set of data. If
the observations are arranged in an ascending or
descending order:
If the number of observations is odd, the median is the
middle value.
If the number of observations is even, the median is
the average of the two middle values.
Example
Data: 15 14 2 27 13
Mean:
\[ \bar{x} = \frac{15 + 14 + 2 + 27 + 13}{5} = 14.2 \]
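The mean and median definitions can be tried on the example data. This is a small sketch written directly from the definitions; the function names are illustrative, and Python's standard `statistics` module offers equivalent `mean` and `median` functions.

```python
def mean(values):
    """Sum of the observations divided by the sample size."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the ordered data, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:                      # odd sample size: single middle value
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2   # even: average of two


sample = [15, 14, 2, 27, 13]
print(mean(sample))    # 14.2
print(median(sample))  # 14 (ordered data: 2 13 14 15 27)
```

The value 27 pulls the mean slightly above the median here, which is the usual effect of a large observation on the mean.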
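Sample variance s^2:
\[ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} \]
Equivalent computing formula:
\[ s^2 = \frac{n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2}{n(n - 1)} \]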
Standard deviation s:
\[ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} \]
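As a check that the definition and the computing formula for the sample variance agree, the sketch below applies both to the five values 15, 14, 2, 27, 13 from the earlier mean example. Reusing that data here is an assumption made for illustration, and the function names are hypothetical.

```python
import math


def variance_definition(values):
    """s^2 = sum((x_i - x_bar)^2) / (n - 1)."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


def variance_shortcut(values):
    """s^2 = (n * sum(x_i^2) - (sum(x_i))^2) / (n * (n - 1))."""
    n = len(values)
    total = sum(values)
    total_of_squares = sum(x * x for x in values)
    return (n * total_of_squares - total ** 2) / (n * (n - 1))


sample = [15, 14, 2, 27, 13]
s2_definition = variance_definition(sample)
s2_shortcut = variance_shortcut(sample)   # (5 * 1323 - 71**2) / 20 = 1574 / 20
s = math.sqrt(s2_shortcut)                # standard deviation

print(s2_definition, s2_shortcut, s)
```

Both formulas give 78.7 up to floating-point rounding. The computing formula avoids calculating the mean first, which is convenient by hand, although the definition form is usually the numerically safer choice in software.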
Quartiles and Percentiles
Quartiles: values in a given set of
observations that divide the data into 4 equal parts.
The first quartile, Q1, is a value that has one
fourth, or 25%, of the observations below its
value.
The sample 100p-th percentile is a value such
that at least 100p% of the observations are at or
below this value, and at least 100(1-p)% are at or
above this value.
Example
Example on p. 34:
\[ Q_1 = \frac{14.7 + 15.2}{2} = 14.95 \]
\[ Q_2 = \frac{19.0 + 19.1}{2} = 19.05 \]
\[ Q_3 = \frac{22.9 + 23.0}{2} = 22.95 \]
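The three quartiles above can be reproduced from the 80 emission values. The sketch assumes the usual rule that goes with the percentile definition: order the data and compute np; if np is not an integer, round it up and take that ordered value; if np is an integer k, average the k-th and (k+1)-th ordered values. The rule and the name `sample_percentile` are assumptions for illustration and are not stated on the slides.

```python
import math

# Emission data (in tons) copied from the Data001 listing.
emissions = [
    15.8, 26.4, 17.3, 11.2, 23.9, 24.8, 18.7, 13.9, 9.0, 13.2, 22.7, 9.8,
    6.2, 14.7, 17.5, 26.1, 12.8, 28.6, 17.6, 23.7, 26.8,
    22.7, 18.0, 20.5, 11.0, 20.9, 15.5, 19.4, 16.7, 10.7, 19.1, 15.2,
    22.9, 26.6, 20.4, 21.4, 19.2, 21.6, 16.9, 19.0, 18.5, 23.0,
    24.6, 20.1, 16.2, 18.0, 7.7, 13.5, 23.5, 14.5, 14.4, 29.6, 19.4,
    17.0, 20.8, 24.3, 22.5, 24.6, 18.4, 18.1, 8.3, 21.9, 12.3,
    22.3, 13.3, 11.8, 19.3, 20.0, 25.7, 31.8, 25.9, 10.5, 15.9, 27.5,
    18.1, 17.9, 9.4, 24.1, 20.1, 28.5,
]


def sample_percentile(data, p):
    """Sample 100p-th percentile for 0 < p < 1 (assumed rule, see lead-in)."""
    if not 0 < p < 1:
        raise ValueError("p must be strictly between 0 and 1")
    ordered = sorted(data)
    position = len(ordered) * p
    if math.isclose(position, round(position)):
        k = round(position)               # np is an integer k
        # Average the k-th and (k+1)-th ordered values (1-based positions).
        return (ordered[k - 1] + ordered[k]) / 2
    # Otherwise round np up and take that ordered value.
    return ordered[math.ceil(position) - 1]


q1 = sample_percentile(emissions, 0.25)   # averages the 20th and 21st values
q2 = sample_percentile(emissions, 0.50)   # averages the 40th and 41st values
q3 = sample_percentile(emissions, 0.75)   # averages the 60th and 61st values
print(round(q1, 2), round(q2, 2), round(q3, 2))
```

Statistical packages use several slightly different interpolation rules for percentiles, so their quartiles may differ from these in the last digit or two.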
Boxplots
A boxplot is a way of summarizing the
information contained in the quartiles (or
on an interval).
Box length = interquartile range = Q3 - Q1
Modified boxplot
Outlier: an observation too far from the third
quartile, that is, more than
1.5 x (interquartile range)
beyond the third quartile.
Modified boxplot:
identifies outliers and
reduces their effect on the
shape of the boxplot.
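The 1.5 x (interquartile range) rule can be applied to the quartiles from the emission example. The sketch below uses Q1 = 14.95 and Q3 = 22.95 from that example and the smallest and largest observations in the data listing (6.2 and 31.8). Applying the same limit below the first quartile is the usual convention and is an assumption beyond what the slide states; `outlier_limits` is an illustrative name.

```python
def outlier_limits(q1, q3, k=1.5):
    """Return (lower, upper) limits lying k * IQR beyond the quartiles."""
    iqr = q3 - q1                        # box length = interquartile range
    return q1 - k * iqr, q3 + k * iqr


# Quartiles from the emission example; extremes read from the data listing.
q1, q3 = 14.95, 22.95
smallest, largest = 6.2, 31.8

lower_limit, upper_limit = outlier_limits(q1, q3)
has_outliers = smallest < lower_limit or largest > upper_limit

print(round(lower_limit, 2), round(upper_limit, 2), has_outliers)
```

With an interquartile range of 8.0 the limits are about 2.95 and 34.95, so none of the 80 emission values would be flagged as an outlier, and the modified boxplot for this data would look the same as the ordinary one.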