
Probability MC3020


Statistical Inference

Mrs. S. Vijendiran

Faculty of Engineering,
University of Jaffna.
Lecture 01
• Data is a collection of facts, such as values or measurements.
• It can be numbers, words, measurements, observations or even just descriptions of things.
Qualitative vs Quantitative
Data can be qualitative or quantitative.
Qualitative data is descriptive information (it describes something). It is either nominal or ordinal.
Quantitative data is numerical information (numbers). It is either discrete or continuous.
Nominal:
Example: Blood group, Nationality
Ordinal:
Example: Hotel rating, questionnaire responses (strongly disagree, disagree…)
Discrete:
Example: Family size, number of cars
Continuous:
Example: height, weight
• Discrete data can only take certain values (like whole numbers).
• Continuous data can take any value (within a range).
Put simply:
Discrete data is counted,
continuous data is measured.

Nominal data
• A type of categorical data in which objects fall into unordered categories.
• Type of bicycle: mountain bike, road bike, BMX
• Smoking status: smoker, non-smoker
Ordinal data
• A type of categorical data in which order is important.
• Class of degree: 1st class, 2nd upper, 2nd lower, pass, fail
• Opinion of students about stats classes: very unhappy, unhappy, neutral, happy
Frequency distributions
Organized representations of data that help to summarize and simplify.
Types of frequency distributions:
• Simple frequency distributions
• Grouped frequency distributions
Simple frequency distribution
Raw data:
2 5 8 7 2 2
6 8 5 2 5 7
4 5 6 2 8 6

Score   f
8       3
7       2
6       3
5       4
4       1
2       5

Grouped frequency distribution
• Class intervals
• Often data has natural classes
• Example: grades in a course
• Intervals must not be too large or too small

Class interval   f
90-100           1
80-89            5
70-79            2
60-69            7
≤59              7
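The tallying behind a frequency distribution can be sketched in a few lines of Python. The raw scores are the 18 values from the slide. The helper names and the class intervals used for grouping (1-2, 3-4, 5-6, 7-8) are assumptions made for illustration and do not come from the lecture.

```python
from collections import Counter

# Raw scores as listed on the slide (18 observations).
raw_scores = [2, 5, 8, 7, 2, 2,
              6, 8, 5, 2, 5, 7,
              4, 5, 6, 2, 8, 6]


def simple_frequency(values):
    """Return (score, frequency) pairs, highest score first."""
    counts = Counter(values)
    return sorted(counts.items(), reverse=True)


def grouped_frequency(values, intervals):
    """Count how many values fall in each inclusive (low, high) interval."""
    return [(low, high, sum(1 for v in values if low <= v <= high))
            for low, high in intervals]


freq_table = simple_frequency(raw_scores)

# Illustrative class intervals (an assumption, not from the slides).
grouped_table = grouped_frequency(raw_scores, [(1, 2), (3, 4), (5, 6), (7, 8)])

print("Score  f")
for score, f in freq_table:
    print(f"{score:>5}  {f}")

print("Class  f")
for low, high, f in grouped_table:
    print(f"{low}-{high}    {f}")
```

The frequencies in both tables add up to the number of raw observations, which is a quick check that no value was missed or counted twice.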
Graphical presentation of data
Categorical data: bar chart, pie chart
Numerical data: histogram, frequency curve, stem and leaf plot, line chart, box plot
Measure of Location
• Mean
• Median
• Mode
Mean
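Definition:
If $x_1, x_2, x_3, \dots, x_n$ are the values of the variable $X$, then the arithmetic mean of the set of observations is defined by
$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}.$$
If the value $x_i$ occurs $f_i$ times, for $i = 1, \dots, k$ ($k \le n$), then
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i x_i}{\sum_{i=1}^{k} f_i}.$$
Note: The total number of observations is $n = \sum_{i=1}^{k} f_i$.
For grouped data:
$$\bar{X} = \frac{\sum_{i=1}^{k} f_i m_i}{\sum_{i=1}^{k} f_i},$$
where $m_i$ is the mid point of the $i$-th class.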
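The three forms of the mean can be checked numerically with a short sketch. The raw scores and the (score, f) table are the ones from the frequency distribution slide. The grouped classes are hypothetical values chosen only to illustrate the mid point formula, and the function names are likewise assumptions.

```python
# Raw scores and their simple frequency table, as on the slide.
raw_scores = [2, 5, 8, 7, 2, 2, 6, 8, 5, 2, 5, 7, 4, 5, 6, 2, 8, 6]
freq_table = [(8, 3), (7, 2), (6, 3), (5, 4), (4, 1), (2, 5)]

# Hypothetical grouped data: (lower limit, upper limit, frequency).
hypothetical_classes = [(0, 10, 2), (10, 20, 5), (20, 30, 3)]


def mean_raw(values):
    """Sum of the observations divided by their number n."""
    return sum(values) / len(values)


def mean_frequency(pairs):
    """Sum of f_i * x_i divided by the sum of f_i."""
    n = sum(f for _, f in pairs)
    return sum(x * f for x, f in pairs) / n


def mean_grouped(classes):
    """Same as mean_frequency, using each class mid point m_i as the value."""
    n = sum(f for _, _, f in classes)
    return sum(((low + high) / 2) * f for low, high, f in classes) / n


print(mean_raw(raw_scores))                # 5.0
print(mean_frequency(freq_table))          # 5.0
print(mean_grouped(hypothetical_classes))  # 16.0
```

The raw and frequency-table means agree because the table is only a compact way of writing the same 18 observations. The grouped mean is an approximation, since every observation in a class is treated as if it sat at the mid point.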
Median
• The middle value when a variable’s values are ranked in order; the
point that divides a distribution into two equal halves.
• Percentiles and quartiles are scores which divide a distribution into specific proportions
• Percentiles = hundredths: P1, P2, P3, … P97, P98, P99
• Median is the 50th percentile.
• Quartiles = quarters
• 1st quartile 𝑄1 is the 25th percentile
• 2nd quartile 𝑄2 is the 50th percentile (median)
• 3rd quartile 𝑄3 is the 75th percentile
• To locate the $p$th percentile we use
$$L_p = (n + 1)\frac{p}{100}$$
where $L_p$ is the location of the percentile, $n$ is the number of observations and $p$ is the percentile.
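A minimal sketch of the location formula follows, applied to the 18 raw scores from the frequency distribution slide. When the location is not a whole number, the sketch interpolates linearly between the two neighbouring ranked values. That interpolation rule is an assumption, since the slides only give the location formula, and the function names are illustrative.

```python
import math

raw_scores = [2, 5, 8, 7, 2, 2, 6, 8, 5, 2, 5, 7, 4, 5, 6, 2, 8, 6]


def percentile_location(n, p):
    """Location L_p = (n + 1) * p / 100, counted from 1 in the ranked data."""
    return (n + 1) * p / 100


def percentile(values, p):
    """Value at location L_p, interpolating linearly (an assumed convention)."""
    ordered = sorted(values)
    n = len(ordered)
    loc = percentile_location(n, p)
    lower = math.floor(loc)   # rank of the observation at or just below L_p
    frac = loc - lower        # fractional part of the location
    if lower < 1:             # location falls before the first observation
        return ordered[0]
    if lower >= n:            # location falls at or after the last observation
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + frac * (above - below)


print(percentile_location(18, 50))  # 9.5, halfway between the 9th and 10th values
print(percentile(raw_scores, 25))   # Q1
print(percentile(raw_scores, 50))   # median
print(percentile(raw_scores, 75))   # Q3
```

Other textbooks and software packages use slightly different location rules, so quartiles computed elsewhere may differ a little from these.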
For grouped data
Mode
• The value that occurs most frequently.
Note:
• Mode is often uninformative for quantitative data with many
values.
• Modal class is the most frequently occurring class in a data set.
• Appropriate measures of location, by type of data:
• Numerical: mean, median, mode
• Ordinal scale: median, mode
• Nominal scale: mode
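Since the mode only needs counting, it works for nominal data as well as for numbers. The sketch below uses `statistics.multimode` from the Python standard library, which returns every value tied for the highest frequency. The raw scores come from the frequency distribution slide, and the smoking-status sample is hypothetical.

```python
from statistics import multimode

# Raw scores from the frequency distribution slide.
raw_scores = [2, 5, 8, 7, 2, 2, 6, 8, 5, 2, 5, 7, 4, 5, 6, 2, 8, 6]

# Hypothetical nominal sample, for illustration only.
smoking_status = ["smoker", "non-smoker", "non-smoker", "smoker", "non-smoker"]

score_modes = multimode(raw_scores)       # the score 2 occurs five times
status_modes = multimode(smoking_status)  # the most frequent category

print(score_modes)
print(status_modes)
```

A list is returned because a data set can have more than one mode.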
Measures of Dispersion / Measures of Variability
a) Range: maximum - minimum
b) Variance and standard deviation (S.D.)
These are applicable to quantitative data and are the most commonly used measures of variability.
Simple data:
Variance and Standard deviation(S.D)
Grouped data
Population variance
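The formulas for these measures did not survive in the slide text, so the sketch below uses the standard textbook definitions as an assumption: the sample variance divides the sum of squared deviations from the mean by n - 1, the population variance divides it by n, and the standard deviation is the square root of the variance. The data are the raw scores from the frequency distribution slide, and the function names are illustrative.

```python
import math

raw_scores = [2, 5, 8, 7, 2, 2, 6, 8, 5, 2, 5, 7, 4, 5, 6, 2, 8, 6]


def data_range(values):
    """Range: maximum minus minimum."""
    return max(values) - min(values)


def sum_squared_deviations(values):
    """Sum of squared deviations from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def sample_variance(values):
    """Assumed standard definition: divide by n - 1."""
    return sum_squared_deviations(values) / (len(values) - 1)


def population_variance(values):
    """Assumed standard definition: divide by n."""
    return sum_squared_deviations(values) / len(values)


print(data_range(raw_scores))
print(sample_variance(raw_scores), math.sqrt(sample_variance(raw_scores)))
print(population_variance(raw_scores), math.sqrt(population_variance(raw_scores)))
```

For these scores the mean is 5 and the squared deviations add up to 84, so the two variances are 84/17 and 84/18.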
c) Coefficient of variation (CV)
• CV compares the variability of 2 data sets.
• This is also applicable to numerical data sets.
• It is a measure of relative (proportionate) variation.
• It is important to use if the data sets are of very different magnitudes.
$$CV = \frac{\text{standard deviation}}{\text{mean}} \times 100$$
Thus CV expresses the S.D. as a percentage of the mean.
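A short sketch shows why CV is useful for data sets of very different magnitudes. Both data sets are hypothetical, and using the sample standard deviation (`statistics.stdev`) is an assumption, since the slides do not say which S.D. goes into the formula.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV = (standard deviation / mean) * 100, using the sample S.D. (assumed)."""
    return stdev(values) / mean(values) * 100


# Hypothetical data sets: the second is the first multiplied by 100.
small_values = [2, 4, 6]
large_values = [200, 400, 600]

print(stdev(small_values), stdev(large_values))  # S.D. differs by a factor of 100
print(coefficient_of_variation(small_values))    # about 50 percent
print(coefficient_of_variation(large_values))    # also about 50 percent
```

The standard deviations differ by a factor of 100, but the relative variation is the same, which is what the CV reports.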
d) Interquartile range (IQR) and interquartile deviation (IQD)
e) Mean Absolute Deviation
• Average of the absolute deviations from the mean

x x x
M . A.D. 
 x
5 -8 +8 n
24
9 -4 +4 
5
16 +3 +3  4 .8
17 +4 +4
18 +5 +5
0 24
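The worked example above can be reproduced with a few lines of Python. The five data values are the ones in the example, and the function name is an illustrative choice.

```python
def mean_absolute_deviation(values):
    """Average of the absolute deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


example = [5, 9, 16, 17, 18]   # values from the worked example

deviations = [x - sum(example) / len(example) for x in example]
print(deviations)                        # signed deviations, which sum to zero
print(mean_absolute_deviation(example))  # 24 / 5
```

The signed deviations always sum to zero, which is why the absolute values are taken before averaging.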
Measures of Skewness
• Skewness measures the lack of symmetry in a set of data about its
mean
[Figure: Skewness. Three distribution curves, negatively skewed, symmetric (not skewed) and positively skewed, with the positions of the mean, median and mode marked on each.]
If the distribution is positively skewed: Mode < Median < Mean
If the distribution is symmetric: Mode = Median = Mean
If the distribution is negatively skewed: Mean < Median < Mode
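The ordering rule can be turned into a rough check by comparing the mean with the median. This is only a sketch: it ignores the mode, the function name and tolerance parameter are assumptions, and the ordering is a rule of thumb that real data sets do not always follow.

```python
from statistics import mean, median


def skew_direction(values, tolerance=1e-9):
    """Classify skewness from the order of mean and median (rule of thumb only)."""
    difference = mean(values) - median(values)
    if difference > tolerance:
        return "positively skewed"   # mean pulled above the median
    if difference < -tolerance:
        return "negatively skewed"   # mean pulled below the median
    return "symmetric"


print(skew_direction([5, 9, 16, 17, 18]))  # mean 13 is below median 16
print(skew_direction([1, 2, 3]))           # mean equals median
print(skew_direction([1, 2, 10]))          # mean is above median 2
```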
