Week 01 Introduction
Week – 01
MKT3802 Statistical and
Experimental Methods for Engineers
Textbook
Probability & Statistics for Engineers & Scientists, Ronald E. Walpole, Raymond H.
Myers, Sharon L. Myers, Keying Ye, Pearson Prentice Hall, 8th Edition, 2007 or 9th
Edition, 2012
Outline
• Introduction
• Role of probability
• Measures of location (central tendency)
• Measures of variability (dispersion)
• Discrete and continuous data
Definitions
• Statistics is
– a science which helps us to collect, analyze and
present data systematically.
– the art of learning from data.
Importance
• Simplifies large masses of data
• Helps to extract concrete information from data
• Helps decision making
• Presents facts in a precise & definite form
• Facilitates comparison
• Facilitates predictions
Limitations
• Does not deal with individual items
• Deals only with quantitatively expressed items
• Conclusions drawn from the results are not universally
true
Application areas
• Economics
• Business
• Biology
• Engineering
• Improving product design
• Testing product performance
• Quality control
• Determining reliability
Branches of Statistics
• Descriptive Statistics
– 1st phase
– methods for collection, organization, presentation and
analysis of the data
– without any attempt to infer anything from the known
data
• Inferential (Inductive) Statistics
– 2nd phase
– Process of
• drawing conclusions (inferences)
• performing hypothesis testing
• determining relationships between variables
• making predictions
Examples
• Descriptive Statistics
– The daily average temperature of Istanbul was 8°C last
week
– The scores of 50 students in a Process Control exam are
found to range from 30 to 80.
Probability
• a measure of the likelihood that a particular event
will occur.
– If we are certain that an event will occur, its probability is
1 or 100%.
– If it certainly will not occur, its probability is zero.
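The two boundary cases can be checked with a minimal sketch. It assumes a fair six-sided die, which is a hypothetical example and not part of the slides, and uses the classical definition (favorable outcomes divided by equally likely outcomes); the helper name `probability` is also an assumption.

```python
from fractions import Fraction

def probability(event, sample_space):
    """Classical probability: favorable outcomes / equally likely outcomes."""
    favorable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favorable), len(sample_space))

die = [1, 2, 3, 4, 5, 6]  # assumed fair die (hypothetical example)

p_certain = probability(lambda x: x <= 6, die)     # event that always occurs
p_impossible = probability(lambda x: x == 7, die)  # event that never occurs
p_even = probability(lambda x: x % 2 == 0, die)    # event between the two extremes

print(p_certain, p_impossible, p_even)
```

Every other event falls between the certain and the impossible case, so its probability lies between 0 and 1.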
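Statistics vs. Probability
• There is a clear distinction between probability and inferential statistics
– Probability reasons from a known population to the sample (deductive reasoning)
– Inferential statistics reasons from the sample to the population (inductive reasoning)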
DoE Example
While we might draw conclusions about the role of humidity and the impact of
coating the specimens from the figure,
we cannot truly evaluate the results from an analytical point of view without taking
into account the variability around the average.
Measures of Location
(Central Tendency)
REMARK: Depending on the data, the median and mean can be quite different from
each other.
Measures of Location
Example: Suppose the data set is the following:
1.7, 2.2, 3.9, 3.11, and 11.5.
Median (n = 5, data sorted in increasing order): $\tilde{x} = x_{(n+1)/2} = x_{3} = 3.11$
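As a quick check of this example, the sketch below computes both the mean and the median of the same five values with Python's standard `statistics` module. The choice of Python is an assumption made for illustration; the variable names are not from the slides.

```python
import statistics

data = [1.7, 2.2, 3.9, 3.11, 11.5]  # data set from the example above

mean = statistics.mean(data)      # arithmetic average of all observations
median = statistics.median(data)  # middle value of the sorted sample (n is odd)

print(f"sorted: {sorted(data)}")
print(f"mean   = {mean:.2f}")
print(f"median = {median}")
```

The single large value 11.5 pulls the mean (about 4.48) above the median (3.11).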
The mean
• Pros
– Most commonly used measure of location
– Uses all the observations in the data set
– All observations have equal weight
• Cons
– Affected by extreme values that may not be
representative of the sample
The median
• Pros
– Always exists and is unique
– Not affected by extreme values
• Cons
– Sorting is required
– Uses only one or two observations
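The different sensitivity of the two measures to extreme values can be illustrated with a small sketch. It reuses the data set from the earlier example and adds a hypothetical variant in which the largest value is mistyped as 115.0; that variant is an assumption made only for this illustration.

```python
import statistics

sample = [1.7, 2.2, 3.9, 3.11, 11.5]         # data set from the earlier example
with_outlier = [1.7, 2.2, 3.9, 3.11, 115.0]  # hypothetical: largest value mistyped

for name, values in [("original", sample), ("with outlier", with_outlier)]:
    print(f"{name:>12}: mean = {statistics.mean(values):6.2f}, "
          f"median = {statistics.median(values):.2f}")
```

The mean jumps from about 4.48 to about 25.18, while the median stays at 3.11 in both cases.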
Other measures of location
• Other measures of location:
– The mode
– Percentiles/Quantiles
– Midrange
• Other types of mean:
– Geometric
– Harmonic
– Quadratic
– Trimmed
– Weighted
– Combination
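Several of these measures can be sketched with Python's standard `statistics` module (Python 3.8 or later is assumed for `geometric_mean` and `quantiles`). The data, the weights, and the helper `trimmed_mean` are hypothetical and chosen only for illustration.

```python
import statistics

data = [2.0, 4.0, 4.0, 8.0]     # hypothetical positive observations
weights = [1.0, 2.0, 2.0, 1.0]  # hypothetical weights for the weighted mean

mode = statistics.mode(data)                  # most frequent value
midrange = (min(data) + max(data)) / 2        # average of the two extremes
quartiles = statistics.quantiles(data, n=4)   # three cut points (quartiles)
geometric = statistics.geometric_mean(data)   # n-th root of the product
harmonic = statistics.harmonic_mean(data)     # n divided by the sum of reciprocals
quadratic = (sum(x * x for x in data) / len(data)) ** 0.5  # root mean square
weighted = sum(w * x for w, x in zip(weights, data)) / sum(weights)

def trimmed_mean(values, k):
    """Mean after dropping the k smallest and k largest observations."""
    ordered = sorted(values)
    return statistics.mean(ordered[k:len(ordered) - k])

trimmed = trimmed_mean(data, 1)

print(mode, midrange, quartiles)
print(geometric, harmonic, quadratic, weighted, trimmed)
```

For positive data the harmonic mean is never larger than the geometric mean, which is never larger than the arithmetic mean.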
Measures of Variability
(Dispersion)
• A measure of variability indicates how
observations are spread about the mean value
– Range
– Variance
– Standard deviation
– Coefficient of variation
Range
• The simplest one: $R = x_{\max} - x_{\min}$
• Pros
– Quick estimate of variance
– Easy calculation
• Cons
– Only uses extreme values
– The larger the sample, the less efficient the
range becomes
Variance and Std. Deviation
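For reference, the standard sample definitions are given below, on the assumption that they match the s² and s notation used in this lecture. Here $x_1, \dots, x_n$ are the observations and $\bar{x}$ is the sample mean.

```latex
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2,
\qquad
s = \sqrt{s^2}
```

The divisor is $n - 1$ rather than $n$, which makes $s^2$ an unbiased estimator of the population variance $\sigma^2$.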
Variance, s² or σ²
• Pros
– An efficient estimator
– Can be added and averaged
• Cons
– Calculation can be tedious without the aid of a
calculator or computer.
Standard Deviation, s
• Pros
– It is in the same dimensional unit as the observed
values
– An efficient estimator
• Cons
– Calculation can be tedious without the aid of a
calculator or computer.
Coefficient of Variation, CV
• Measure of relative dispersion
• Magnitude of variation relative to the size of the quantity
• $CV = \frac{s}{\bar{x}} \times 100$
• Pros
– Can be used to compare variation between two data sets with different
engineering units
• Cons
– Fails if the mean is close to zero
– Often misunderstood and misused
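The four measures of variability can be computed together, sketched here in Python with the standard `statistics` module rather than MATLAB. The data set is the one from the measures-of-location example; the variable names are assumptions made for illustration.

```python
import statistics

data = [1.7, 2.2, 3.9, 3.11, 11.5]  # data set from the location example

data_range = max(data) - min(data)           # R = x_max - x_min
variance = statistics.variance(data)         # sample variance s^2 (divides by n - 1)
std_dev = statistics.stdev(data)             # sample standard deviation s
cv = std_dev / statistics.mean(data) * 100   # coefficient of variation in percent

print(f"range    = {data_range:.2f}")
print(f"variance = {variance:.3f}")
print(f"std dev  = {std_dev:.3f}")
print(f"CV       = {cv:.1f} %")
```

`statistics.variance` and `statistics.stdev` are the sample versions; `statistics.pvariance` and `statistics.pstdev` divide by n and correspond to the population quantities.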
MATLAB example for
variability
Discrete and Continuous
variables
Data
• Qualitative (Categorical)
– Non-numeric
– Numeric
• Quantitative
– Numeric
• Discrete
• Continuous
Graphical Diagnostics
• Scatter Plot
• Stem-and-Leaf Plot
• Histogram Plot
• Box-and-Whisker Plot or Box Plot
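Of these diagnostics, the stem-and-leaf plot is simple enough to sketch in plain text. The helper `stem_and_leaf` and the scores below are hypothetical; the sketch assumes non-negative integer data, with the tens digit as the stem and the units digit as the leaf.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return stem-and-leaf rows for non-negative integers (stem = tens, leaf = units)."""
    rows = defaultdict(list)
    for value in sorted(values):
        rows[value // 10].append(value % 10)
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(rows.items())
    ]

scores = [30, 42, 47, 55, 58, 58, 61, 64, 73, 80]  # hypothetical exam scores

for row in stem_and_leaf(scores):
    print(row)
```

Read sideways, the rows form a rough histogram that still shows every individual observation.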
Scatter Plot
Histogram Plot
[Figure: histograms illustrating negative skewness and positive skewness]
Self Study
• Read and try to understand
– Example 1.1
– Example 1.2
– Example 1.3
– Example 1.4
in Walpole.