
Statistical and Mathematical Methods For Data Analysis

This document presents lecture material on statistical and mathematical methods for data analysis, taught by Dr. Faisal Bukhari at Punjab University College of Information Technology. It lists the course textbooks and reference books, defines the probability mass function, probability density function, and cumulative distribution function, and works through examples of calculating probabilities for discrete and continuous random variables.



Dr. Faisal Bukhari
Punjab University College of Information Technology
(PUCIT)
Textbooks

• Probability & Statistics for Engineers & Scientists, Ninth Edition, Ronald E. Walpole, Raymond H. Myers

• Elementary Statistics: Picturing the World, 6th Edition, Ron Larson and Betsy Farber

• Elementary Statistics, 13th Edition, Mario F. Triola

Dr. Faisal Bukhari, PUCIT, PU, Lahore 2


Reference books

• Probability and Statistical Inference, Ninth Edition, Robert V. Hogg, Elliot A. Tanis, Dale L. Zimmerman

• Probability Demystified, Allan G. Bluman

• Practical Statistics for Data Scientists: 50 Essential Concepts, Peter Bruce and Andrew Bruce

• Schaum's Outline of Probability, Second Edition, Seymour Lipschutz, Marc Lipson

• Python for Probability, Statistics, and Machine Learning, José Unpingco

References

Readings for these lecture notes:

• Probability & Statistics for Engineers & Scientists, Ninth Edition, Ronald E. Walpole, Raymond H. Myers

These notes contain material from the above book.


Discrete Probability Distribution
The set of ordered pairs (x, f(x)) is a probability function, probability mass function, or probability distribution of the discrete random variable X if, for each possible outcome x,

1. f(x) ≥ 0,

2. $\sum_{x} f(x) = 1$,

3. P(X = x) = f(x).


Example: A shipment of 20 similar laptop computers to
a retail outlet contains 3 that are defective. If a school
makes a random purchase of 2 of these computers,
find the probability distribution for the number of
defectives.



Let X represent the number of defective computers. Here N = 20, n = 2, and k = 3, and X follows the hypergeometric distribution

$P(X = x) = h(x; N, n, k) = \dfrac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}, \quad \max\{0,\, n-(N-k)\} \le x \le \min\{n,\, k\}.$

The range of x is
max{0, n − (N − k)} = max{0, 2 − (20 − 3)} = max{0, −17} = 0,
min{n, k} = min{2, 3} = 2.


Probability Distribution

x    P(X = x)
0    68/95
1    51/190
2    3/190

$\sum_{x} P(X = x) = 1$
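
The table above can be reproduced with a few lines of Python. This is only a minimal sketch: the helper name `hypergeom_pmf` is an illustrative choice, not something from the lecture, and exact fractions are used so the results can be compared directly with the table.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(x, N, n, k):
    """h(x; N, n, k): probability of x defectives in a sample of n drawn
    without replacement from N items, k of which are defective."""
    lo, hi = max(0, n - (N - k)), min(n, k)
    if not lo <= x <= hi:
        return Fraction(0)
    return Fraction(comb(k, x) * comb(N - k, n - x), comb(N, n))


N, n, k = 20, 2, 3  # values from the laptop example
dist = {x: hypergeom_pmf(x, N, n, k) for x in range(min(n, k) + 1)}

for x, p in dist.items():
    print(x, p)          # 0 68/95, then 1 51/190, then 2 3/190
print("total:", sum(dist.values()))  # total: 1
```

Using `Fraction` rather than floats keeps the arithmetic exact, so the check that the probabilities sum to 1 holds without rounding error.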


Example : If a car agency sells 50% of its inventory of a
certain foreign car equipped with side airbags, find a
formula for the probability distribution of the number
of cars with side airbags among the next 4 cars sold by
the agency.



$b(x; n, p) = \binom{n}{x} p^{x} q^{n-x}, \quad x = 0, 1, 2, \ldots, n$

Here n = 4, p = 0.50, and q = 1 − p = 0.50. Let x denote the number of cars with side airbags. Then

$b(x; 4, 0.50) = \binom{4}{x} (0.50)^{x} (0.50)^{4-x} = \binom{4}{x} (0.50)^{4}, \quad x = 0, 1, 2, 3, 4,$

so that

$b(x; 4, 0.50) = \dfrac{1}{16}\binom{4}{x}, \quad x = 0, 1, 2, 3, 4.$
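
As a quick check of the simplification above, the general binomial formula and the closed form (1/16)·C(4, x) can be compared term by term. The sketch below is illustrative only; the function name `binom_pmf` is an assumption, not part of the lecture.

```python
from fractions import Fraction
from math import comb


def binom_pmf(x, n, p):
    """b(x; n, p) = C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)


n, p = 4, Fraction(1, 2)  # next 4 cars sold, 50% with side airbags
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]
closed_form = [Fraction(comb(4, x), 16) for x in range(5)]

print(pmf == closed_form)  # True
print(sum(pmf))            # 1
```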


Cumulative Distribution Function
The cumulative distribution function F(x) of a discrete random variable X with probability distribution f(x) is

$F(x) = P(X \le x) = \sum_{t \le x} f(t), \quad \text{for } -\infty < x < \infty.$


Example: A stockroom clerk returns three safety helmets at random to three steel mill employees who had previously checked them. If Smith, Jones, and Brown, in that order, receive one of the three helmets, list the sample points for the possible orders of returning the helmets, and find the value m of the random variable M that represents the number of correct matches.


If S, J, and B stand for Smith’s, Jones’s, and Brown’s helmets, respectively, then the possible arrangements in which the helmets may be returned and the number of correct matches are

Sample space    m
SJB             3
SBJ             1
JSB             1
BJS             1
JBS             0
BSJ             0


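For the random variable M, the number of correct matches in the previous example, the six equally likely arrangements give f(0) = 2/6, f(1) = 3/6, and f(3) = 1/6, so

$F(2) = P(M \le 2) = f(0) + f(1) = \dfrac{2}{6} + \dfrac{3}{6} = \dfrac{5}{6}.$

The cumulative distribution function of M is

$F(m) = \begin{cases} 0, & \text{for } m < 0, \\ \dfrac{1}{3}, & \text{for } 0 \le m < 1, \\ \dfrac{5}{6}, & \text{for } 1 \le m < 3, \\ 1, & \text{for } m \ge 3. \end{cases}$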
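
The helmet example is small enough to enumerate by brute force. The following sketch, with illustrative names such as `owners` and `cdf` that are not from the lecture, lists all six arrangements, counts the matches, and builds the distribution and CDF of M from those counts.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

owners = "SJB"  # Smith, Jones, Brown receive helmets in this order

# For each arrangement, count positions where the helmet matches its owner.
counts = Counter(
    sum(helmet == owner for helmet, owner in zip(perm, owners))
    for perm in permutations(owners)
)
total = sum(counts.values())  # 6 equally likely arrangements
pmf = {m: Fraction(c, total) for m, c in sorted(counts.items())}


def cdf(m):
    """F(m) = P(M <= m), summing f(t) over all t <= m."""
    return sum((p for value, p in pmf.items() if value <= m), Fraction(0))


print(pmf)     # {0: Fraction(1, 3), 1: Fraction(1, 2), 3: Fraction(1, 6)}
print(cdf(2))  # 5/6
```

Note that M never takes the value 2: if two helmets are returned correctly, the third must be correct as well, which is why F(m) stays at 5/6 on the whole interval 1 ≤ m < 3.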


Example: Find the cumulative distribution function of the random variable X in

$f(x) = \dfrac{1}{16}\binom{4}{x}, \quad x = 0, 1, 2, 3, 4.$

Using F(x), verify that f(2) = 3/8.


$f(x) = \dfrac{1}{16}\binom{4}{x}, \quad x = 0, 1, 2, 3, 4$

$f(0) = \dfrac{1}{16}$

$f(1) = \dfrac{4}{16}$

$f(2) = \dfrac{6}{16}$

$f(3) = \dfrac{4}{16}$

$f(4) = \dfrac{1}{16}$

$F(x) = P(X \le x) = \sum_{t \le x} f(t), \quad \text{for } -\infty < x < \infty$

$F(0) = P(X \le 0) = f(0) = \dfrac{1}{16},$

$F(1) = P(X \le 1) = f(0) + f(1) = \dfrac{1}{16} + \dfrac{4}{16} = \dfrac{5}{16}, \qquad (1)$

$F(2) = P(X \le 2) = f(0) + f(1) + f(2) = \dfrac{1}{16} + \dfrac{4}{16} + \dfrac{6}{16} = \dfrac{11}{16}, \qquad (2)$

$F(3) = P(X \le 3) = f(0) + f(1) + f(2) + f(3) = \dfrac{1}{16} + \dfrac{4}{16} + \dfrac{6}{16} + \dfrac{4}{16} = \dfrac{15}{16},$

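$F(4) = P(X \le 4) = f(0) + f(1) + f(2) + f(3) + f(4) = \dfrac{1}{16} + \dfrac{4}{16} + \dfrac{6}{16} + \dfrac{4}{16} + \dfrac{1}{16} = \dfrac{16}{16} = 1$

$\therefore F(x) = \begin{cases} 0, & \text{for } x < 0, \\ \dfrac{1}{16}, & \text{for } 0 \le x < 1, \\ \dfrac{5}{16}, & \text{for } 1 \le x < 2, \\ \dfrac{11}{16}, & \text{for } 2 \le x < 3, \\ \dfrac{15}{16}, & \text{for } 3 \le x < 4, \\ 1, & \text{for } x \ge 4. \end{cases}$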


(2) − (1):

$f(2) = F(2) - F(1) = \dfrac{11}{16} - \dfrac{5}{16} = \dfrac{6}{16} = \dfrac{3}{8}$
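
The two directions used in this example, building F from f by running sums and recovering f from F by differencing, can be sketched as follows. The variable names are illustrative assumptions, and the code covers only the integer support points x = 0, ..., 4.

```python
from fractions import Fraction
from itertools import accumulate
from math import comb

# f(x) = (1/16) * C(4, x) on the support x = 0, 1, 2, 3, 4
f = [Fraction(comb(4, x), 16) for x in range(5)]

# F[x] = f(0) + ... + f(x): running sum of the probabilities
F = list(accumulate(f))

# Recover the probabilities from the CDF: f(x) = F(x) - F(x - 1)
recovered = [F[0]] + [F[x] - F[x - 1] for x in range(1, 5)]

print(F[1], F[2])   # 5/16 11/16
print(F[2] - F[1])  # 3/8
print(recovered == f)  # True
```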


[Figure: Probability mass function plot vs. probability histogram]


[Figure: Discrete cumulative distribution function]

Continuous Probability Distributions
• A continuous random variable has a probability of 0 of assuming exactly any of its values.

• Consequently, its probability distribution cannot be given in tabular form.


Continuous Probability Distributions
We shall concern ourselves with computing probabilities for various intervals of continuous random variables such as P(a < X < b), P(W ≥ c), and so forth.

Note that when X is continuous,
P(a < X ≤ b) = P(a < X < b) + P(X = b) = P(a < X < b).

That is, it does not matter whether we include an endpoint of the interval or not.

This is not true, though, when X is discrete.

Because areas will be used to represent probabilities and probabilities are nonnegative numerical values, the density function must lie on or above the x axis.

[Figure: Typical density functions]

Probability Density Function
The function f(x) is a probability density function (pdf) for the continuous random variable X, defined over the set of real numbers, if

1. f(x) ≥ 0, for all x ∈ R.

2. $\int_{-\infty}^{\infty} f(x)\,dx = 1.$

3. $P(a < X < b) = \int_{a}^{b} f(x)\,dx.$


Example: Suppose that the error in the reaction temperature, in °C, for a controlled laboratory experiment is a continuous random variable X having the probability density function

$f(x) = \begin{cases} \dfrac{x^{2}}{3}, & -1 < x < 2, \\ 0, & \text{elsewhere.} \end{cases}$

(a) Verify that f(x) is a density function.
(b) Find P(0 < X ≤ 1).

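(a) Since x² ≥ 0, we have f(x) ≥ 0 for all x. It remains to check that $\int_{-\infty}^{\infty} f(x)\,dx = 1$:

$\text{LHS} = \int_{-1}^{2} \dfrac{x^{2}}{3}\,dx = \left[\dfrac{x^{3}}{9}\right]_{-1}^{2} = \dfrac{(2)^{3} - (-1)^{3}}{9} = \dfrac{9}{9} = 1 = \text{RHS}$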


(b) $P(0 < X \le 1) = \int_{0}^{1} \dfrac{x^{2}}{3}\,dx = \left[\dfrac{x^{3}}{9}\right]_{0}^{1} = \dfrac{(1)^{3} - (0)^{3}}{9} = \dfrac{1}{9}$
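
Both integrals can also be checked numerically. The sketch below uses a simple midpoint rule written by hand; the helper `integrate` and the step count are illustrative assumptions, and the results are approximations rather than exact values.

```python
def f(x):
    """Density from the example: x^2 / 3 on (-1, 2), zero elsewhere."""
    return x * x / 3 if -1 < x < 2 else 0.0


def integrate(g, a, b, steps=100_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


total = integrate(f, -1, 2)  # should be close to 1
prob = integrate(f, 0, 1)    # should be close to 1/9

print(round(total, 6))
print(round(prob, 6), round(1 / 9, 6))
```

Because X is continuous, including or excluding the endpoint x = 1 does not change the integral, so the same computation covers P(0 < X ≤ 1) and P(0 < X < 1).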


Cumulative Distribution Function
The cumulative distribution function F(x) of a continuous random variable X with density function f(x) is

$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt, \quad \text{for } -\infty < x < \infty.$

$P(a < X < b) = F(b) - F(a)$ and $f(x) = \dfrac{dF(x)}{dx}$, if the derivative exists.


Example: For the density function

$f(x) = \begin{cases} \dfrac{x^{2}}{3}, & -1 < x < 2, \\ 0, & \text{elsewhere,} \end{cases}$

find F(x), and use it to evaluate P(0 < X ≤ 1).


$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt, \quad \text{for } -\infty < x < \infty$

For −1 < x < 2,

$F(x) = \int_{-1}^{x} \dfrac{t^{2}}{3}\,dt = \left[\dfrac{t^{3}}{9}\right]_{-1}^{x} = \dfrac{x^{3} - (-1)^{3}}{9} = \dfrac{x^{3} + 1}{9}$


$F(x) = \begin{cases} 0, & \text{for } x < -1, \\ \dfrac{x^{3} + 1}{9}, & \text{for } -1 \le x < 2, \\ 1, & \text{for } x \ge 2. \end{cases}$


P(0 < X ≤ 1) = F(1) − F(0)

$F(1) = \dfrac{1^{3} + 1}{9} = \dfrac{2}{9}$

$F(0) = \dfrac{0^{3} + 1}{9} = \dfrac{1}{9}$

$P(0 < X \le 1) = \dfrac{2}{9} - \dfrac{1}{9} = \dfrac{1}{9}$
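
The piecewise CDF translates directly into a small function, and interval probabilities then reduce to a subtraction. This is a minimal sketch; the names `F` and `prob_between` are illustrative, and exact fractions are used so the result matches 1/9 exactly.

```python
from fractions import Fraction


def F(x):
    """CDF for the density f(x) = x^2 / 3 on (-1, 2)."""
    if x < -1:
        return Fraction(0)
    if x < 2:
        return (Fraction(x) ** 3 + 1) / 9
    return Fraction(1)


def prob_between(a, b):
    """P(a < X <= b) = F(b) - F(a)."""
    return F(b) - F(a)


print(F(1), F(0))          # 2/9 1/9
print(prob_between(0, 1))  # 1/9
```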


Example: The Department of Energy (DOE) puts projects out on bid and generally estimates what a reasonable bid should be. Call the estimate b. The DOE has determined that the density function of the winning (low) bid is

$f(y) = \begin{cases} \dfrac{5}{8b}, & \dfrac{2}{5}b \le y \le 2b, \\ 0, & \text{elsewhere.} \end{cases}$

Find F(y) and use it to determine the probability that the winning bid is less than the DOE’s preliminary estimate b.


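$F(y) = P(Y \le y) = \int_{-\infty}^{y} f(t)\,dt, \quad \text{for } -\infty < y < \infty$

For $\dfrac{2}{5}b \le y \le 2b$,

$F(y) = \int_{2b/5}^{y} \dfrac{5}{8b}\,dt = \left[\dfrac{5t}{8b}\right]_{2b/5}^{y} = \dfrac{5}{8b}\,y - \dfrac{5}{8b}\left(\dfrac{2}{5}b\right) = \dfrac{5y}{8b} - \dfrac{1}{4}$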


$F(y) = \begin{cases} 0, & y < \dfrac{2}{5}b, \\ \dfrac{5y}{8b} - \dfrac{1}{4}, & \dfrac{2}{5}b \le y < 2b, \\ 1, & y \ge 2b. \end{cases}$


To determine the probability that the winning bid is less than the preliminary bid estimate b, we have

$F(y) = \dfrac{5y}{8b} - \dfrac{1}{4}$

$\Rightarrow F(b) = \dfrac{5b}{8b} - \dfrac{1}{4} = \dfrac{5}{8} - \dfrac{1}{4}$

$\therefore P(Y \le b) = F(b) = \dfrac{5}{8} - \dfrac{1}{4} = \dfrac{3}{8}$
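
The result 3/8 does not depend on the actual size of the estimate b, since b cancels in F(b). The sketch below illustrates this by evaluating the CDF at y = b for a few made-up values of b; the function name `bid_cdf` and the sample values are assumptions for illustration only.

```python
from fractions import Fraction


def bid_cdf(y, b):
    """CDF of the winning bid, uniform on [2b/5, 2b] with density 5/(8b)."""
    lo, hi = Fraction(2, 5) * b, 2 * b
    if y < lo:
        return Fraction(0)
    if y < hi:
        return Fraction(5, 8) * Fraction(y) / b - Fraction(1, 4)
    return Fraction(1)


# Hypothetical estimates; any positive b gives the same probability.
for b in (1, 10, Fraction(7, 3)):
    print(b, bid_cdf(b, b))  # each line ends with 3/8
```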
