Statistics and Probability Solved Assignments - Semester Fall 2008
Question 1:
(Marks: 16)
Write short notes on the following:
Solution:
Question 2:
(Marks: 4)
State which of the following represent qualitative data and which represent
quantitative data.
i) Religion of the people of the country (qualitative data)
ii) Fee of VU students (quantitative data)
iii) The majority of the population likes Geo TV (qualitative data)
iv) Inches of rainfall in Lahore city during the last year (quantitative data)
Note:
Question 3:
(Marks: 10)
The following data are the weights in pounds of 42 students of Virtual University.
Construct a stem-and-leaf display of the data.
Solution:
The stem-and-leaf display of the data is shown below.
Stem Leaf
12 5 6 8
13 8 2 6 5 5 5 5
14 4 9 6 0 7 8 4 6 2 0 5 2 5 7
15 0 7 8 2 4 3 0 6 4
16 4 8 3 5 1
17 6 3 1
18 9
Arranging the leaves of each stem in ascending order gives the ordered display:
Stem Leaf
12 5 6 8
13 2 5 5 5 5 6 8
14 0 0 2 2 4 4 5 5 6 6 7 7 8 9
15 0 0 2 3 4 4 6 7 8
16 1 3 4 5 8
17 1 3 6
18 9
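The ordering step can be checked mechanically. The sketch below is only an illustration: the list `weights` is read back from the leaves of the first display (the raw data table is not reproduced here), and the helper name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

# Weights read back from the unordered display (stem = hundreds and tens, leaf = units).
# This list is an assumption reconstructed from the display, not the original data table.
weights = [
    125, 126, 128,
    138, 132, 136, 135, 135, 135, 135,
    144, 149, 146, 140, 147, 148, 144, 146, 142, 140, 145, 142, 145, 147,
    150, 157, 158, 152, 154, 153, 150, 156, 154,
    164, 168, 163, 165, 161,
    176, 173, 171,
    189,
]


def stem_and_leaf(values):
    """Return {stem: sorted leaves}, splitting each value at its units digit."""
    rows = defaultdict(list)
    for value in values:
        stem, leaf = divmod(value, 10)
        rows[stem].append(leaf)
    return {stem: sorted(leaves) for stem, leaves in sorted(rows.items())}


display = stem_and_leaf(weights)
for stem, leaves in display.items():
    print(stem, " ".join(str(leaf) for leaf in leaves))
```

Each printed row should agree with the corresponding row of the ordered display.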
Assignment 2
Question 1:
a) What is the difference between Chebyshev’s inequality and the empirical rule (in
terms of skewness)?
Solution:
Chebyshev’s inequality and the empirical rule both tell us the proportion of data values
that lie within a specified number of standard deviations of the mean.
Chebyshev’s inequality is a general rule that holds for all distributions, symmetric
or skewed; it gives a lower bound on that proportion.
The empirical rule, in contrast, applies only to symmetric, bell-shaped distributions.
b) The share prices of a company in the Lahore and Islamabad markets during the last
ten months are recorded below:
Month      Jan  Feb  March  April  May  Jun  July  Aug  Sep  Oct
Lahore     105  120  115    118    130  127  109   110  104  112
Islamabad  108  117  120    130    100  125  125   120  110  135
In which market are the share prices more stable?
Solution:
To compare stability we compute the coefficient of variation (C.V.) for both
markets; the market with the smaller C.V. has the more stable prices.
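For Lahore ($x$), with $n = 10$, $\sum x = 1150$ and $\sum x^2 = 132944$:
$$\bar{x} = \frac{\sum x}{n} = \frac{1150}{10} = 115$$
$$S_{\text{Lahore}} = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2} = \sqrt{\frac{132944}{10} - (115)^2} = 8.33$$
$$\text{C.V.}_{\text{Lahore}} = \frac{S_{\text{Lahore}}}{\bar{x}} \times 100 = \frac{8.33}{115} \times 100 = 7.24$$
For Islamabad ($y$), with $n = 10$, $\sum y = 1190$ and $\sum y^2 = 142628$:
$$\bar{y} = \frac{\sum y}{n} = \frac{1190}{10} = 119$$
$$S_{\text{Islamabad}} = \sqrt{\frac{\sum y^2}{n} - \left(\frac{\sum y}{n}\right)^2} = \sqrt{\frac{142628}{10} - (119)^2} = 10.09$$
$$\text{C.V.}_{\text{Islamabad}} = \frac{S_{\text{Islamabad}}}{\bar{y}} \times 100 = \frac{10.09}{119} \times 100 = 8.48$$
Since the C.V. for Lahore (7.24%) is smaller than the C.V. for Islamabad (8.48%), the
share prices are more stable in the Lahore market.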
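The comparison of the two markets can be reproduced in a few lines. This is a minimal sketch that assumes the population standard deviation (divisor n), which is what the figures S = 8.33 and S = 10.09 correspond to; the function name `coefficient_of_variation` is illustrative.

```python
from statistics import mean, pstdev

lahore = [105, 120, 115, 118, 130, 127, 109, 110, 104, 112]
islamabad = [108, 117, 120, 130, 100, 125, 125, 120, 110, 135]


def coefficient_of_variation(prices):
    """C.V. in percent, using the population standard deviation (divisor n)."""
    return pstdev(prices) / mean(prices) * 100


cv_lahore = coefficient_of_variation(lahore)
cv_islamabad = coefficient_of_variation(islamabad)
print(round(cv_lahore, 2), round(cv_islamabad, 2))
```

The smaller of the two values identifies the more stable market, which here is Lahore.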
x S 6.97
2
n 10 n n 10
Solution:
What are moments, and why do we use them?
Moments are descriptive constants of a distribution, computed as averages of powers of
the deviations of the values from the mean (or from another chosen origin). They are
used for testing the symmetry and normality of the distribution.
What is meant by kurtosis?
Kurtosis refers to the degree of peakedness of a distribution.
Leptokurtic:
A distribution having a relatively high peak is called a leptokurtic distribution.
Platykurtic:
A distribution which is flat-topped is called a platykurtic distribution.
Mesokurtic (normal) distribution:
A distribution which is neither very peaked nor very flat is called a mesokurtic
distribution; the normal distribution is mesokurtic.
Regression:
Regression investigates the dependence of one (dependent) variable on another
(independent) variable.
Regressor:
The independent or non-random variable is also referred to as the regressor,
the predictor, the regression variable or the explanatory variable.
Regressand:
The dependent or random variable is also referred to as the regressand, the
predictand, the response or the explained variable.
Question 2:
If a distribution has mean 1403 and mode 1487, what can you say about its
skewness?
Solution:
Mean = 1403
Mode = 1487
The distribution is negatively skewed, because
Mean < Mode (the mean is pulled towards the longer left tail).
Question 3:
a) Distinguish between permutation and combination.
b) The first four moments of a certain distribution about Y = 17.5 are 0.3, 74, 45
and 12125 respectively. Find out whether the distribution is leptokurtic or
platykurtic.
Solution:
a. Permutation:
A permutation is an arrangement of all or part of a set of objects in a
definite order. The number of permutations of n distinct objects taken r
at a time is
$$^{n}P_{r} = \frac{n!}{(n-r)!}$$
Combination:
A combination is a selection of objects without regard to their order. The
number of combinations of n distinct objects taken r at a time is
$$^{n}C_{r} = \frac{n!}{r!\,(n-r)!}$$
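No worked solution for part (b) appears in this text. A minimal sketch of how it can be checked is given here, assuming the standard conversion from moments about an arbitrary origin to moments about the mean and the moment coefficient of kurtosis b2 = mu4 / mu2^2 (equal to 3 for a normal curve). The variable names are illustrative.

```python
# Moments about the arbitrary origin Y = 17.5, as given in part (b).
m1, m2, m3, m4 = 0.3, 74, 45, 12125

# Standard conversion to moments about the mean (central moments).
mu2 = m2 - m1**2
mu3 = m3 - 3 * m1 * m2 + 2 * m1**3
mu4 = m4 - 4 * m1 * m3 + 6 * m1**2 * m2 - 3 * m1**4

# Moment coefficient of kurtosis: above 3 is leptokurtic, below 3 is platykurtic.
b2 = mu4 / mu2**2
if b2 > 3:
    shape = "leptokurtic"
elif b2 < 3:
    shape = "platykurtic"
else:
    shape = "mesokurtic"

print(round(mu2, 2), round(mu4, 2), round(b2, 2), shape)
```

With these figures b2 is about 2.22, which is below 3, so the distribution comes out platykurtic.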
An urn contains 5 white and 7 black balls. Five balls are drawn at random.
a) Find the distribution function of the probability distribution of the number of white balls.
b) Draw the graph of the distribution function.
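a. Let X be the random variable representing the number of white balls drawn. X takes
the values 0, 1, 2, 3, 4, 5. With N = 12 balls in the urn and n = 5 balls drawn, the
probabilities are:
$$P(X=0) = \frac{^{5}C_{0} \cdot {}^{7}C_{5}}{^{12}C_{5}} = \frac{21}{792} = \frac{7}{264}$$
$$P(X=1) = \frac{^{5}C_{1} \cdot {}^{7}C_{4}}{^{12}C_{5}} = \frac{175}{792}$$
$$P(X=2) = \frac{^{5}C_{2} \cdot {}^{7}C_{3}}{^{12}C_{5}} = \frac{350}{792} = \frac{175}{396}$$
$$P(X=3) = \frac{^{5}C_{3} \cdot {}^{7}C_{2}}{^{12}C_{5}} = \frac{210}{792} = \frac{35}{132}$$
$$P(X=4) = \frac{^{5}C_{4} \cdot {}^{7}C_{1}}{^{12}C_{5}} = \frac{35}{792}$$
$$P(X=5) = \frac{^{5}C_{5} \cdot {}^{7}C_{0}}{^{12}C_{5}} = \frac{1}{792}$$
Probability Distribution of X
x      0       1        2        3        4       5
P(x)   21/792  175/792  350/792  210/792  35/792  1/792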
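In order to obtain the distribution function of the probability distribution, we compute the
cumulative probabilities as follows:
$$F(x) = \begin{cases}
0 & \text{for } x < 0 \\
21/792 & \text{for } 0 \le x < 1 \\
196/792 & \text{for } 1 \le x < 2 \\
546/792 & \text{for } 2 \le x < 3 \\
756/792 & \text{for } 3 \le x < 4 \\
791/792 & \text{for } 4 \le x < 5 \\
1 & \text{for } x \ge 5
\end{cases}$$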
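The probabilities and the cumulative values can be verified with exact fractions. This is a sketch only; the names `pmf` and `cdf` are illustrative, and the counting formula is the hypergeometric one used for the urn.

```python
from fractions import Fraction
from math import comb

WHITE, BLACK, DRAWN = 5, 7, 5


def pmf(x):
    """P(X = x): x white balls among DRAWN balls taken without replacement."""
    return Fraction(comb(WHITE, x) * comb(BLACK, DRAWN - x), comb(WHITE + BLACK, DRAWN))


def cdf(x):
    """F(x) = P(X <= x) for any real x (a step function)."""
    return sum((pmf(k) for k in range(DRAWN + 1) if k <= x), Fraction(0))


for k in range(DRAWN + 1):
    # Fractions print in lowest terms, e.g. 21/792 appears as 7/264.
    print(k, pmf(k), cdf(k))
```

Because `cdf` accepts any real argument, evaluating it between integers (for example at 3.5) shows the flat steps that the graph in part (b) would display.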
b.
Question 2
Three balls are drawn at random from a box containing 3 blue balls, 2 red balls and 3
green balls. Let X be the number of blue balls drawn and Y the number of red balls drawn.
a) Find the joint distribution of X and Y.
b) Find f(x|1).
c) Find P(X=2|Y=0).
Solution:
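The joint probability distribution is determined as follows:
$$f(x, y) = P(X = x, Y = y) = \frac{^{3}C_{x} \cdot {}^{2}C_{y} \cdot {}^{3}C_{3-x-y}}{^{8}C_{3}}$$
where x = 0, 1, 2, 3 and y = 0, 1, 2, with x + y ≤ 3. For example,
$$f(0, 0) = \frac{^{3}C_{0} \cdot {}^{2}C_{0} \cdot {}^{3}C_{3}}{^{8}C_{3}} = \frac{1}{56}$$
$$f(0, 1) = \frac{^{3}C_{0} \cdot {}^{2}C_{1} \cdot {}^{3}C_{2}}{^{8}C_{3}} = \frac{6}{56}$$
$$f(1, 0) = \frac{^{3}C_{1} \cdot {}^{2}C_{0} \cdot {}^{3}C_{2}}{^{8}C_{3}} = \frac{9}{56}$$
Similarly we can find the remaining probabilities.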
a. Joint distribution of X and Y
Y \ X   0      1      2      3     h(y)
0       1/56   9/56   9/56   1/56  20/56
1       6/56   18/56  6/56   0     30/56
2       3/56   3/56   0      0     6/56
g(x)    10/56  30/56  15/56  1/56  1
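For Part (b):
$$f(x \mid 1) = \frac{f(x, 1)}{h(1)}$$
We first have to find h(1):
h(1) = f(0,1) + f(1,1) + f(2,1) + f(3,1)
= 6/56 + 18/56 + 6/56 + 0 = 30/56
Then,
$$f(x \mid 1) = \frac{56}{30} f(x, 1)$$
$$f(0 \mid 1) = \frac{56}{30} f(0, 1) = \frac{56}{30}\left(\frac{6}{56}\right) = \frac{1}{5}$$
$$f(1 \mid 1) = \frac{56}{30} f(1, 1) = \frac{56}{30}\left(\frac{18}{56}\right) = \frac{3}{5}$$
$$f(2 \mid 1) = \frac{56}{30} f(2, 1) = \frac{56}{30}\left(\frac{6}{56}\right) = \frac{1}{5}$$
$$f(3 \mid 1) = \frac{56}{30}(0) = 0$$
x        0    1    2    3
f(x|1)   1/5  3/5  1/5  0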
c.
$$P(X = 2 \mid Y = 0) = \frac{f(2, 0)}{h(0)} = \frac{9/56}{20/56} = \frac{9}{20}$$
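The joint table and both conditional results can be regenerated from the counting formula. The following is a sketch with illustrative function names (`f`, `h`, `conditional_x_given_y`), using exact fractions so the values can be compared directly with the table.

```python
from fractions import Fraction
from math import comb


def f(x, y):
    """Joint probability of x blue and y red balls in 3 draws from 3 blue, 2 red, 3 green."""
    green = 3 - x - y
    if green < 0:
        return Fraction(0)  # more than three balls would be needed
    return Fraction(comb(3, x) * comb(2, y) * comb(3, green), comb(8, 3))


def h(y):
    """Marginal probability of y red balls."""
    return sum((f(x, y) for x in range(4)), Fraction(0))


def conditional_x_given_y(x, y):
    """f(x | y) = f(x, y) / h(y)."""
    return f(x, y) / h(y)


print([conditional_x_given_y(x, 1) for x in range(4)])
print(conditional_x_given_y(2, 0))
```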
Assignment 5
Question 1:
Define Poisson process.
Sol:
A Poisson process represents a situation where events occur randomly over a
specified interval of time or space or length.
a) Given a random variable X with E(X) = 0.63 and Var(X) = 0.2331, find $E(X^2)$.
Sol:
E(X) = 0.63 and Var(X) = 0.2331
$$\operatorname{Var}(X) = E(X^2) - [E(X)]^2$$
$$0.2331 = E(X^2) - 0.3969$$
$$0.2331 + 0.3969 = E(X^2)$$
$$E(X^2) = 0.63$$
Question 2:
a) When do we deal with the discrete uniform distribution?
Sol:
Whenever the various outcomes of a situation are equally likely, and the random
variable X takes the values 0, 1, 2, ..., n, we are dealing with the discrete
uniform distribution.
I. larger than 54
II. Smaller than 57.
Sol:
With $\mu = 50$ and $\sigma^2 = 25$ (so that $\sigma = 5$), we have
i) At x = 54
$$Z = \frac{54 - 50}{5} = 0.80$$
Hence, using the table, we have
P(X > 54) = P(Z > 0.80)
= 0.5 - P(0 ≤ Z ≤ 0.80)
= 0.5 - 0.2881 = 0.2119
ii) At x = 57
$$Z = \frac{57 - 50}{5} = 1.40$$
Therefore, using the table,
P(X < 57) = P(Z < 1.40)
= 0.5 + P(0 ≤ Z ≤ 1.40)
= 0.5 + 0.4192
= 0.9192
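The table look-ups can be cross-checked numerically. This sketch assumes only the mean 50 and standard deviation 5 used in the solution; `normal_cdf` is an illustrative helper built on the error function from the standard library.

```python
from math import erf, sqrt


def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a normal variable, expressed through the error function."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


mu, sigma = 50, 5
p_greater_54 = 1 - normal_cdf(54, mu, sigma)  # P(X > 54)
p_less_57 = normal_cdf(57, mu, sigma)         # P(X < 57)
print(round(p_greater_54, 4), round(p_less_57, 4))
```

Both values agree with the table-based answers to four decimal places.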
Question 3:
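Under which conditions is the Poisson distribution used to approximate the hypergeometric
distribution?
Sol:
The Poisson distribution can be used to approximate the hypergeometric
distribution when
n < 0.05N, n > 20, and p < 0.05
where n is the sample size, N is the population size and p is the proportion of
successes in the population.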
a) A fair coin is tossed 20 times. Find the probability that the number of heads
occurring is between 10 and 14 inclusive by using the normal approximation
to the binomial distribution.
Sol:
Since n = 20, p = 0.5 and q = 1 - p = 0.5,
$$\mu = np = 20(0.5) = 10$$
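The remaining steps of this solution are not shown in the text. As a hedged sketch of how the calculation usually continues, the code below assumes the standard continuity correction (the interval 10 to 14 inclusive becomes 9.5 to 14.5) and compares the normal approximation with the exact binomial sum. The helper `phi` and the variable names are illustrative.

```python
from math import comb, erf, sqrt

n, p = 20, 0.5
mu = n * p                       # mean of the binomial
sigma = sqrt(n * p * (1 - p))    # standard deviation of the binomial


def phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))


# Normal approximation with continuity correction: P(9.5 <= X <= 14.5).
approx = phi((14.5 - mu) / sigma) - phi((9.5 - mu) / sigma)

# Exact binomial probability P(10 <= X <= 14) for comparison.
exact = sum(comb(n, k) for k in range(10, 15)) / 2**n

print(round(approx, 4), round(exact, 4))
```

The two results differ by roughly 0.001, which illustrates how close the approximation is for n = 20 and p = 0.5.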
Unbiased estimator
An estimator is unbiased if the mean of its sampling distribution is equal to the
population parameter to be estimated.
Statistical Estimation
Statistical estimation is a procedure for making a judgment about the unknown value of
a population parameter by using the sample observations.
Question 2:
a) A random variable X has the following probability distribution:
x 4 5 6
P(x) 0.3 0.5 0.2
Find the mean $\mu_{\bar{X}}$ and standard error $\sigma_{\bar{X}}$ of the sample mean for a random sample of size 2.
Solution:
A random variable X has the following probability distribution:
x 4 5 6
P(x) 0.3 0.5 0.2
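$$\mu = E(X) = \sum x P(x) = 4.9$$
$$\sigma^2 = \operatorname{Var}(X) = \sum x^2 P(x) - \left[\sum x P(x)\right]^2 = 24.5 - (4.9)^2 = 0.49$$
$$\sigma = \sqrt{0.49} = 0.7$$
We know that:
$$\mu_{\bar{X}} = \mu = 4.9$$
$$\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} = \frac{0.49}{2} = 0.245$$
$$\sigma_{\bar{X}} = \sqrt{0.245} = 0.495$$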
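The same figures follow directly from the probability table. A minimal sketch, with the distribution held in a dictionary named `dist` (an illustrative choice):

```python
from math import sqrt

dist = {4: 0.3, 5: 0.5, 6: 0.2}  # x -> P(x)
n = 2                            # sample size

mean_x = sum(x * p for x, p in dist.items())
var_x = sum(x**2 * p for x, p in dist.items()) - mean_x**2

mean_of_sample_mean = mean_x       # the sample mean has the same mean as X
standard_error = sqrt(var_x / n)   # sigma / sqrt(n)

print(round(mean_of_sample_mean, 2), round(var_x, 2), round(standard_error, 3))
```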
b) It is known that 3% of the persons living in Gujranwala city have a
certain disease. Find the mean and standard error of the sampling distribution of the proportion
of diseased persons in a random sample of 500 persons.
Solution:
We have the population proportion P = 0.03 and the sample size n = 500.
Let the sample proportion be $\hat{p}$.
Then
$$\mu_{\hat{p}} = P = 0.03$$
and
$$\sigma_{\hat{p}} = \sqrt{\frac{P(1 - P)}{n}} = \sqrt{\frac{0.03(1 - 0.03)}{500}} = 0.00763$$
Question 3:
a) In a random sample of 500 people eating lunch at a hospital cafeteria on various
Fridays, it was found that x = 160 preferred seafood. Find a 95% confidence interval for the
actual proportion of people who eat seafood on Fridays at this cafeteria.
Solution:
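The point estimate of the population proportion is $\hat{p} = \frac{160}{500} = 0.32$. Using the table we
find $z_{0.05/2} = z_{0.025} = 1.96$. Therefore the 95% confidence interval is
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
$$0.32 \pm 1.96\sqrt{\frac{(0.32)(0.68)}{500}}$$
$$0.32 \pm 0.04$$
$$(0.28,\ 0.36)$$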
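The interval can be recomputed without intermediate rounding. This is a sketch of the large-sample interval used in the solution; the value 1.96 is taken from the normal table, as in the text, and the variable names are illustrative.

```python
from math import sqrt

x, n = 160, 500
z = 1.96  # z_{0.025}, from the standard normal table

p_hat = x / n
margin = z * sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - margin, p_hat + margin

print(round(p_hat, 2), round(margin, 4), round(lower, 2), round(upper, 2))
```

The unrounded margin is about 0.0409, which the solution rounds to 0.04.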
b)
The mean and standard deviation of the quality grade-point averages of a random sample
are calculated to be 2.6 and 0.3. How large a sample is required if we want to be 95%
confident that our estimate of $\mu$ is not off by more than 0.05?
Solution:
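We know that
$$n = \left(\frac{z_{\alpha/2}\,\hat{\sigma}}{e}\right)^2$$
As given,
$z_{\alpha/2} = 1.96$
$\hat{\sigma} = 0.3$
$e = 0.05$
$$n = \left(\frac{(1.96)(0.3)}{0.05}\right)^2 = 138.3$$
Rounding up to the next whole number, so that the error does not exceed 0.05, the
required sample size is n = 139.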
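A short check of the sample-size formula follows. The sketch assumes the conservative convention of rounding the result up to the next whole number, since any smaller sample would give a margin of error slightly above 0.05; the variable names are illustrative.

```python
from math import ceil

z = 1.96          # z_{0.025}, from the standard normal table
sigma_hat = 0.3   # sample standard deviation used as the estimate of sigma
e = 0.05          # maximum tolerated error

n_exact = (z * sigma_hat / e) ** 2
n_required = ceil(n_exact)  # round up: an assumption of this sketch

print(round(n_exact, 1), n_required)
```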
Assignment 7
Question 1:
Prove that when n is large, $s^2$ is approximately equal to $S^2$.
Solution:
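As we know that
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \;\Rightarrow\; \sum (x - \bar{x})^2 = (n - 1)s^2$$
whereas
$$S^2 = \frac{\sum (x - \bar{x})^2}{n} \;\Rightarrow\; \sum (x - \bar{x})^2 = nS^2$$
Hence
$$(n - 1)s^2 = nS^2 \;\Rightarrow\; S^2 = \frac{(n - 1)}{n}s^2 = \left(1 - \frac{1}{n}\right)s^2$$
Now, as
$$n \to \infty, \quad \frac{1}{n} \to 0$$
Hence, if n is large,
$$S^2 \approx s^2$$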
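The relation between the two variances can also be seen numerically. The sketch below draws samples from a standard normal population (an arbitrary choice made only for illustration) and shows that the ratio of the divisor-n variance to the divisor-(n - 1) variance equals (n - 1)/n, which approaches 1 as n grows. The function name `variance_ratio` is hypothetical.

```python
import random
from statistics import pvariance, variance

random.seed(0)  # fixed seed so the run is repeatable


def variance_ratio(n):
    """Return S^2 / s^2 for a random sample of size n."""
    sample = [random.gauss(0, 1) for _ in range(n)]
    return pvariance(sample) / variance(sample)  # divisor n over divisor n - 1


for n in (5, 50, 5000):
    print(n, round(variance_ratio(n), 4), round((n - 1) / n, 4))
```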
(a) A random sample of 100 workers with children in day care show a mean day-care
cost of Rs.2650 and a standard deviation of Rs.500. Verify the department’s claim
that the mean exceeds Rs.2500 at the 0.05 level with this information.
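Step 1:
$H_0: \mu \le 2500$
$H_1: \mu > 2500$ (one-sided test)
Step 2:
$\alpha = 0.05$
Step 3:
$$z = \frac{\bar{x} - \mu_0}{S / \sqrt{n}}$$
$$z = \frac{2650 - 2500}{500 / \sqrt{100}} = \frac{150}{50}$$
$$z = 3$$
Step 4:
The critical region for $\alpha = 0.05$ is $z > 1.645$.
Step 5:
Since the calculated value z = 3 falls in the critical region, we reject $H_0$ in favour
of the alternative hypothesis: the data support the claim that the mean exceeds Rs. 2500.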
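The five steps translate into a short calculation. This is a sketch of the one-sided large-sample z test described above, with the critical value 1.645 taken from the normal table rather than computed; the variable names are illustrative.

```python
from math import sqrt

x_bar = 2650   # sample mean day-care cost (Rs.)
mu_0 = 2500    # hypothesised mean under H0
s = 500        # sample standard deviation
n = 100        # sample size

z = (x_bar - mu_0) / (s / sqrt(n))
z_critical = 1.645            # upper 5% point of the standard normal, from tables
reject_h0 = z > z_critical    # one-sided (right-tail) test

print(z, reject_h0)
```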
Question 2:
(a) A random sample of size n is drawn from a normal population with mean 5 and
variance $\sigma^2$. Answer the following:
(b) In a poll of college students in a large university, 300 of 400 students living in
students’ residences (hostels) approved a certain course of action, whereas 200 of 300
students not living in students’ residences approved it. Compute the 90% confidence
interval for this difference.
Solution:
Let
$$\hat{p}_1 = \frac{300}{400} = 0.75$$
$$\hat{q}_1 = 1 - \hat{p}_1 = 1 - 0.75 = 0.25$$
and
$$\hat{p}_2 = \frac{200}{300} = 0.67$$
$$\hat{q}_2 = 1 - \hat{p}_2 = 1 - 0.67 = 0.33$$
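The solution breaks off at this point in the text. As a hedged sketch of how the interval is usually completed, the code below assumes the standard large-sample interval for a difference of two proportions, with z = 1.645 for 90% confidence taken from the normal table. It keeps the second proportion unrounded (200/300), so the result can differ slightly in the third decimal place from a calculation based on 0.67.

```python
from math import sqrt

x1, n1 = 300, 400   # approvals among students living in residences
x2, n2 = 200, 300   # approvals among students not living in residences
z = 1.645           # z_{0.05} for a 90% two-sided interval, from tables

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
lower, upper = diff - z * se, diff + z * se

print(round(diff, 4), round(se, 4), round(lower, 4), round(upper, 4))
```

Under these assumptions the interval lies entirely above zero, at roughly 0.03 to 0.14.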