Problem Set 1 Solutions
Financial statistics
HT 2023
Andriy Andreev
Ralf Xhaferi
Department of Statistics
Question 1.1
A Swedish company which does business with a Norwegian company is
affected by the exchange rate. The company needs to know how many
Swedish kronor (SEK) corresponds to 100 Norwegian krone (NOK). After
analyzing data from an 18-month period, they found that the daily
change in the exchange rate can be presumed to be independent and
identically distributed from a normal distribution with expected value 0
and standard deviation 0.375 SEK.
a) What is the probability that the exchange rate drops at least 0.50
kronor from today's rate to tomorrow's rate?
b) What is the mean and standard deviation of the change over two
business days?
c) What is the probability that the exchange rate will rise at least 0.50
kronor over two business days?
Solution 1.1:
• We define the variable $X_i$ = change in the exchange rate over one day (the daily change in the exchange rate).
Then $X_i \sim N(0, 0.375^2)$.
a) What is the probability that the exchange rate drops at least 0.50 kronor from today's rate to
tomorrow's rate?
We standardize the normally distributed variable: if $X_i \sim N(0, 0.375^2)$, then $Z = \frac{X_i - \mu}{\sigma} \sim N(0, 1)$.
$$P(X_i < -0.50) = P\left(\frac{X_i - 0}{0.375} < \frac{-0.50 - 0}{0.375}\right) = P(Z < -1.33) = P(Z > 1.33) = 1 - \Phi(1.33) = 1 - 0.90824 = 0.09176$$
Consequently, the probability that the rate decreases by at least 0.50 SEK from today's rate
to tomorrow's rate is 0.09176.
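$X_i$ = change in the exchange rate over one day, with $X_i \sim N(0, 0.375^2)$ and $X_{i+1} \sim N(0, 0.375^2)$
b) What is the mean and the standard deviation of the change over two business days?
We define $Y = X_i + X_{i+1}$ = change over two business days. Remember that:
$E(aX + bY) = aE(X) + bE(Y)$
$Var(aX + bY) = a^2 V(X) + b^2 V(Y) + 2ab\,Cov(X, Y)$
The daily changes are independent, so the covariance term is zero:
$E(Y) = 0 + 0 = 0$
$Var(Y) = 0.375^2 + 0.375^2 = 0.28125$, so the standard deviation is $\sqrt{0.28125} \approx 0.53$
Consequently, the change over two business days has mean zero and a standard deviation of 0.53.
$Y_j$ = change in the exchange rate over two days, with $Y_j \sim N(0, 0.53^2)$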
c) What is the probability that the exchange rate will rise at least 0.50 kronor over two
business days?
$$P(Y_j > 0.50) = P\left(\frac{Y_j - 0}{0.53} > \frac{0.50 - 0}{0.53}\right) = P(Z > 0.94) = 1 - P(Z < 0.94) = 1 - \Phi(0.94) = 1 - 0.82639 = 0.17361$$
Consequently, the probability that the rate increases by at least 0.50 SEK over two
business days is 0.17361.
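As a cross-check of parts b) and c), here is a small sketch in Python (standard library only; the variable names are illustrative). It uses the unrounded two-day standard deviation, so the probability differs from the table value in the third decimal.

```python
from math import sqrt
from statistics import NormalDist

sigma_day = 0.375

# Independent daily changes: variances add, so sd scales with sqrt(2).
sigma_two_days = sqrt(2) * sigma_day

# Two-day change Y ~ N(0, sigma_two_days^2); P(Y > 0.50).
two_day_change = NormalDist(mu=0.0, sigma=sigma_two_days)
p_rise = 1.0 - two_day_change.cdf(0.50)

print(round(sigma_two_days, 2), round(p_rise, 3))
```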
Question 1.2
You may choose between two corporate bonds (from companies A or B) that have
the same maturity (five years). There is a risk of bankruptcy associated with
corporate bonds, where you don’t get your money back.
Let:
X = return on investment in A's bond after five years (SEK).
Y = return on investment in B's bond after five years (SEK).
If the company does not go bankrupt the accumulated interest rate after five years is
10% for A and 20% for B. The risk that company A goes bankrupt is 3% and the
corresponding risk for company B is 5%. The risk that both A and B go bankrupt
during the 5 years is estimated to be 2%:
            Y = -1    Y = 0.2    Total
X = -1      0.02      0.01       0.03
X = 0.1     0.03      0.94       0.97
Total       0.05      0.95       1
a) What is the expected return on investment from A and B; E[X] and
E[Y]?
Company A: E[X] = (-1)*0.03 + 0.1*0.97 = 0.067
Company B: E[Y] = (-1)*0.05 + 0.2*0.95 = 0.14
We know that $Corr(X, Y) = \frac{Cov(X, Y)}{\sqrt{V(X) \cdot V(Y)}}$
First we have to calculate the variances. Recall: $Var[X] = \sum (x - \mu_x)^2 P(x) = \sum_{\text{all } x} x^2 P(x) - \mu_x^2$
• Therefore $Var(X) = (-1)^2 \cdot 0.03 + 0.1^2 \cdot 0.97 - 0.067^2 = 0.03 + 0.0097 - 0.004489 = 0.035211$
• $Var(Y) = (-1)^2 \cdot 0.05 + 0.2^2 \cdot 0.95 - 0.14^2 = 0.05 + 0.038 - 0.0196 = 0.0684$
• Recall: $Cov(X, Y) = \sum_{\text{all } x} \sum_{\text{all } y} x y \, P(x, y) - E[X] \cdot E[Y]$
• $Cov(X, Y) = (-1)(-1)(0.02) + (-1)(0.2)(0.01) + (0.1)(-1)(0.03) + (0.1)(0.2)(0.94) - 0.067 \cdot 0.14 = 0.02442$
• $Corr(X, Y) = \frac{Cov(X, Y)}{\sqrt{V(X) \cdot V(Y)}} = \frac{0.02442}{\sqrt{0.035211 \cdot 0.0684}} = 0.498$
The correlation is 0.498.
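The moments above can be recomputed directly from the joint probabilities that appear in the covariance sum. This is a minimal sketch; the dictionary layout and variable names are assumptions made for illustration.

```python
from math import sqrt

# Joint distribution of (X, Y): keys are (x, y) returns,
# values are the joint probabilities used in the covariance sum.
joint = {
    (-1.0, -1.0): 0.02,
    (-1.0, 0.2): 0.01,
    (0.1, -1.0): 0.03,
    (0.1, 0.2): 0.94,
}

mean_x = sum(x * p for (x, _), p in joint.items())
mean_y = sum(y * p for (_, y), p in joint.items())

# Var = E[X^2] - (E[X])^2, Cov = E[XY] - E[X]E[Y]
var_x = sum(x**2 * p for (x, _), p in joint.items()) - mean_x**2
var_y = sum(y**2 * p for (_, y), p in joint.items()) - mean_y**2
cov_xy = sum(x * y * p for (x, y), p in joint.items()) - mean_x * mean_y

corr_xy = cov_xy / sqrt(var_x * var_y)

print(round(mean_x, 3), round(mean_y, 3))
print(round(var_x, 6), round(var_y, 4), round(cov_xy, 5), round(corr_xy, 3))
```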
c) If you invest 5000 kronor in each bond, what is the expected return and variance of the portfolio?
Let W = 5000X + 5000Y be the overall return on the portfolio after 5 years.
$E[W] = E[5000X + 5000Y] = 5000 \cdot E[X] + 5000 \cdot E[Y] = 5000(0.067 + 0.14) = 1035$
$Var(W) = V[5000X + 5000Y] = 5000^2 \cdot Var(X) + 5000^2 \cdot Var(Y) + 2 \cdot 5000 \cdot 5000 \cdot Cov(X, Y)$
$= 5000^2 \cdot 0.035211 + 5000^2 \cdot 0.0684 + 2 \cdot 5000 \cdot 5000 \cdot 0.02442 = 3\,811\,275$
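The portfolio arithmetic can be verified with a few lines of Python. This is a sketch that plugs in the moments computed earlier in the solution; the variable names are illustrative.

```python
# Moments of X and Y taken from the solution above.
mean_x, mean_y = 0.067, 0.14
var_x, var_y, cov_xy = 0.035211, 0.0684, 0.02442

amount_a = 5000  # SEK invested in bond A
amount_b = 5000  # SEK invested in bond B

# E[aX + bY] = a E[X] + b E[Y]
expected_w = amount_a * mean_x + amount_b * mean_y

# Var(aX + bY) = a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y)
var_w = (amount_a**2 * var_x
         + amount_b**2 * var_y
         + 2 * amount_a * amount_b * cov_xy)

print(round(expected_w, 2), round(var_w, 2))
```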
d) If you have 10 000 kronor to invest in either company, how should you
divide up your investments in order to minimize the variance of your
portfolio?
Let α be the proportion of the 10000 invested in option A that minimises the variance.
The portfolio: $W = 10000\alpha X + 10000(1 - \alpha)Y = 10000[\alpha X + (1 - \alpha)Y]$
$$Var(W) = 10000^2 \, Var[\alpha X + (1 - \alpha)Y] = 10000^2 \left[\alpha^2 Var(X) + (1 - \alpha)^2 Var(Y) + 2\alpha(1 - \alpha) Cov(X, Y)\right]$$
$$= 10000^2 \left[\alpha^2 Var(X) + (1 - \alpha)^2 Var(Y) + 2(\alpha - \alpha^2) Cov(X, Y)\right]$$
To minimize over α, set the derivative to zero:
$$\frac{\partial Var(W)}{\partial \alpha} = 10000^2 \left[2\alpha Var(X) - 2(1 - \alpha) Var(Y) + 2(1 - 2\alpha) Cov(X, Y)\right] = 0$$
Whether we include the factor 10000 or not, the variance (the risk) is minimized for:
$$\alpha = \frac{Var(Y) - Cov(X, Y)}{Var(X) + Var(Y) - 2Cov(X, Y)} = \frac{0.0684 - 0.0244}{0.0684 + 0.0352 - 2 \cdot 0.0244} = 0.8029$$
That is, about 8029 kronor should be invested in A and the remaining 1971 kronor in B.
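The closed-form weight can be sanity-checked numerically. In this sketch, `min_variance_weight` and `portfolio_variance` are hypothetical helper names introduced only for illustration. It uses the unrounded moments, so the result is 0.8030 rather than 0.8029.

```python
def min_variance_weight(var_x, var_y, cov_xy):
    """Share a invested in X that minimizes Var(a*X + (1 - a)*Y)."""
    return (var_y - cov_xy) / (var_x + var_y - 2 * cov_xy)


def portfolio_variance(a, var_x, var_y, cov_xy):
    """Variance of a*X + (1 - a)*Y per unit of invested capital."""
    return a**2 * var_x + (1 - a)**2 * var_y + 2 * a * (1 - a) * cov_xy


var_x, var_y, cov_xy = 0.035211, 0.0684, 0.02442
alpha = min_variance_weight(var_x, var_y, cov_xy)

# The variance at alpha should be lower than at nearby weights.
v_opt = portfolio_variance(alpha, var_x, var_y, cov_xy)
v_low = portfolio_variance(alpha - 0.05, var_x, var_y, cov_xy)
v_high = portfolio_variance(alpha + 0.05, var_x, var_y, cov_xy)

print(round(alpha, 4), v_opt < v_low, v_opt < v_high)
```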
Weekly sales in number of packages (1/2 kg) (Y):  526  421  581  640  412  500  444  443  580  570  376  723
Exposure area (square feet) (X):                  6    3    6    9    3    9    6    3    9    6    3    9
Suppose that the relationship between weekly sales and exposure area can be
described by a simple linear regression model : Yi = β0 + β1Xi +εi
Answer the following questions using the regression output on the following page
a) Estimate the model’s parameters using the least squares method and interpret the
parameter estimates in words.
b) Calculate the residuals and their variance $S_e^2$.
c) State the coefficient of determination $R^2$ and interpret the value.
d) Test at the 5% significance level the null hypothesis $\beta_1 = 0$ against the alternative $\beta_1 > 0$.
e) Construct a 95% confidence interval for $\beta_1$ and interpret the interval.
f) Calculate 95% confidence intervals for the average weekly coffee sales when the
exposure area is 3, 6 and 9 square feet. Which interval is shortest and why?
g) Suppose that next week the exposure area will be 9 square feet. Calculate a 95% prediction
interval for sales that week.
h) What assumptions should you make to answer d) – g)? Are all those assumptions
fulfilled? If not, what effect will this have?
[Figure: scatter plot "Weekly Sales vs Exposure Area"; horizontal axis: Exposure Area (X), vertical axis: Weekly Sales (Y)]
Descriptive Statistics:
Variable   Count   Mean   Standard Deviation   Minimum   Q1   Median   Q3   Maximum
Regression Analysis:
Model summary
R Square   0.653254
Coefficients
                 Coefficients   SE         t-value    P-value    VIF
Intercept        320.25         49.21019   6.507799   6.83E-05
Exposure area    32.95833       7.593297   4.340451   0.001465   1
Prediction for Weekly Sales
Variable Setting X = 3
Fit       SE Fit    95% CI               95% PI
419.125   29.4087   (353.598; 484.652)   (261.316; 576.934)
Variable Setting 𝑿 = 𝟔
Variable Setting 𝑿 = 𝟗
The fitted model is $\hat{Y}_i = \hat\beta_0 + \hat\beta_1 X_i$ with $\hat\beta_1 = 32.96$.
If the exposure area increases by one unit (one square foot), weekly sales are estimated to
increase by about 33 packages on average.
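How to calculate $\hat\beta_0$ and $\hat\beta_1$ manually:
Y      X    XY      X²
526    6    3156    36
421    3    1263    9
581    6    3486    36
640    9    5760    81
412    3    1236    9
500    9    4500    81
444    6    2664    36
443    3    1329    9
580    9    5220    81
570    6    3420    36
376    3    1128    9
723    9    6507    81
Sums: $\sum y_i = 6216$, $\sum x_i = 72$, $\sum x_i y_i = 39669$, $\sum x_i^2 = 504$
$$\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}$$
$$\hat\beta_1 = \frac{\sum_{i=1}^{n} x_i y_i - \frac{(\sum x_i)(\sum y_i)}{n}}{\sum_{i=1}^{n} x_i^2 - \frac{(\sum x_i)^2}{n}} = \frac{S_{xy}}{S_x^2} = r_{xy} \frac{S_y}{S_x}$$
$$\hat\beta_1 = \frac{39669 - \frac{(72)(6216)}{12}}{504 - \frac{72^2}{12}} = \frac{39669 - 37296}{504 - 432} = \frac{2373}{72} = 32.9583 \approx 32.96$$
$$\hat\beta_0 = \frac{6216}{12} - 32.96 \cdot \frac{72}{12} = 518 - (32.96 \cdot 6) = 320.24$$
Consequently: $\hat\beta_1 = 32.96$ and $\hat\beta_0 = 320.24$ (the regression output shows 320.25 because it does not round $\hat\beta_1$).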
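The hand calculation of the least squares estimates can be reproduced from the raw data. This is a minimal sketch in plain Python; the variable names are illustrative.

```python
# Weekly sales (Y) and exposure area (X) from the question.
sales = [526, 421, 581, 640, 412, 500, 444, 443, 580, 570, 376, 723]
area = [6, 3, 6, 9, 3, 9, 6, 3, 9, 6, 3, 9]

n = len(sales)
sum_x = sum(area)
sum_y = sum(sales)
sum_xy = sum(x * y for x, y in zip(area, sales))
sum_xx = sum(x * x for x in area)

# Slope: (sum xy - sum x * sum y / n) / (sum x^2 - (sum x)^2 / n)
slope = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x**2 / n)

# Intercept: y_bar - slope * x_bar
intercept = sum_y / n - slope * sum_x / n

print(sum_y, sum_x, sum_xy, sum_xx)
print(round(slope, 4), round(intercept, 2))
```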
Read this closely, important to know
• The intercept $\beta_0$ gives an estimate of the average y-value for
individuals with the value x = 0, i.e., an estimate of the
population mean for the population of all individuals with
the value x = 0.
What does this mean? (Think of this first)
• In many situations the interpretation of the intercept $\beta_0$ is not
meaningful. This is because the value x = 0 often lies
far outside the investigated range, or because the x-variable
cannot take the value x = 0 at all.
• This is exactly the case in our example, where X only takes the values
3, 6 and 9.
b) Calculate the residuals and their variance $S_e^2$.
• How to calculate the residuals manually: $e = Y - \hat{Y}$, with $\hat{Y} = 320.3 + 32.96X$
Y      X    Ŷ         e          e²
526    6    518.06    7.94       63.0436
421    3    419.18    1.82       3.3124
581    6    518.06    62.94      3961.444
640    9    616.94    23.06      531.7636
412    3    419.18    -7.18      51.5524
500    9    616.94    -116.94    13674.96
444    6    518.06    -74.06     5484.884
443    3    419.18    23.82      567.3924
580    9    616.94    -36.94     1364.564
570    6    518.06    51.94      2697.764
376    3    419.18    -43.18     1864.512
723    9    616.94    106.06     11248.72
Sum of e²: 41513.92
$$S_e^2 = \frac{\sum e^2}{n - 2} = \frac{41513.92}{10} = 4151.392$$
From regression printout: $S = 64.4313 \rightarrow S^2 = (64.4313)^2 \approx 4151.39$
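The residual table can be regenerated in a few lines. This sketch deliberately uses the rounded coefficients from the table header (320.3 and 32.96), so that it reproduces the hand-calculated sum; the variable names are illustrative.

```python
sales = [526, 421, 581, 640, 412, 500, 444, 443, 580, 570, 376, 723]
area = [6, 3, 6, 9, 3, 9, 6, 3, 9, 6, 3, 9]

# Rounded coefficients, as used in the hand calculation.
b0, b1 = 320.3, 32.96

fitted = [b0 + b1 * x for x in area]
residuals = [y - y_hat for y, y_hat in zip(sales, fitted)]

# Residual sum of squares and residual variance with n - 2 df.
sse = sum(e**2 for e in residuals)
n = len(sales)
s2_e = sse / (n - 2)
s_e = s2_e**0.5

print(round(sse, 2), round(s2_e, 3), round(s_e, 4))
```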
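c) State the coefficient of determination $R^2$ and interpret the value.
$$SST = \sum (Y - \bar{Y})^2 = \sum Y^2 - \frac{(\sum Y)^2}{n} = \sum Y^2 - \frac{(6216)^2}{12}$$
Alternative: $r = cor(X, Y) = \frac{Cov(X, Y)}{\sqrt{Var(X) \, Var(Y)}} = 0.808241$, so $R^2 = r^2 \approx 0.653$, which agrees with the regression output (R Square 0.653254).
Interpretation: about 65% of the variation in weekly sales is explained by the exposure area.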
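Both routes to the coefficient of determination can be checked from the raw data. This is a minimal sketch in plain Python; the variable names are illustrative.

```python
sales = [526, 421, 581, 640, 412, 500, 444, 443, 580, 570, 376, 723]
area = [6, 3, 6, 9, 3, 9, 6, 3, 9, 6, 3, 9]

n = len(sales)
mean_x = sum(area) / n
mean_y = sum(sales) / n

# Sums of squares and cross products around the means.
sxx = sum((x - mean_x) ** 2 for x in area)
sst = sum((y - mean_y) ** 2 for y in sales)  # total sum of squares
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(area, sales))

# Route 1: squared correlation coefficient.
r = sxy / (sxx * sst) ** 0.5
r_squared = r**2

# Route 2: 1 - SSE/SST with the least squares fit (SSE = SST - Sxy^2/Sxx).
sse = sst - sxy**2 / sxx
r_squared_alt = 1 - sse / sst

print(round(r, 6), round(r_squared, 6), round(r_squared_alt, 6))
```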
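d) Test at the 5% significance level the null hypothesis $\beta_1 = 0$ against the alternative $\beta_1 > 0$.
Test function: $t = \frac{b_1 - \beta_1}{S_{b_1}}$, where $S_{b_1} = \sqrt{\frac{S_e^2}{\sum (x_i - \bar{x})^2}}$, is t-distributed with n-2 degrees of freedom (n = 12). Under the null hypothesis, $t = \frac{b_1 - 0}{S_{b_1}}$.
To find $\sum (x_i - \bar{x})^2$, note that $\bar{x} = 72/12 = 6$ and that each x value occurs four times:
x    x - x̄    (x - x̄)²
3    -3       9
6    0        0
9    3        9
$\sum (x_i - \bar{x})^2 = 4(9 + 0 + 9) = 72$
$$t = \frac{32.96 - 0}{\sqrt{\frac{4151.392}{72}}} = \frac{32.96}{7.5933} \approx 4.34 \text{ with 10 df}$$
The critical value is $t_c = 1.812$ with 10 df.
Since $t > t_c$ (4.34 > 1.812), the null hypothesis is rejected.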
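The test statistic can be recomputed as follows. This is a sketch: the critical value 1.812 is hard-coded from the t-table value quoted above rather than computed, and the variable names are illustrative.

```python
from math import sqrt

b1 = 32.96       # estimated slope
s_e = 64.4313    # residual standard deviation from the printout
sxx = 72.0       # sum of (x_i - x_bar)^2
t_crit = 1.812   # one-sided 5% critical value with 10 df (from the table)

# Standard error of the slope and the t statistic under H0: beta_1 = 0.
se_b1 = s_e / sqrt(sxx)
t_stat = (b1 - 0.0) / se_b1

reject_h0 = t_stat > t_crit

print(round(se_b1, 4), round(t_stat, 2), reject_h0)
```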
e) Construct a 95% confidence interval for $\beta_1$ and interpret the interval.
$$s_{b_1}^2 = \frac{S_e^2}{\sum (x_i - \bar{x})^2}$$
• We know that the confidence interval for $\beta_1$ is $b_1 - t_{n-2, \alpha/2} \cdot s_{b_1} < \beta_1 < b_1 + t_{n-2, \alpha/2} \cdot s_{b_1}$
Interpretation: With 95% confidence this interval covers the expected increase in sales when the
exposure area increases by one square foot.
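Plugging the values from the regression output into the interval formula gives the numeric interval. This is a sketch under the assumption that the two-sided critical value is 2.228 (the 10 df t-table value that the solution uses for its 95% intervals); the variable names are illustrative.

```python
b1 = 32.95833      # slope from the regression output
se_b1 = 7.593297   # standard error of the slope from the output
t_crit = 2.228     # two-sided 95% critical value with 10 df (from the table)

# b1 -/+ t * se(b1)
half_width = t_crit * se_b1
ci_lower = b1 - half_width
ci_upper = b1 + half_width

print(round(ci_lower, 2), round(ci_upper, 2))
```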
f) Calculate 95% confidence intervals for the average weekly coffee sales when the
exposure area is 3, 6 and 9 square feet. Which interval is shortest and why?
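95% CI for 3 square feet (Variable Setting X = 3): (353.598; 484.652), with length 131.054
Variable Setting X = 6
Variable Setting X = 9
The interval for X = 6 is the shortest, because 6 equals the mean $\bar{x}$, so the term $(x - \bar{x})^2$ in the standard error of the fit is zero there.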
95% prediction interval for X = 9:
$$616.875 \pm 2.228 \cdot 64.4313 \cdot \sqrt{1 + \frac{1}{12} + \frac{(9 - 6)^2}{72}}$$
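The intervals for all three settings can be computed with the same formula. In this sketch, `mean_ci` and `prediction_interval` are hypothetical helper names introduced for illustration, and the critical value 2.228 is hard-coded from the t-table.

```python
from math import sqrt

n = 12
x_bar = 6.0
sxx = 72.0                # sum of (x_i - x_bar)^2
s_e = 64.4313             # residual standard deviation
t_crit = 2.228            # two-sided 95% critical value with 10 df
b0, b1 = 320.25, 32.95833 # coefficients from the regression output


def fit(x):
    return b0 + b1 * x


def mean_ci(x):
    """95% confidence interval for the average sales at exposure area x."""
    half = t_crit * s_e * sqrt(1 / n + (x - x_bar) ** 2 / sxx)
    return fit(x) - half, fit(x) + half


def prediction_interval(x):
    """95% prediction interval for a single week's sales at exposure area x."""
    half = t_crit * s_e * sqrt(1 + 1 / n + (x - x_bar) ** 2 / sxx)
    return fit(x) - half, fit(x) + half


for x in (3, 6, 9):
    low, high = mean_ci(x)
    print(x, round(low, 2), round(high, 2), round(high - low, 2))

pi_low, pi_high = prediction_interval(9)
print(round(pi_low, 2), round(pi_high, 2))
```

The confidence interval is narrowest at x = 6, where x equals the sample mean.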
[Figure: plot with Exposure Area (X) on the horizontal axis]
Effect:
1. If the assumption of normality is not fulfilled, the estimates are still unbiased.
2. But the confidence intervals and the tests are no longer correct.