4SSMN902 May 2022

King’s Business School, King’s College London
This paper is part of an examination and counts towards the award of a
degree. Examinations are governed by the College Regulations under the
authority of the Academic Board. Students must not share or distribute this
examination paper.
Examination 2021/22
Module Code and Title: 4SSMN902 Statistics for Economists
Examination Period: Period 2, May 2022
Time allowed: Students are recommended to spend no longer than 2 hours
on this paper. The paper will be available for 24 hours from 12:00 midday
on Tuesday 17 May 2022. Students will need to submit their answers by
12:00 midday on Wednesday 18 May 2022.
INSTRUCTIONS TO CANDIDATES:
1. Answer ALL questions.
2. All mathematical/statistical workings should be shown. Please use the
equation editor in Microsoft Word, or take photos of hand-written
mathematical/statistical workings and paste these directly onto the
answer sheet using software or uploaded photos. All text
answers/parts MUST be written into the document and not
submitted in images unless you have PAA allowing this.
3. Do not use statistics software such as STATA or Excel for this
examination. You are expected to use a Casio fx-83 or Casio fx-85
calculator (or similar) which does not have advanced
mathematical/statistical functions.
4. You are expected to use the t-tables and z-tables at the end of this
examination paper where necessary, rather than using external
sources.
5. A template cover sheet has been provided on the KEATS page; you
should complete this and type your answers below, or attach it to the
front of your submission. Make sure you clearly indicate the questions
you are answering throughout (e.g. Section A, Question 1).
6. If you have a PAA cover sheet, you should include this in addition to
your submission.
7. Save your work regularly, at least every 15 minutes.
ONLINE SUBMISSION INSTRUCTIONS:
1. You should submit your work via the Turnitin submission link on the
module KEATS page.
2. Ensure your document is submitted through Turnitin with the title
CANDIDATE ID-MODULE CODE (e.g. AC12345-4SSMN902).
3. Once submitted please check you are satisfied with the uploaded
document via the submission link.
4. If you experience technical difficulties and are unable to upload your
assessment by the deadline, please collate evidence of the technical
issue and submit a mitigating circumstances form (MCF). Remember
that the evidence must clearly show timestamps and proof that you
attempted to upload your assessment before the deadline.
Section A (30 marks)
Answer ALL questions. Each question carries 6 marks.
Question A1
A group of 60 engineers jointly work on a large research project. 40 of the
engineers are male (M) and the remaining are female (F). Out of the 60
engineers, 80% are junior (J) engineers and the remaining are senior
engineers (S). 25% of the senior engineers are female.
a) State the probabilities P(M), P(F), P(S), P(J) and P(M|S).
(3 marks)
b) Given the above information, find the probability of:
i) A female engineer being senior.
ii) A junior engineer being male.
iii) An engineer being senior and female.
(3 marks)
Question A2
A sample of 41 steel plants produces a mean of 670 tonnes of steel each day,
with a sample standard deviation of 103.2 tonnes.
a) Create a 95% confidence interval for the population mean.
(3 marks)
b) What is the interpretation of the confidence interval? What factors
could result in an increase in the range of the confidence interval
found in a)?
(3 marks)
Question A3
The following ANOVA table represents the estimates calculated by a
researcher who wants to test for the equality of the return on investment
(ROI) in five different regions, based on samples of the ROI in 40 firms from
each region. The corresponding F-distribution critical values are also shown
in the table, at the 5% and 1% significance levels.
ANOVA table for ROI
Sum of Squares between Group Means    620
Sum of Squares Within Groups          1220
Total Sum of Squares                  1840
Corresponding F-distribution critical values:
5% = 2.42, 1% = 3.41
a) State the null and alternate hypotheses.
(1 mark)
b) Using an F test, test your null hypothesis in a) at the 5% and 1%
significance levels.
(3 marks)
c) As a general rule, why is it important to distinguish between not
rejecting the null hypothesis and accepting the null hypothesis?
(2 marks)
Question A4
From a sample of 61 A-level mathematics students, the sample mean mark in
the final examination was 55. The sample variance in the final examination
was 49.
a) Find the 95% confidence interval for the population mean mark for the
final examination.
(3 marks)
b) In your answer to part a) what do you assume about the distribution of
the sample mean? Is your assumption valid? Explain your answer.
(3 marks)
Question A5
From the entire population of soybean farms, consider soybean yield,
measured in metric tonnes per hectare of land, as a normally distributed
random variable with mean 4.5 and standard deviation 2.5. From the
population of soybean farms:
a) What is the probability that a randomly selected hectare of land has a
soybean yield of less than 3.5 metric tonnes per hectare?
(3 marks)
b) What is the probability that a randomly selected hectare of land has a
soybean yield of between 5 and 6.5 metric tonnes per hectare?
(3 marks)
Section B
Answer BOTH questions.
Question B1 (30 marks)
ANSWER ALL PARTS
a) When given a random sample of data from a single population, explain
the steps by which you do a hypothesis test about the population
mean. In your answer, be sure to include:
• The central limit theorem.
• The difference between a one-sided and two-sided test.
• The significance level of the test.
• The circumstances in which you would undertake a z-test and a t-test.
(10 marks)
b) A mining company runs copper mines with various levels of fertility,
where the profits per tonne of copper extracted are greater in more
fertile mines. A greater annual quantity of copper can also be
extracted from more fertile mines. As per the sample of mines in the
table below, mine A is of highest fertility, mine D is of lowest fertility.
Mine      Profit per tonne of copper extracted ($)    Annual tonnes of copper extracted
Mine A    300                                         50
Mine B    200                                         30
Mine C    150                                         20
Mine D    100                                         10
i. What are the mean, median and mode of profit per tonne of copper
extracted? As a shareholder, which measure would you be more
concerned with?
(5 marks)
ii. Given that the above table is an unbiased sample of data, what
are the variance, standard deviation and coefficient of variation
of profit per tonne of copper extracted?
(6 marks)
iii. Test the hypothesis that the population mean profit per tonne
of copper extracted is $250 at the 5% level of significance.
(5 marks)
iv. If you were told the population variance of profit per tonne of
copper extracted was 5,000, how would your answer in part iii)
change?
(4 marks)
Question B2 (40 marks)
ANSWER ALL PARTS
Layla is investigating the relationship between monthly wages (W) and years
of experience (EX). Layla is also interested in whether this relationship
varies between males and females. She gathers information on monthly
wages and years of experience for a sample of 60 workers, consisting of 40
males and 20 females. For the whole sample, Layla finds a linear correlation
between years of experience and wages of 0.7.
Layla also runs a regression of the form:
$$\ln(W_i) = \alpha + \beta \ln(EX_i) + \varepsilon_i$$
Where ln denotes the natural logarithm. The results of Layla’s regression
analysis are given in the Table below.
Regression results: Dependent variable is ln(W)
             Whole sample    Men       Women
intercept    1.61            1.41      1.30
             (1.29)          (1.22)    (0.80)
ln(EX)       2.55            2.70      1.50
             (0.60)          (0.80)    (1.04)
R²           0.49            0.44      0.22
N            60              40        20
Standard errors are in parentheses
a) How is correlation calculated? What is the added benefit of doing a
regression of the form carried out above, compared with linear
correlation analysis?
(4 marks)
b) Explain what is meant by the standard error of the reported
coefficients. What factors increase/decrease the standard error of
coefficients?
(4 marks)
c) Use the standard errors to find the t-ratios for the coefficients of
ln(EX) for the whole sample and separately for the male and female
regressions.
(3 marks)
d) For the whole sample, and separately for males and females, test the
hypothesis that the coefficient of ln(EX) is statistically significant at
the 5% significance level. What is the economic interpretation of the
coefficient values?
(7 marks)
e) For the whole sample, and separately for males and females,
construct a 95% confidence interval for the coefficients of ln(EX).
Explain the relationship between the 95% confidence intervals and the
hypothesis tests undertaken in part (d).
(7 marks)
f) Discuss any statistical limitations of your answers in part (d) and (e).
(5 marks)
g) Given the dependent variable is monthly wages, and that there is a
greater proportion of women who work part-time than men, how
might this affect your conclusions in d)? What is an alternative
dependent variable which can better capture the relationship between
wages and schooling?
(5 marks)
h) How is the R² value calculated? What is the meaning of the R² value?
Does a higher R² value necessarily imply a better model?
(5 marks)
Final page of questions
Appendix on following pages
Appendix: Statistical Tables and Formulas
The standard normal distribution
Critical values of Student’s t-distribution for different probability levels,
α and degrees of freedom, v.
Final page of appendix