Tutorial Week 3 (Q)
Question 1
The remuneration packages of the CEOs of 12 international companies (in US$ thousands) are as
follows:
(a) Obtain the summary statistics using the Data Analysis function in Excel. In addition, find
the coefficient of variation and the interquartile range.
[Use the Data Analysis button on the Data tab, and select Descriptive Statistics]
(b) For each of the following, comment on its suitability as a measure of a typical value from
this data set:
(c) In breaking news, it has just been announced that the highest paid of these CEOs has
negotiated a new remuneration package and will now receive $25 million. For this
revised data set, calculate the revised value of each of the following summary measures,
and briefly comment on whether and how the value has changed from the corresponding
value given in the table above.
(i) Mode (ii) Median (iii) Mean (iv) Range (v) Interquartile range
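The measures asked for in (a) and (c) can be sketched in Python as a cross-check on the Excel output. This is only an illustrative sketch: the pay figures below are hypothetical placeholders, not the tutorial's data, and the function name `summarise` is invented for this example. The quartiles use the "inclusive" method of `statistics.quantiles`, which interpolates in the same way as Excel's QUARTILE.INC; other quartile conventions give slightly different values.

```python
import statistics


def summarise(values):
    """Return basic summary measures for a list of numbers."""
    # Quartiles by linear interpolation over the full data range.
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return {
        "mean": mean,
        "median": q2,
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "cv": sd / mean,  # coefficient of variation = SD / mean
    }


# HYPOTHETICAL pay figures in US$ thousands (assumed, not the tutorial's data).
pay = [300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 2000]
before = summarise(pay)

# Part (c): the highest-paid CEO now receives $25 million = 25,000 thousand.
revised = sorted(pay)
revised[-1] = 25000
after = summarise(revised)

for name in before:
    print(f"{name:>6}: before = {before[name]:.2f}, after = {after[name]:.2f}")
```

Running the same comparison on the actual data shows which measures react to a change in a single extreme value and which do not.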
Question 2
Mothers of school-aged children were interviewed as part of a research study. The objective of
this research was to study the amount of time spent watching television by 12-year-old children.
Mothers were chosen for interview randomly from the national database of births. The
interviewers were employed to interview mothers from the selection list between the hours of
9am and 5pm. If a mother was not home, the interviewers were instructed to skip that
household and move on to the next one on their list. The table below shows grouped data for
the amount of time spent watching television by 12-year-old children.
Time spent watching TV per week (hours)    Frequency (number of children)
0                                          100
1-5                                        350
6-10                                       175
11-15                                      150
16-20                                      50
21-25                                      250
26-30                                      50
31-35                                      0
36-40                                      1
a. This interview method omits a particular group of mothers. What is this group, and how
could omitting it cause a sample selection problem?
b. Based on the histogram above, write down the approximate (to the nearest integer) value
of the Range and Median for the number of hours of TV watched by 12-year-old children
per week. Show any working you use.
c. Would you say the histogram exhibits skewness, and if so, in what direction?
d. Write a few sentences summarising the situation with regard to 12-year-old children
watching TV. Use non-technical language and structure your paragraph around what you
learned from your answers to (b) and (c) above.
e. Based on the histogram, our best guess is that the Mode is somewhere in the range of
1-5 hours per week. Explain why this is not necessarily the case.
f. There is only one 12-year-old child who watches 36-40 hours of TV per week in the
sample. Suppose we do not believe this figure. What would happen to the Mean and
Standard Deviation of the number of hours of TV watched if this observation were
omitted from the analysis, and why?
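For checking working on grouped data like the frequency table in this question, the following Python sketch locates the class that contains the median and approximates the mean and standard deviation. It rests on a loud assumption: every child in a class is treated as sitting at the class midpoint, so the mean and standard deviation are approximations only. The function names are invented for this example.

```python
# Frequency table from Question 2: (lowest hour, highest hour, number of children).
classes = [
    (0, 0, 100), (1, 5, 350), (6, 10, 175), (11, 15, 150), (16, 20, 50),
    (21, 25, 250), (26, 30, 50), (31, 35, 0), (36, 40, 1),
]


def median_class(table):
    """Return the (lower, upper) bounds of the class holding the middle observation."""
    total = sum(freq for _, _, freq in table)
    middle = (total + 1) / 2  # position of the median in the ordered data
    running = 0
    for lower, upper, freq in table:
        running += freq  # cumulative frequency up to and including this class
        if running >= middle:
            return lower, upper
    raise ValueError("frequency table is empty")


def grouped_mean_sd(table):
    """Approximate mean and sample SD, ASSUMING all values sit at class midpoints."""
    total = sum(freq for _, _, freq in table)
    mean = sum(freq * (lo + hi) / 2 for lo, hi, freq in table) / total
    squares = sum(freq * ((lo + hi) / 2 - mean) ** 2 for lo, hi, freq in table)
    return mean, (squares / (total - 1)) ** 0.5


print("median class:", median_class(classes))

mean_all, sd_all = grouped_mean_sd(classes)
mean_trim, sd_trim = grouped_mean_sd(classes[:-1])  # drop the single 36-40 child
print(f"with the 36-40 child:    mean = {mean_all:.2f}, sd = {sd_all:.2f}")
print(f"without the 36-40 child: mean = {mean_trim:.2f}, sd = {sd_trim:.2f}")
```

Comparing the two printed lines shows the direction in which each measure moves when the extreme observation is dropped; the explanation of why is still the point of part (f).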
Question 3
Table 1 and Figure 1 report the descriptive statistics and a histogram of the monthly income
[in thousands of Rupiah, the Indonesian currency] earned by Indonesian adults in the year
2000.
Table 1
ii. Give an intuitive explanation for why the mean of a positively skewed distribution is
usually bigger than the median. [2 marks]
iii. Note the minimum value. What do you think is happening with a person whose income
is zero? [1 mark]
iv. In this data the mode = 0 Rupiah. Often the mode is not a very interesting or relevant
measure, but in this case, it is quite useful. Why is it not appropriate to use the mode in
some cases, and why in this case is it an interesting quantity? What does it tell us?
[3 marks]
Part B: Now consider variation in the data.
i. Give a simple interpretation of the value for the standard deviation. [2 marks]
ii. The formula for the standard deviation is given below. Explain in intuitive terms how the
standard deviation is constructed and hence how it measures “average variation”.
$$S = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$$
[3 marks]
iii. When looking at an income distribution, it can be argued that individuals with income = 0
should be removed from the distribution. What would you expect to see happen to the
standard deviation if those with income = 0 were removed from this data? Explain your
reasoning. [3 marks]
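The construction of the sample standard deviation in Part B (ii) can be followed step by step in a short Python sketch. The income values are hypothetical and chosen only to make the arithmetic easy to follow; the function name `sample_sd` is invented for this example.

```python
import math
import statistics


def sample_sd(xs):
    """Sample standard deviation, built up one step at a time."""
    n = len(xs)
    mean = sum(xs) / n                            # 1. find the centre of the data
    deviations = [x - mean for x in xs]           # 2. distance of each value from the mean
    squared = [d ** 2 for d in deviations]        # 3. square so negatives do not cancel
    average_squared = sum(squared) / (n - 1)      # 4. "average" the squared distances
    return math.sqrt(average_squared)             # 5. square root returns the original units


# HYPOTHETICAL monthly incomes in thousands of Rupiah (assumed for illustration).
incomes = [0, 200, 350, 500, 800, 1200]

print(f"step-by-step SD: {sample_sd(incomes):.2f}")
print(f"library SD:      {statistics.stdev(incomes):.2f}")
```

The two printed values agree because `statistics.stdev` also uses the n - 1 divisor.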
Figure 1
i. What is the most frequently occurring income category and how does it compare to the mean
and median? [2 marks]
ii. Approximately, what is the probability that a person in this sample earned more than 5,000,000
Rupiah per month? How many people in the sample are in this income category, and what
range of values does this category cover? [4 marks]
iii. Work out approximately how much more a typical person in the top 20% of incomes earns
compared to someone in the bottom 20%. Note: to work out the top 20%, it may be easier
to work from the lowest 80%. [5 marks]
iv. Explain what we learn from parts i to iii above about inequality in the distribution of income.
[4 marks]
Question 4
A large wholesale company is evaluating the performance of its two salespeople over the current
financial year. The organisation’s 200 clients are allocated between the two salespeople at the
beginning of the financial year, and each salesperson’s performance is gauged by the value of
sales made to each client. The company wants to know whether one salesperson achieved higher
average sales than the other, and whether this performance differs by the gender of the client.
Descriptive Statistics output for Sales, by salesperson, is given in the table below.
[Note: Sales = value of merchandise sales made to the client in the current year, $.]
i. Carefully interpret the Mean figures in the output. What does this tell you about the sales
performance of Salesperson A compared to Salesperson B? [3 marks]
ii. Carefully interpret the Standard Deviation figures in the output. What does this tell you about the
sorts of clients the company sells to, and how the sales performance differs between Salesperson
A and Salesperson B? [5 marks]
iii. Why are we interested in average sales rather than the total value of sales for each salesperson in
this case? [2 marks]