Unit 1 - Bidimensional
1) A local wine-tasting contest received fifty different wines from the nearby towns. For each wine,
the number of years of fermentation and the alcohol content were recorded as a pair (x, y). The
data on the fifty contestants are the following:
(3,11) (4,13) (3,11) (3,12) (3,12) (2,13) (3,11) (2,13) (2,13) (2,12) (4,12) (2,12) (3,12) (3,11)
(2,12) (4,12) (4,12) (4,13) (4,13) (4,12) (3,13) (3,12) (4,12) (4,12) (2,13) (2,12) (3,13) (3,11)
(3,13) (2,11) (3,11) (3,13) (2,12) (2,12) (4,12) (3,12) (2,11) (3,11) (3,13) (3,11) (3,12) (3,12) (3,12)
(3,12) (2,12)
2) The table below shows the number of pickup vans (X) and trucks (Y) of four different transport
companies. Determine whether the two variables are independent.
xi yi nij
1 2 3
2 2 2
1 3 9
2 3 6
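As a reminder of the criterion involved, two variables are independent when every joint frequency satisfies n_ij · N = n_i· · n_·j, that is, when each cell equals the product of its marginal totals divided by N. The following is only a minimal sketch of that check; the function name `is_independent` and both example tables are hypothetical and are not taken from the exercise.

```python
def is_independent(joint):
    """Check independence of a joint absolute-frequency table.

    `joint` maps (x, y) -> n_ij; cells that are absent count as 0.
    Uses integer arithmetic: n_ij * N == n_i. * n_.j for every cell.
    """
    total = sum(joint.values())
    row_totals = {}  # marginal frequencies of X
    col_totals = {}  # marginal frequencies of Y
    for (x, y), n_ij in joint.items():
        row_totals[x] = row_totals.get(x, 0) + n_ij
        col_totals[y] = col_totals.get(y, 0) + n_ij
    return all(
        joint.get((x, y), 0) * total == row_totals[x] * col_totals[y]
        for x in row_totals
        for y in col_totals
    )


# Hypothetical tables, for illustration only.
proportional_table = {(0, 0): 2, (0, 1): 4, (1, 0): 3, (1, 1): 6}
diagonal_table = {(0, 0): 4, (0, 1): 1, (1, 0): 1, (1, 1): 4}

print(is_independent(proportional_table))
print(is_independent(diagonal_table))
```

Comparing n_ij · N with the product of the marginals avoids divisions, so the check is exact for integer frequencies.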
3) The table below shows the number of workers of a certain company, organized
by the number of weekly hours of work (X) and their monthly salaries (Y), in euros.
31-35 5 4 2 1 0
35-37 1 2 4 3 3
37-41 0 3 4 2 6
(a) Find the average of weekly work hours. If every employee works two extra hours per week,
what will be the new average?
(b) Find the most common salary for those employees working more than 35 hours a week.
(c) Determine the quartiles for the number of work hours of those employees whose salaries are
between 2750 and 4750 euros.
(d) Study the concentration of salaries.
(e) Determine the distribution of work hours for the salaries below 3750 euros, in relative
frequencies.
4) The government of Spain requires people with one of the following diseases to report it to their
local health center: tuberculosis, typhus, measles and meningitis. The information has been
collected and is presented in the following table, for patients aged 25 or younger. The data are in
thousands of people.
Age
(a) Determine the marginal distributions of the variables age and disease.
(b) Determine the age distribution of the disease “tuberculosis”, and the disease distribution
of the age group from 10 to 15 years.
(c) Compute the average age, as well as the average age of those patients suffering from
meningitis.
5) Let X and Y be two variables with the following joint absolute frequencies:
X 2 13 15 20 23 25
Y
4 5 13 28 1 4 18
7 4 20 33 3 2 6
14 3 16 11 4 3 7
15 14 8 13 16 5 2
17 8 14 24 21 3 3
Calculate:
(a) The marginal distributions of variables X and Y.
(b) The conditional distributions X/Y=13 and Y/X=15.
(c) The means and variances of X, Y, X/Y=13 and Y/X=15.
6) An economist studies the relation between two variables: the price of gold throughout the 20th century (X) and
the net profits of stock markets (in constant 2017 euros) during the same years (Y). The
economist now wants to apply the following transformations to these variables:
(a) The gold prices, expressed in dollars, have to be converted to constant 2017 euros. We call
the new variable W.
(b) The net profits have to be increased by p%, and an extra profit of E € from an additional
investment has to be added. We call the new variable Z.
$S_X^2 = 123$
$S_W^2 = 78.72$
$S_{XY} = 66$
$S_{WZ} = 58.08$
Obtain p.
7) A company has surveyed 200,000 people, asking them which kind of beer they consume. The results
are classified by consumer age in the following table (in thousands):
Kind of beer
26-45 33 30 2
46-60 30 19 1
61-85 5 9 1
(a) Obtain the relative frequency distribution of the different kinds of beer analyzed.
(b) Calculate the average age of the low-fermentation beer consumers. Is it representative?
(c) Which kind of beer is the most commonly consumed among consumers aged 26 to 60?
(d) What is the age of the youngest consumer among the oldest 20%?
8) In a survey of 480 families, we obtained the following data about monthly
incomes (X) and savings account deposits (Y) in banks.
X
0-200 200-500 500-2000 2000-10000
Y
50-100 40 12 8 -
100-150 16 48 12 4
150-250 8 80 92 20
250-500 4 40 72 24
Assuming that the class marks are representative of each interval:
(a) Calculate the values of n1 , n2 , n2 and n3.
(b) Express, as proportions (per one), the values of: f12, f23, f34, f42, f2, f3, f2 and f4.
(c) Express, as percentages, the values of: f13, f21, f32, f44, f1, f3, f3 and f4.
(d) Express, as proportions (per one), the values of:
f(X1/Y=350) f(X2/Y=1250) f(Y1/X=375) f(Y2/X=200)
(e) Express, as percentages, the values of:
f(X3/Y=1250) f(X4/Y=6000) f(Y2/X=375) f(Y3/X=200)
(f) Calculate the marginal averages of X and Y.
9) A company has started selling a new product. In a survey, 10 people were asked to rate the
following variables from 1 to 5 (worst to best); the results appear in the table below. The
meaning of each variable is as follows:
X1 = "global valuation of the product" X2 = "price-quality ratio"
X3 = "product capability" X4 = "advertising campaign"
X1 X2 X3 X4
1 2 2 1
3 4 3 4
5 3 4 5
3 4 3 4
2 3 2 2
3 4 3 4
4 5 5 5
1 2 2 1
3 4 2 3
4 2 4 4
Calculate:
(a) The four one-dimensional marginal distributions.
(b) The distribution of the advertising campaign scores among respondents who gave a score of 3 to the
global valuation of the product.
10) The joint distribution of the cultivated area in hectares (variable X) in a certain province and the
wheat production in tons (variable Y) for the year 2000 is shown in the following table:
X
Y [1.5, 2.5) [2.5, 3.5) [3.5, 4.5) [4.5, 5.5)
(1, 2] 3 4 6 9
(2, 3] 4 5 8 11
(3, 4] 5 8 11 13
(4, 5] 4 7 9 10
(a) Obtain the averages and variances of the marginal variables.
(b) Obtain the average and variance of variable X conditional on 3.5 ≤ Y < 4.5.
(c) Find the covariance.
11) The following data show the results of a survey: for each territory in Spain, we have the number of
unemployed people per 1000 active people (X) and the number of civil servants per 1000 people (Y). Use
the data to:
12) A survey studies the relation between income in millions of euros (X) and number of employees
(Y) per company.
In class we have seen how to obtain the covariance from a bivariate frequency table. You can
build the table from this dataset and apply the formula.
An alternative is to use the formula that calculates the covariance directly from the raw data. In this form,
without frequency weights, each term of the sum is a single observation, so it applies directly when no pair
of values (X,Y) is obtained more than once:
$$S_{xy} = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})$$
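The raw-data formula above could be sketched in code roughly as follows. This is only an illustration: the function name `covariance` and the sample values are hypothetical, since the dataset of this exercise is not reproduced here. The sum is divided by N, as in the formula, not by N - 1.

```python
def covariance(xs, ys):
    """Covariance from raw paired data: (1/N) * sum((x_i - mean_x) * (y_i - mean_y))."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # One term per observation, with no frequency weights.
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Hypothetical values, not the data of the exercise.
sample_x = [1.0, 2.0, 3.0, 4.0]
sample_y = [2.0, 4.0, 6.0, 8.0]

print(covariance(sample_x, sample_y))
```

Library routines such as `numpy.cov` divide by N - 1 by default, so their results differ slightly from this formula unless they are configured otherwise.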
Find:
(a) The covariance between variables X and Y. Interpret the result.
(b) A local currency (LC) has the following exchange rate: 1 LC = 2 €. If we convert the
incomes into this local currency, how will this affect the covariance?
(c) If every company increases its number of employees by 10, how will this affect the
covariance?
13) We simultaneously throw 24 pairs of dice (X,Y), obtaining the following results:
(1,2) (2,3) (2,1) (3,1) (4,6) (1,6) (4,1) (5,2) (3,6) (3,4) (5,3) (4,2)
(2,5) (5,1) (4,2) (1,6) (6,2) (5,1) (1,6) (3,4) (3,5) (4,1) (4,2) (6,5)