Data Handling Learner notes
MATHEMATICAL
LITERACY
LEARNER NOTES
DATA HANDLING
2023
PLEASE NOTE:
It is of utmost importance that you study and know the definitions e.g. mean, mode and
range. The definition already explains the calculation that must be done.
Data is raw information that has been collected, without any organisation or analysis. It is
unprocessed.
Data Handling refers to the process of collecting, organising, summarising, representing and
analysing information. It means gathering and recording information and then presenting it in
a way that is meaningful to others.
Data Handling: Developing questions; Collecting data; Summarising data; Classifying and
organising data; Representing data; Interpreting and analysing data
DEVELOPING QUESTIONS
The first step in the statistical process is to develop or pose questions.
When developing/posing the question, you must first identify the main question, followed by
sub-questions.
QUESTION 1 - EXAMPLE
Main question - what is the average monthly income of people in your community?
Sub-questions
In which age category do you fall?
In which sector/industry do you work?
What is your job title?
How long have you been working in this job?
QUESTION 2
Formulate 3 sub-questions for the main question below that will enable meaningful data
collection:
Are the expenses incurred for a Matric dance justified?
QUESTION 3
Formulate 3 sub-questions for the main question below that will enable meaningful data
collection:
How can your school's matric pass rate be improved?
COLLECTING DATA
Methods of collecting data:
1. Observation – e.g. counting the number of people entering a store. This is the method
of collecting data by watching and recording the results. The advantage of this method
is that you don’t interact with people to get the response.
2. Interview – e.g. asking your fellow learners their opinion of the design for your matric
jacket. The interviewer asks the interviewee questions and records the response. The
advantage of this method is that the interviewer may ask further questions if the
response is vague.
3. Survey – e.g. learners complete a questionnaire on cool drink preferences for the tuck
shop. A questionnaire is a list of questions used to collect data from the respondents. It
is the tool used to conduct a survey and can be completed online, in person, by
telephone etc. Questionnaires should be short, simple and unbiased. Questions must be
clear and not long, and answers must also be concise. Questionnaires must be
anonymous and confidential: participants do not have to identify themselves.
The advantage of using this method is that you get the information directly from the
participants.
Population – the entire group of interest, e.g. all the learners at a school.
Sample – a representative part of the population, e.g. a randomly selected number of learners
per grade. A sample must be representative, randomly chosen, large enough and free from bias.
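The idea of randomly selecting a number of learners per grade can be sketched in Python. This is only an illustration: the learner names, the three grades, the 40 learners per grade and the sample size of 5 are all made-up assumptions, not values from these notes.

```python
import random

random.seed(2023)  # fixed seed so the sketch selects the same learners on every run

# Hypothetical population: 40 learners in each of Grades 10 to 12.
population = {
    grade: [f"Grade {grade} learner {i}" for i in range(1, 41)]
    for grade in (10, 11, 12)
}

SAMPLE_PER_GRADE = 5  # assumed sample size per grade

# random.sample picks learners without repeats, so nobody is chosen twice.
sample = {
    grade: random.sample(learners, SAMPLE_PER_GRADE)
    for grade, learners in population.items()
}

for grade, chosen in sample.items():
    print(grade, chosen)
```

Selecting the same number of learners from every grade keeps each grade represented, and the random choice helps to avoid bias.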
QUESTION 1
Susan will be managing the new tuck shop at your school, so she decided to hand out
questionnaires to the learners in order to do market research.
Draw up a questionnaire Susan can use in order to gather the information she requires.
QUESTION 2
A researcher is interested in the effect of a high-sugar snack on the energy levels of primary
school learners. A group of 250 primary school learners was selected. Half are tested while
consuming the high-sugar snack and the other half are tested without consuming the snack.
2.1 Identify the population
2.2 Identify the sample
CLASSIFYING DATA
Organising data is taking information and arranging it into some kind of order (such as
ascending or descending order).
Classifying data means organising it in groups or classes, based on some common feature.
NUMERICAL DATA:
CATEGORICAL DATA:
is generally descriptive in nature, as data is classified and organised into categories.
data is usually observed, but not measured.
examples: textures, smells, tastes, gender, eye colour and country of birth.
categorical data can consist of “yes” and “no” answers.
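Classifying categorical data into groups amounts to counting how often each category occurs. A minimal Python sketch of such a tally follows; the ten eye-colour responses are hypothetical and serve only to show the idea.

```python
from collections import Counter

# Hypothetical survey responses (categorical data): eye colour of ten learners.
responses = ["brown", "blue", "brown", "green", "brown",
             "blue", "brown", "hazel", "brown", "blue"]

# Counter groups identical answers and counts each group.
frequency_table = Counter(responses)

# most_common() lists the categories from the most to the least frequent.
for colour, count in frequency_table.most_common():
    print(f"{colour:<6} {count}")
```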
SUMMARISING DATA
1 3 5 6 8
Median = 5
1 3 5 7 8 9
Median = (5 + 7) / 2 = 6
Mode = the value in the data set that appears most often.
There may be more than one mode, or no mode at all.
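The two median examples above can be checked with Python's standard statistics module. The sketch also shows the mean, mode and range on a small set of marks. The marks are hypothetical, and the mean (sum of the values divided by the number of values) and the range (highest value minus lowest value) are the usual definitions, stated here as an assumption because they are not written out in this section. Note that Python writes decimals with a point (6.0), where these notes use a comma.

```python
from statistics import mean, median, multimode

odd_set = [1, 3, 5, 6, 8]        # five values: the middle value is the median
even_set = [1, 3, 5, 7, 8, 9]    # six values: average the two middle values

median_odd = median(odd_set)     # middle value, 5
median_even = median(even_set)   # (5 + 7) / 2

# Hypothetical test marks, used only to illustrate mean, mode and range.
marks = [4, 7, 7, 9, 3]
mean_marks = mean(marks)                 # sum of the values / number of values
modes = multimode(marks)                 # list of the value(s) that appear most often
range_marks = max(marks) - min(marks)    # highest value - lowest value

print(median_odd, median_even, mean_marks, modes, range_marks)
```

multimode returns a list because a data set can have more than one mode.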
MEASURES OF SPREAD
Example A:
1 2 3 4 5 6 7 8 9 10 11
Q1 = 3 Q2 = 6 Q3 = 9
Example B:
1 2 3 4 5 6 7 8 9 10 11 12 13 14
Q1 = 4 Q2 = 7,5 Q3 = 11
Interquartile range = Q3 – Q1
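The quartiles and the interquartile range can be sketched in Python. The sketch assumes the "median of each half" method: Q2 is the median of the whole data set, Q1 is the median of the values below the middle and Q3 is the median of the values above it. That assumption reproduces the Example B values; other quartile conventions exist and can give slightly different answers. The function name quartiles is illustrative.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]            # values below the middle of the data set
    upper = data[half + n % 2:]    # values above it (the middle value is skipped when n is odd)
    return median(lower), median(data), median(upper)

example_b = list(range(1, 15))     # 1, 2, ..., 14
q1, q2, q3 = quartiles(example_b)
iqr = q3 - q1                      # interquartile range = Q3 - Q1

print(q1, q2, q3, iqr)
```

For Example B this gives Q1 = 4, Q2 = 7,5 and Q3 = 11, so the interquartile range is 11 – 4 = 7.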
Five-point summary: It consists of the following values in the data set:
1. Minimum value
2. Q1
3. Q2 (Median)
4. Q3
5. Maximum value
E.g. 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
The position of the 30th percentile: 30/100 × (n + 1)
(n = number of data values in the data set)
30/100 × (20 + 1) = 6,3
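The position calculation for a percentile can be written as a short Python function. The function name percentile_position is illustrative. Python shows the answer with a decimal point (6.3) where these notes use a comma (6,3).

```python
def percentile_position(p, n):
    """Position of the p-th percentile in an ordered data set of n values."""
    return p * (n + 1) / 100

# 30th percentile of the 20 ordered values 1, 2, ..., 20
position = percentile_position(30, 20)
print(position)
```

A position of 6,3 lies between the 6th and the 7th value of the ordered data set.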
GROWTH CHARTS
Growth charts provide an indication of the typical weight, age and height growth patterns of
children and babies.
The concept of percentiles is used in growth charts.
The curves on the growth chart below represent the percentile values of the data
collected from different age groups.
The growth chart is used to compare the BMI (body mass index) of a child to others
in the same age group.
It is also used to determine the health status of the baby.
EXAMPLES
1. What is the BMI of a 4-year-old girl at the 95th percentile?
2. The couple’s 10-year-old child has a BMI of 16 kg/m². Between which percentile
curves does her BMI lie?
Solutions:
1. Draw a vertical line upward from 4 years to the 95th percentile.
Draw a horizontal line across to find the relevant BMI.
The BMI is 18 kg/m².
BOX AND WHISKER PLOTS
Box and whisker plots are graphical representations of the five-number summary of a
set of data.
The five-number summary:
1. Minimum value
2. Lower quartile (𝑄1)
3. Median (𝑄2)
4. Upper quartile (𝑄3)
5. Maximum value
EXAMPLE
Read from the box and whisker plot the values of the five number summary.
Solution:
Minimum value 70
Lower quartile (Q1) 100
Median (Q2) 110
Third quartile (Q3) 115
Maximum Value 120
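Once the five-number summary has been read off a box and whisker plot, the range and the interquartile range follow directly. A minimal Python sketch using the values from the solution above; the dictionary keys are illustrative names, and the range is taken as maximum minus minimum, which is the usual definition.

```python
# Five-number summary read from the box and whisker plot in the example.
summary = {"minimum": 70, "q1": 100, "median": 110, "q3": 115, "maximum": 120}

data_range = summary["maximum"] - summary["minimum"]   # range = maximum - minimum
iqr = summary["q3"] - summary["q1"]                    # interquartile range = Q3 - Q1

print(data_range, iqr)
```

The range (50) is the distance between the ends of the two whiskers and the interquartile range (15) is the length of the box.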
QUESTION 1
1.1 Determine the total number of people living in rural areas. (3)
QUESTION 2
The population of South Africa, per province, gender and population group for 2016 is
shown on TABLE 2 on ANNEXURE A.
2.1 Which province has the most black, male persons and how many are they? (3)
2.2 Which ONE of the following represents the total number of coloured people in
South Africa in 2016?
2.3 Identify the population group and provinces that have the exact same number
of male and female persons. (2)
2.6 Express the number of Asian female persons in Gauteng to the total number of
persons in Gauteng as a ratio in the form 1 : ... (3)
[15]
ANNEXURE A
QUESTION 2
POPULATION OF SOUTH AFRICA, PER PROVINCE, GENDER AND POPULATION GROUP FOR 2016 (IN THOUSANDS)
              | Black                    | Coloured                 | Asian                    | White                    | Total
Province      |   Male | Female |  Total |   Male | Female |  Total |   Male | Female |  Total |   Male | Female |  Total |   Male | Female |  Total
Western Cape  |  1 062 |  1 057 |  2 118 |  1 523 |  1 636 |  3 159 |     18 |     19 |     36 |    525 |    524 |  1 049 |  3 127 |  3 236 |  6 362
Eastern Cape  |  2 852 |  3 117 |  5 969 |    253 |    283 |    536 |      7 |      4 |     11 |    101 |    114 |    215 |  3 213 |  3 518 |  6 731
Northern Cape |    312 |    333 |    645 |    235 |    239 |    474 |      2 |      - |      2 |     34 |     37 |     71 |    583 |    609 |  1 192
Free State    |  1 146 |  1 275 |  2 420 |     53 |     45 |     98 |      8 |      4 |     12 |    103 |    136 |    239 |  1 310 |  1 459 |  2 769
KwaZulu-Natal |  4 647 |  5 013 |  9 660 |     56 |     56 |    112 |    362 |    410 |    772 |    127 |    135 |    262 |  5 192 |  5 614 | 10 807
North West    |  1 744 |  1 723 |  3 467 |     20 |     24 |     45 |      9 |     10 |     19 |    104 |    124 |    228 |  1 877 |  1 881 |  3 758
Gauteng       |  5 335 |  5 175 | 10 511 |    210 |    225 |    436 |    254 |    212 |    466 |  1 034 |  1 096 |  2 130 |  6 834 |  6 709 | 13 543
Mpumalanga    |  1 966 |  2 053 |  4 019 |      9 |      6 |     15 |      9 |      9 |     18 |    116 |    122 |    238 |  2 100 |  2 190 |  4 290
Limpopo       |  2 643 |  2 902 |  5 537 |     13 |     18 |     32 |     32 |     16 |     48 |     59 |     50 |    109 |  2 739 |  2 986 |  5 724
South Africa  | 21 698 | 22 648 | 44 346 |  2 373 |  2 533 |  4 906 |    700 |    684 |  1 384 |  2 203 |      A |  4 540 | 26 974 | 28 202 | 55 176
[Adapted from www.statssa.gov.za]
QUESTION 3
The number of learners, teachers and schools in the school sector of South Africa is
indicated per province for 2016 in TABLE 3.
Use TABLE 3 and the information above to answer the questions that follow.
3.1 Which province had the most learners in private schools in 2016? (2)
3.2 Which provinces have less than the mean number of teachers per province for
public schools? (4)
3.3 Determine the median value of teachers per province for private schools. (2)
3.4 Calculate the range for the number of learners in public schools for all nine
provinces. (2)
[10]
QUESTION 4
4.1
TABLE 4 below shows the number of people per province working in TWO
workplaces, namely Usual Workplace (UWP) and Work From Home (WFH) for the
last quarter of 2020 and the first quarter of 2021.
4.1.1 Show how the total value of 83,5 for South Africa was calculated. (2)
4.1.2 Give ONE reason why the values in the table will differ from the actual
workplace values. (2)
4.1.3 Write down the number of people who worked at their usual workplaces
(UWP) in Gauteng during the first quarter of 2021. (2)
4.1.4 Give ONE example of a job that cannot be done by working from home. (2)
4.1.5 Calculate the mean number of people in the WFH category for South Africa
in the last quarter of 2020. (4)
[12]
REPRESENTING, INTERPRETING AND ANALYSING DATA
The following representations of data can be drawn:
Discrete data
Compound (stacked) bar graphs
Histograms: Histograms are different from bar graphs in that they represent continuous data.
Data that is displayed on a histogram is also grouped. There are no spaces between the bars.
Pie charts: Pie charts are circular graphs, divided into sectors. They are used to show the
parts that make up a whole. They can be useful for comparing the size of relative parts. The
information is often presented as percentages that must add up to 100%. They are often used
in media to show clear and important differences, but they cannot show shape and spread of
data.
Scatter plots: A scatter plot is the most useful graph for studying the relationship
(correlation) between two variables.
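Because a pie chart shows the parts that make up a whole, each category's count can be converted to a percentage of the total, and the percentages must add up to 100%. A minimal Python sketch follows. The travel counts are hypothetical, and the sector angle (the category's share of the 360 degrees in a circle) is included only as a help for drawing the chart.

```python
# Hypothetical counts: how 40 learners travel to school.
counts = {"Walk": 12, "Taxi": 18, "Bus": 6, "Car": 4}
total = sum(counts.values())

# Percentage of the whole and sector angle (in degrees) for each category.
percentages = {mode: count * 100 / total for mode, count in counts.items()}
angles = {mode: count * 360 / total for mode, count in counts.items()}

for mode in counts:
    print(f"{mode:<5} {percentages[mode]:5.1f}%  {angles[mode]:5.1f} degrees")

print(sum(percentages.values()))   # the percentages add up to 100
```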
SCATTER PLOT
A scatter plot is the most useful graph for studying the relationship (correlation) between two
variables. It shows one of the variables on the horizontal axis and the other variable on the
vertical axis. The resulting scatter plot of points will show at a glance whether a relationship
exists. You cannot have more than two sets of data on a scatter plot.
A scatter plot can show:
• positive correlation
• negative correlation
• no correlation.
• When looking for patterns, remember that the more tightly the points are clustered, the
stronger the correlation between the variables you have plotted.
• If you find a pattern that slopes from the lower left to the upper right, this tells you that as x
increases, y also increases. This means there is a “positive” correlation between the two
variables.
• If you find a pattern that slopes from the upper left to the lower right, this tells you that as x
increases, y decreases. This means there is a “negative” correlation between the two variables.
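The direction and strength of a correlation can also be expressed as a number. The Python sketch below calculates Pearson's correlation coefficient r, which is one common measure and is offered here only as an illustration beyond what these notes require. A value of r close to +1 indicates a strong positive correlation and a value close to -1 a strong negative correlation. The data (hours studied and test marks) are hypothetical.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient r for two lists of equal length."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: hours studied (x) and two possible sets of test marks (y).
hours = [1, 2, 3, 4, 5]
marks_rising = [40, 50, 55, 70, 80]    # slopes from lower left to upper right
marks_falling = [80, 70, 55, 50, 40]   # slopes from upper left to lower right

r_rising = correlation(hours, marks_rising)
r_falling = correlation(hours, marks_falling)

print(round(r_rising, 2), round(r_falling, 2))
```

With this data r_rising is positive and r_falling is negative, which matches the two slope descriptions above.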
QUESTION 5
The number of unemployed people in Quarter 2 was 7,6 million, which is 183 000 less
than in Quarter 3.
The graph below indicates the unemployment rate for the different genders and the total
for South Africa for the first three quarters of 2021.
[Graph: unemployment rate by gender and in total for Quarter 1, Quarter 2 and Quarter 3 of 2021]
5.2.1 Write down the quarter which showed the highest rate of unemployed men. (2)
QUESTION 6
A box and whiskers plot is given below, as well as terms that describe the different letters
on the diagram.
A B C D E
TERMS:
Median ; Maximum ; Quartile 3 ; Minimum ; Quartile 1
6.1 Provide labels for the box and whiskers plot by matching the terms with the
letters shown on the diagram. Write ONLY the letter and correct term. (5)
QUESTION 7
The pie charts on ANNEXURE B compare the five best-selling vehicles in South
Africa, America and Canada for 2021.
7.1 Write down, in words, the total number of vehicles sold in America. (2)
7.2 Express as a ratio in the form __ : __ : __, the number of Toyota RAV4s
sold in America, Canada and South Africa respectively. (2)
7.3 Write down the median number of the best-selling vehicles in South Africa. (2)
7.4 Determine the number of Ford F-series vehicles sold in Canada. (3)
7.5 The interquartile range for the top 10 vehicles sold in South Africa is 7 669 and
the value of Quartile 1 is 11 408.
ANNEXURE B
QUESTION 7
COMPARISON OF THE FIVE BEST-SELLING VEHICLES IN SOUTH AFRICA, AMERICA AND CANADA FOR 2021
SOUTH AFRICA: Toyota RAV4, VW Polo Vivo, Ford Ranger, VW Polo, Isuzu D-Max
TOTAL NUMBER OF VEHICLES SOLD = 111 710
AMERICA: Ford F-series, Ram Pickup, Chev Silverado, Toyota RAV4, Honda CR-V
TOTAL NUMBER OF VEHICLES SOLD = 2 584 176
CANADA: Ford F-series, Ram Pickup, Toyota RAV4, GMC Sierra, Chev Silverado
TOTAL NUMBER OF VEHICLES SOLD = 357 243
QUESTION 8
[Percentage occupancy is the percentage of all rental units that are rented out at a given
time.]
8.1 The average daily rate in Kula remained almost the same from 2011 to 2014.
Explain your observations regarding the percentage occupancy in Kula during the
same period. (4)
8.2 Compare the relationship between the average daily rates and the percentage
occupancy in Ubud for the year to date (YTD) Sep. 2014 to YTD Sep 2015. (4)
8.3 Explain why both graphs have a gap between 2014 and YTD September 2014. (4)
[12]
ANNEXURE C
QUESTION 8
AVERAGE DAILY RATES AND OCCUPANCY FOR DIFFERENT REGIONS FROM 2010 TO SEP. 2015
[Two graphs. Vertical axes: average daily rate in USD (0 to 400) and percentage occupancy
(50 to 85). Horizontal axis of each graph: 2010, 2011, 2012, 2013, 2014, YTD Sep 2014,
YTD Sep 2015]
QUESTION 9
9.1
TABLE 5 shows the types of voting stations (VSs) used during the 2016 local
government elections in South Africa.
Use TABLE 5 and the information above to answer the questions that follow.
9.1.2 State the province which has the most voting stations. (2)
9.1.3 Determine the mean number of voting stations (VSs) in South Africa. (3)
9.1.4 Write down the modal number of mobile voting stations in South
Africa. (2)
9.1.7 The bar graph on the ANSWER SHEET shows the total number of voting
stations.
On the same ANSWER SHEET, the first three bars are drawn showing
the permanent voting stations.
Draw the remaining bars showing the permanent voting stations. (6)
9.2
The TWO pie charts below show why and how people in South Africa travel.
Study the TWO pie charts above and answer the questions that follow.
9.2.1 Calculate the percentage of people whose reason for travel is sport. (2)
Calculate the number of people who travel to visit family and friends. (2)
[26]
ANSWER SHEET
QUESTION 9.1.7
Types of voting stations used during the 2016 local government elections
[Bar graph grid. Vertical axis: number of voting stations, 0 to 5000 in steps of 500.
Horizontal axis: Gauteng, Free State, KwaZulu-Natal, Mpumalanga, Western Cape, Limpopo,
Eastern Cape, North West, Northern Cape]
QUESTION 10
10.1
TABLE 6 below shows the estimated provincial half-yearly livestock numbers
(in thousands) for the nine provinces in South Africa for August 2020 and February 2021.
Use TABLE 6 and the information above to answer the questions that follow.
10.1.1 Write down the province with the second highest number of sheep for
February 2021. (2)
10.1.2 Calculate Eastern Cape's estimated total number of livestock for August 2020. (3)
A farmer in Limpopo stated that the missing value A in the table is less than
200.
Verify, showing ALL calculations, whether the farmer's statement is valid. (7)
10.2 South Africa's agricultural sector sales in 2019 amounted to R317,6 billion.
Use ANNEXURE D and the information above to answer the questions that follow.
10.2.3 Calculate, in millions, the actual rand value of horticulture sales. (3)
10.2.4 Give a valid reason why there is a category for other livestock under animals. (2)
[21]
ANNEXURE D
QUESTION 10.2
DISTRIBUTION OF R317,6 BILLION SALES IN SOUTH AFRICA'S AGRICULTURAL SECTOR IN 2019