
1A Sources of Data


Topic 1 Sources of Data

CXC CAPE APPLIED MATHEMATICS


UNIT 1: STATISTICAL ANALYSIS
MODULE 1 : COLLECTING & DESCRIBING DATA
TOPIC A : SOURCES OF DATA

Students should be able to:


✓ Distinguish between qualitative and quantitative data, and discrete and continuous data
✓ Distinguish between a population and a sample, a census and sample survey, and a parameter and
a statistic
✓ Identify an appropriate sampling frame for a given situation
✓ Explain the role of randomness in statistical work
✓ Explain why sampling is necessary
✓ Outline the ideal characteristics of a sample
✓ Distinguish between random and non-random sampling
✓ Distinguish among the following sampling methods – simple random, stratified random,
systematic random, cluster and quota
✓ Use the ‘lottery’ technique or random numbers (from a table or calculator) to obtain a simple
random sample
✓ Outline the advantages and disadvantages of simple random, stratified random, systematic
random, cluster and quota sampling

TYPES OF DATA
Qualitative data – tends to be non-numerical because it looks at characteristics, descriptions, opinions,
feelings, estimates.
Quantitative data – tends to be numerical because it is measurable. Therefore, it will often have units.
Discrete data – can take only certain values within a range.
Continuous data – can be any value, and is only restricted by the level of accuracy of the measuring
instrument used.
Examples: State the type of data for each example given below:
1. What something smells like
2. The number of hours someone studied
3. Your favourite place to go
4. How good food tastes
5. The sound level on a stereo system
6. Your score on your last Maths test
7. The height of your favourite sports personality
8. The weight of your pet
9. The most popular names for babies each year

Queen’s College
Mathematics Department
Mr. Goodridge, Mrs. Maxwell Page 1
Read Pgs 1 – 4 (Discrete & Continuous Data)

DEFINITIONS
Population – all of the elements, persons, animals or units that fall into the set or group under analysis.
Sample – a subset of the population which is studied in order to draw conclusions about the population
as a whole.
Survey – a way of gathering information about a population.
Census – a survey of an entire population.
Sample survey – a survey of only a part of the population.
Parameter – a numerical value that describes a characteristic of a whole population, e.g. the population
mean or variance. It is usually unknown and has to be estimated from survey data.
Statistic – a numerical value calculated from sample data, e.g. the sample mean, or the proportion of a
sample who do something. A statistic is used to estimate the corresponding population parameter.
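The difference between a parameter and a statistic can be sketched in Python. This is only an illustration: the list `scores` is an invented population of 200 test scores, and `sample_mean` is a helper name made up for this example.

```python
import random
import statistics

def sample_mean(population, sample_size, seed=None):
    """Return the mean of a simple random sample drawn without replacement."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    return statistics.mean(sample)

# Hypothetical population: invented test scores for 200 students.
rng = random.Random(1)
scores = [rng.randint(30, 100) for _ in range(200)]

parameter = statistics.mean(scores)          # population mean: uses every score
statistic = sample_mean(scores, 25, seed=2)  # sample mean: uses only 25 scores

print("Parameter (population mean):", parameter)
print("Statistic (sample mean):", statistic)
```

The statistic changes from sample to sample, while the parameter is fixed for a given population.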

Read Pgs 421 - 423 (Population & Surveys)


Do Pg 430 Ex 9a Q1

SAMPLE FRAMES
Once it has been determined exactly which group you are going to study (the target population) and how, a
comprehensive list of the members of that population must be created. This list is called the sample
frame. Each member on the list is given a number, which allows members to be referred to individually.
The members are listed in a logical and systematic way so that they can be located easily and so that the
list can be subdivided into more manageable pieces if necessary. The information that will be used to
find or contact each sample unit is included (telephone numbers, addresses, form, etc.). Where a full
list of individual members is too difficult to compile, you may instead list cluster groups to gather
information from.
Examples
1. Population: Students taking Applied Maths Unit 1 at Today’s Secondary
Sample Frame: 1. Band C – Appiana Holmes, L6Picasso 2. Band C – Dapple Athlete,
L6Einstein 3. Band D – Lionel Mathemati, L6Suzuki …
2. Population: Birds in Barbados
Sample Frame: 1. Doves 2. Pigeons 3. Herons 4. Egrets 5. Hummingbirds
6. Finches 7. Blackbirds 8. Sparrows ….
3. Population: Students at Queen’s College
Sample Frame: Nominal Roll
4. Population: Home owners in St. James
Target Population: Home owners from St. James listed in the directory (telephone survey)
Sample Frame: List of home owners in St. James who are listed in the directory
5. Population: Eligible voters
Sample Frame: Electoral register
6. Population: People who live in Bridgetown
Sample Frame: Map of Bridgetown with a list of streets to be surveyed (cluster groups)

Read Pgs 423 - 424 (Bias)

SAMPLING
Why is sampling necessary? Often a population is far too large for a census to be feasible. The amount of
resources required is too large to realistically tackle such a project – too many people to survey, too much
time needed to reach them (you need the information soon), too large a workforce needed to engage
everyone (you don’t have the money to pay them, the plant to house them, the computers for them to
input the data into, …). With this in mind, you pick a subset which you hope will tell you what you
want to learn about the population. You try not to bias which elements of the population you put
into your sample, so that you can feel confident that you have not skewed your analysis towards one or two
sections of the population. Thus you want to give each element an equal chance of being chosen for your
sample. This produces what is known as a random sample.

IDEAL CHARACTERISTICS OF A SAMPLE


1. The number of sample units must be large enough to allow the conclusions drawn from the sample to
represent the population (adequacy)
2. Every unit in the population should have the possibility of being included, and the choice of one
unit should not affect whether another unit is chosen (independence)
3. It truly represents the population – it has the same qualities/features/characteristics as the
population across its various segments (representative)

RANDOM VS NON-RANDOM SAMPLING
Any sample in which an effort is made to give each sample unit an equal opportunity to be chosen is
random. When this is not done, a non-random sample results. For example, you may decide on a sample of
convenience and poll all of the persons on your street, which gives persons living outside your area
no chance of being chosen.

Read Pgs 424 - 428 (Sampling Methods)

SAMPLING METHODS
Simple Random – Once you have a sample frame and each of the sample units is numbered, you can
randomly choose a number to pick a sample unit to be included in your sample. The methods used to
choose the random numbers will be explored more in the next section.
Stratified Random – The problem with simple random sampling is that although an effort has been
made to choose the members of the sample randomly, the members may still end up coming mostly from
one or two sections of the population. E.g. a survey of the students of this school may turn out having a
majority of students from 2nd form and lower 6th. If the survey is interested in the students’ opinions
about their timetable then this will clearly produce a biased viewpoint. To avoid this problem,
especially in cases where it is deemed important to get the viewpoints/measurements of all segments of
the population, the population is first stratified (i.e. divided into groups with similar
characteristics) and individual members are then chosen randomly out of each group. E.g. a
sample of student opinions at this school may first stratify students by form level and/or gender before
picking students to be surveyed.
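A minimal sketch of stratified random sampling in Python is given here. It assumes proportional allocation (each stratum contributes in proportion to its size), which is one common choice and not the only one. The form levels, student labels and the function name `stratified_sample` are invented for illustration.

```python
import random

def stratified_sample(strata, sample_size, seed=None):
    """Pick units at random from each stratum, in proportion to its size.

    strata maps a stratum name to the list of units in that stratum.
    """
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Proportional share, rounded; rounding can shift the overall total slightly.
        k = round(sample_size * len(units) / total)
        sample[name] = rng.sample(units, k)
    return sample

# Hypothetical school: three form levels with invented student labels.
school = {
    "2nd form": [f"2F-{i}" for i in range(1, 121)],   # 120 students
    "4th form": [f"4F-{i}" for i in range(1, 101)],   # 100 students
    "Lower 6th": [f"L6-{i}" for i in range(1, 81)],   # 80 students
}

chosen = stratified_sample(school, 30, seed=5)
for form, students in chosen.items():
    print(form, len(students))
```

Every form level is guaranteed a place in the sample, which simple random sampling cannot promise.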
Systematic Random – in this method every 𝑘th element is chosen to be included in the sample. E.g. if
your population has 1000 elements and you would like a sample of 100 elements then you would choose
every 10th element. The first element would be chosen randomly out of the first 10 elements and then
every 10th element after that would be chosen. For example, if the first element chosen randomly was the
3rd one, then the other positions chosen would be 13th, 23rd, 33rd, 43rd, 53rd, etc.
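The example above can be sketched in Python as follows. The function name `systematic_sample` is made up for this illustration, and it assumes the size of the frame divides evenly by the sample size.

```python
import random

def systematic_sample(frame, sample_size, seed=None):
    """Choose a random start in the first k units, then take every k-th unit."""
    rng = random.Random(seed)
    k = len(frame) // sample_size     # sampling interval, e.g. 1000 // 100 = 10
    start = rng.randrange(k)          # random position among the first k (0-based)
    return frame[start::k][:sample_size]

# Units numbered 1 to 1000, as in the example above.
frame = list(range(1, 1001))
sample = systematic_sample(frame, 100, seed=3)
print(sample[:6])
```

Only one random number is needed, which is why this method is quicker to carry out than simple random sampling.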
Cluster – this method is used when it is difficult to create an exhaustive list of the sample units. If we
were seeking the opinions of churchgoers in Barbados, it may not be practical to try to create a list
of all members of all church organisations before starting to choose persons for the sample. Instead,
you can create clusters (denominations of the church) and then sample clusters using one of the
previous methods discussed. Either all of the sample units in each selected cluster are studied
(one-stage) or units are chosen out of each selected cluster to be sampled (two-stage).
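One-stage and two-stage cluster sampling can be sketched in Python as below. This is an illustration under assumptions: the denominations and member labels are invented, `cluster_sample` is a made-up helper, and the clusters are chosen by simple random sampling.

```python
import random

def cluster_sample(clusters, n_clusters, units_per_cluster=None, seed=None):
    """Randomly choose whole clusters, then take all or some units from each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)    # stage 1: pick clusters
    sample = {}
    for name in chosen:
        units = clusters[name]
        if units_per_cluster is None:
            sample[name] = list(units)                   # one-stage: study every unit
        else:
            k = min(units_per_cluster, len(units))
            sample[name] = rng.sample(units, k)          # two-stage: pick within cluster
    return sample

# Hypothetical clusters: four invented denominations with invented member labels.
denominations = {
    "Denomination A": [f"A{i}" for i in range(1, 41)],
    "Denomination B": [f"B{i}" for i in range(1, 26)],
    "Denomination C": [f"C{i}" for i in range(1, 61)],
    "Denomination D": [f"D{i}" for i in range(1, 16)],
}

one_stage = cluster_sample(denominations, 2, seed=4)
two_stage = cluster_sample(denominations, 2, units_per_cluster=5, seed=4)
print({name: len(units) for name, units in one_stage.items()})
print({name: len(units) for name, units in two_stage.items()})
```

Note that only the list of clusters is needed at the start; members are listed only for the clusters that are selected.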
Quota – the population is broken down by characteristics and the sample is expected to contain each
characteristic in the same proportion as the population. E.g. you would need to know
what proportion of the population is male and female, and then for each of these subgroups, how many
persons are in the various age categories, ethnic groups, urban/rural, etc. A matrix is then created with
each of the groups in their proportions and you simply have to find persons who meet the criteria until
each subgroup’s quota is filled. E.g. a survey about life in Barbados may only be able to poll 318 people. If
the population is 48% male and 52% female, then you’re aiming to poll 153 males and 165 females.
Suppose also that 18.29% of people are aged 0 - 14, 13.35% are aged 15 - 24, 44.62% are aged 25 - 54,
12.87% are aged 55 - 64 and 10.88% are aged 65 and over. This would mean the matrix could look
like this:
              Males   Females   Total
0 – 14           28        30      58
15 – 24          20        22      42
25 – 54          68        74     142
55 – 64          20        21      41
65 and over      17        18      35
Total           153       165     318

Once the required number of persons in each gender and age category has been polled, the criteria for
this survey are satisfied.
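The quota matrix above can be checked with a short Python sketch. It assumes that the 48%/52% gender split applies equally within every age group, which is how the figures in the matrix appear to have been obtained. The function name `quota_matrix` is made up for this illustration.

```python
def quota_matrix(total, gender_share, age_share):
    """Return the quota for each age group and gender, rounded to whole persons."""
    return {
        age: {gender: round(total * g * a) for gender, g in gender_share.items()}
        for age, a in age_share.items()
    }

# Proportions taken from the example above.
gender_share = {"Males": 0.48, "Females": 0.52}
age_share = {
    "0-14": 0.1829,
    "15-24": 0.1335,
    "25-54": 0.4462,
    "55-64": 0.1287,
    "65 and over": 0.1088,
}

matrix = quota_matrix(318, gender_share, age_share)
for age, row in matrix.items():
    print(age, row["Males"], row["Females"], sum(row.values()))
```

Because each cell is rounded separately, the cells will not always add up to the intended total. They do here, but with other figures a quota may need adjusting by one person.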

Read Pg 429 (Non-random Sampling)


Do Pg 430 Ex 9a Q2, 4, 7

OBTAINING A SIMPLE RANDOM SAMPLE


There are two ways to randomly select the sample units that you will use in any given sample:
Lottery Technique – This method is conducted somewhat like a lottery draw, thus the name. All of
the numbers assigned to the sample units are placed on pieces of card, chips or balls and placed into a
bag or bowl. The numbers are mixed thoroughly and drawn without looking to determine which
sample units will be used in the sample. This is repeated until the number of sample units to be used
(previously determined) has been selected.
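The lottery technique can be imitated in Python as a rough sketch. The function name `lottery_draw` is made up for this illustration. Shuffling the list stands in for mixing the bag, and each number drawn is not put back.

```python
import random

def lottery_draw(unit_numbers, sample_size, seed=None):
    """Mix the numbers, then draw them one at a time without replacement."""
    rng = random.Random(seed)
    bag = list(unit_numbers)
    rng.shuffle(bag)                  # mix the numbers thoroughly
    drawn = []
    while len(drawn) < sample_size:
        drawn.append(bag.pop())       # draw one number; it does not go back in the bag
    return drawn

# A sample frame of 40 numbered units, from which 8 are drawn.
picked = lottery_draw(range(1, 41), 8, seed=6)
print(picked)
```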
Random Numbers – (i) Random Number Tables – there are compilations of random numbers given
in tables. These tables can be used to pick ‘appropriate’ numbers for your sample frame to choose
which sample units will be included in the sample. One is shown below and one is in your textbook on
page 653.

(ii) Electronic Random Number Generators – calculators and computer software (like Excel
and Numbers) have random number functions (Ran# and RanInt on the calculator, or RAND
and RANDBETWEEN on the computer). The Ran# and RAND functions generate a decimal number
between 0 and 1. This number can then be multiplied by the size of your sample frame and
rounded up to a whole number to generate a unit number of the correct size. The RanInt and
RANDBETWEEN functions allow you to insert the two endpoints that you’re interested in.
E.g. RANDBETWEEN(100, 500) will randomly generate whole numbers from 100 to 500
inclusive. This is another unbiased way of picking sample units to include in your survey.
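The same ideas can be sketched in Python, whose standard `random` module plays the role of the calculator or spreadsheet here. `random()` gives a decimal from 0 up to (but not including) 1, and `randint(a, b)` gives a whole number from a to b inclusive. The helper names `unit_from_fraction` and `simple_random_sample` are made up for this illustration.

```python
import random

def unit_from_fraction(r, frame_size):
    """Turn a random decimal 0 <= r < 1 into a unit number from 1 to frame_size."""
    return int(r * frame_size) + 1

def simple_random_sample(frame_size, sample_size, seed=None):
    """Keep generating unit numbers until enough different units are chosen."""
    rng = random.Random(seed)
    chosen = []
    while len(chosen) < sample_size:
        number = unit_from_fraction(rng.random(), frame_size)
        if number not in chosen:      # ignore a unit number that has come up before
            chosen.append(number)
    return chosen

# Choose 10 different units from a sample frame numbered 1 to 50.
units = simple_random_sample(50, 10, seed=7)
print(units)

# Choosing between two endpoints directly: a whole number from 100 to 500 inclusive.
rng = random.Random(8)
value = rng.randint(100, 500)
print(value)
```

Repeated numbers are skipped so that no sample unit is counted twice.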

ADVANTAGES & DISADVANTAGES OF SAMPLING METHODS
Simple Random
Advantages:
- Avoids the bias of the researcher
- Should represent the target population
- Can be used with populations of any size
Disadvantages:
- Requires large amounts of resources – time, effort, money, access to information
- Can lead to poor representation of the target population if the random numbers omit large areas
of the population

Stratified Random
Advantages:
- Guarantees a high degree of representativeness of all segments of the target population
- Because of the high degree of representativeness we can have confidence that the results can be
generalized for the entire population
Disadvantages:
- Very time-consuming and tedious to set up

Systematic Random
Advantages:
- Ensures a fairly representative sample is obtained without the need to generate many random
numbers
Disadvantages:
- It is not as random as simple random sampling
- Requires a lot of time, effort and money
- Some areas of the target population may be over or underrepresented, especially if there is some
kind of pattern occurring in the population

Cluster
Advantages:
- Makes it possible to create a sample when there is no means of obtaining an itemised list of
sample units
- Easy, cheap and convenient to start
Disadvantages:
- Members of selected clusters may be very alike but may be very different from unselected cluster
groups
- It may not be easy to generalise the findings to the entire population

Quota
Advantages:
- Creates a truly representative sample of the target population
- It is easier and faster to carry out since it does not require a sample frame or specific sample
units
Disadvantages:
- Because the sample units are not specific, the bias of the researcher may introduce a bias
- In this regard, it is then difficult to know how much of the results can be generalized

Do Pg 430 Ex 9a Qs 3, 5, 6, 8

PAST PAPER QUESTIONS
QUESTION 1

[2009 Paper 2 Q1]

QUESTION 2

[2009 Paper 2 Q1]


QUESTION 3

[2009 Paper 2 Q2]

QUESTION 4

[2009 Paper 3 Q1]


QUESTION 5

[2010 Paper 2 Q1]



QUESTION 6

[2010 Paper 2 Q1]

QUESTION 7

[2010 Paper 3 Q1]


QUESTION 8

[2011 Paper 2 Q1]


QUESTION 9

[2011 Paper 2 Q1]

