Module 4. Data Collection and Sampling Week 3
DATA COLLECTION
and SAMPLING
Who???
SELECTION OF THE PARTICIPANTS OF THE STUDY
How many???
COMPUTATION OF THE SAMPLE SIZE
How???
METHODS OF DATA COLLECTION
Methods of Data Collection
Data can be collected
➢Directly (PRIMARY DATA) using
•Questionnaires
•Interviews
•Experiments
•Direct observations
➢Indirectly (SECONDARY DATA) through
•Existing documents/records
Methods of Communication
Factor              | Self-Administered Questionnaire | Telephone Interview | On-line Survey  | Personal Interview
Cost                | Inexpensive                     | Quite Expensive     | Quite Expensive | Very Expensive
Speed               | Time-consuming                  | Fast                | Moderately Fast | Time-consuming
Response Rate       | Poor                            | Average             | Average         | Very Good
Interviewer Bias    | None                            | Likely              | None            | Highly Likely
Quality of Response | May be vague                    | Good                | May be vague    | Very good
Type of Information | Limited                         | Limited             | Limited         | Wide-range
Census Vs. Sample Survey
Census Sample Survey
Sample Size Determination
SAMPLE SIZE
INFINITE POPULATION:
$$n_0 = \frac{Z_{\alpha/2}^{2}\, p q}{d^{2}}$$
FINITE POPULATION:
$$n = \frac{n_0}{1 + \dfrac{n_0 - 1}{N}}$$
TERMS:
Total Population ($N$) – The total number of units in the population.
Margin of error ($d$) – A statistic expressing the amount of random sampling error in a survey's results.
Level of Significance ($\alpha$) – The probability of committing a Type I error.
Sample proportion ($p$) – The proportion you expect the results to show. This can often be
determined from the results of a previous survey or by running a small pilot study. If you are
unsure, use 50%, which is conservative and gives the largest sample size.
$q = 1 - p$.
Example. You are investigating the level of awareness of
CAS students in CSU of the accessibility law, BP 344.
Three (3) programs were used as the target populations,
namely: BS Math ($N_1 = 200$), BS Bio ($N_2 = 500$), and
BS SW ($N_3 = 800$). Since no data are available on the
proportion of CAS students who are knowledgeable, you take
the worst-case scenario and set $p = 0.5$ (and therefore
$q = 1 - 0.5 = 0.5$). As this is a preliminary study, you are
prepared to accept a margin of error of ±5%, so you set
$d = 0.05$. How many students per program should you get
for your sample?
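Given: $N = 1500$, $p = 0.5$, $q = 0.5$, $d = 0.05$, $\alpha = 0.05$.
So, $Z_{\alpha/2} = Z_{0.025} = 1.96$.
Now,
$$n_0 = \frac{Z_{\alpha/2}^{2}\, p q}{d^{2}} = \frac{(1.96)^{2}(0.5)(0.5)}{(0.05)^{2}} = 384.16 \approx \mathbf{385}.$$
So,
$$n = \frac{n_0}{1 + \dfrac{n_0 - 1}{N}} = \frac{385}{1 + \dfrac{385 - 1}{1500}} = 306.53 \approx \mathbf{307}.$$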
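The computation above can be checked with a short Python sketch. The function names are illustrative only (they are not from the slides), and the code assumes the same convention as the worked example: $n_0$ is rounded up to a whole number before the finite population correction is applied.

```python
import math
from statistics import NormalDist


def infinite_population_size(p, d, alpha=0.05):
    """n0 = Z_{alpha/2}^2 * p * q / d^2, rounded up to a whole unit."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    q = 1 - p
    return math.ceil(z ** 2 * p * q / d ** 2)


def finite_population_size(n0, N):
    """n = n0 / (1 + (n0 - 1) / N), rounded up to a whole unit."""
    return math.ceil(n0 / (1 + (n0 - 1) / N))


n0 = infinite_population_size(p=0.5, d=0.05, alpha=0.05)
n = finite_population_size(n0, N=1500)
print(n0, n)  # 385 307
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target $d$.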
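Stratum | Population ($N_i$) | Proportion ($P_i = N_i/N$) | Sample Needed ($n_i = n \times P_i$)
BS Math | 200                | 0.133                      | 307 × 0.133 ≈ 41
BS Bio  | 500                | 0.333                      | 307 × 0.333 ≈ 103
BS SW   | 800                | 0.533                      | 307 × 0.533 ≈ 164
Total   | 1500               | 1.000                      | 308
Each stratum's sample is rounded up to a whole student, so the total (308) is one more than $n = 307$.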
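The proportional allocation in the table can be reproduced with a minimal Python sketch. The function name `proportional_allocation` is hypothetical, and the code assumes each stratum's share is rounded up, as in the table.

```python
import math


def proportional_allocation(n, strata):
    """Split a total sample size n across strata in proportion to stratum size.

    strata maps a stratum name to its population N_i. Each share n * N_i / N
    is rounded up, so the allocations can sum to slightly more than n.
    """
    N = sum(strata.values())
    return {name: math.ceil(n * N_i / N) for name, N_i in strata.items()}


allocation = proportional_allocation(307, {"BS Math": 200, "BS Bio": 500, "BS SW": 800})
print(allocation)                # {'BS Math': 41, 'BS Bio': 103, 'BS SW': 164}
print(sum(allocation.values()))  # 308
```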
Sampling
Sampling is the process of selecting observations (a
sample) that adequately describe the population and
support inferences about it.
Sample
❑ A set of units selected from the population
❑ Represents the whole population
❑ Used to draw inferences about the population
[Figure: the sampling process. Population (what you want to talk about) → Sampling Frame → Sample (what you actually observe in the data); Inference runs from the sample back to the population.]
Why Do Sampling?
There are several reasons for researchers to do sampling
rather than conducting a census. Four important reasons are
as follows:
✓ Sampling costs less
✓ Sampling is less time-consuming
✓ Sampling allows a wider scope
✓ Accuracy of the data is high
Nonsampling Error
Error in the Implementation of the Sampling Design
➢ Selection Error
➢ Frame Error
➢ Population Specification Error
Measurement Error
➢ Instrument Error
➢ Interviewer Bias
➢ Response Error
- Response Bias
- Nonresponse Bias
➢ Processing Error
➢ Surrogate Information Error
Probability Sampling
SIMPLE RANDOM SAMPLING
All units of the frame are given an equal probability of selection.
❑ Random number generators
❑ Lottery
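A random number generator can draw a simple random sample directly from the frame. The sketch below uses Python's standard `random` module; the function name and the numbered frame are illustrative assumptions, not part of the slides.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame, each with equal probability."""
    rng = random.Random(seed)     # a seed makes the draw reproducible
    return rng.sample(frame, n)   # sampling without replacement


# Hypothetical frame: students numbered 1 to 200.
frame = list(range(1, 201))
sample = simple_random_sample(frame, 41, seed=2020)
print(len(sample))  # 41
```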
STRATIFIED RANDOM SAMPLING
❑ Population is divided into two or more
homogeneous groups called strata
❑ Samples are randomly selected from each stratum
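As a rough sketch, stratified random sampling is a simple random sample drawn separately inside each stratum. The function name, the frame labels, and the per-stratum sizes below are assumptions chosen for illustration.

```python
import random


def stratified_sample(strata, sizes, seed=None):
    """Draw a simple random sample of sizes[name] units from each stratum.

    strata maps a stratum name to the list of its units.
    """
    rng = random.Random(seed)
    return {name: rng.sample(units, sizes[name]) for name, units in strata.items()}


# Hypothetical frames for three strata.
strata = {
    "BS Math": [f"M{i}" for i in range(200)],
    "BS Bio": [f"B{i}" for i in range(500)],
    "BS SW": [f"S{i}" for i in range(800)],
}
sizes = {"BS Math": 41, "BS Bio": 103, "BS SW": 164}
sample = stratified_sample(strata, sizes, seed=1)
print({name: len(units) for name, units in sample.items()})
```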
CLUSTER SAMPLING
❑ The population is divided into natural groups (clusters).
❑ Randomly pick some clusters from all the clusters.
❑ Completely enumerate all units in the chosen clusters.
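The three steps above can be sketched in Python as follows. The function name and the cluster labels are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick n_clusters clusters and keep every unit inside them.

    clusters maps a cluster name to the list of its units.
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # pick clusters, not units
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical population: 6 natural groups of 5 households each.
clusters = {f"Cluster {c}": [f"C{c}-H{h}" for h in range(5)] for c in range(1, 7)}
sample = cluster_sample(clusters, 2, seed=7)
print(sorted(sample))  # names of the two chosen clusters
```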
SYSTEMATIC RANDOM SAMPLING
❑ Order all units in the sampling frame
❑ Then every kth unit on the list is selected
❑ k = Sampling Interval
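A minimal Python sketch of systematic sampling follows. It assumes the sampling interval is computed as k = (frame size) ÷ (sample size), rounded down, and that the first unit is chosen at random among the first k units; neither detail is stated on the slide. The function name is illustrative.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select every kth unit of an ordered frame after a random start."""
    k = len(frame) // n          # sampling interval (assumes n <= len(frame))
    rng = random.Random(seed)
    start = rng.randrange(k)     # random start among the first k units
    return [frame[start + i * k] for i in range(n)]


# Hypothetical ordered frame of 1500 students, sample of 300, so k = 5.
frame = list(range(1, 1501))
sample = systematic_sample(frame, 300, seed=3)
print(len(sample), sample[1] - sample[0])  # 300 5
```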
MULTISTAGE SAMPLING
❑ Carried out in stages
❑ Sampling units are selected at each stage
[Figure: Primary Clusters → Secondary Clusters → Simple Random Sampling within Secondary Clusters]
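One possible reading of the multistage design, sketched in Python: pick primary clusters at random, pick secondary clusters at random inside each chosen primary cluster, then take a simple random sample of units inside each chosen secondary cluster. The function name, the nested-dictionary layout, and the sizes are assumptions for illustration.

```python
import random


def multistage_sample(population, n_primary, n_secondary, n_units, seed=None):
    """Three-stage sample: primary clusters, secondary clusters, then units.

    population maps primary cluster -> {secondary cluster -> list of units}.
    """
    rng = random.Random(seed)
    selected = []
    for primary in rng.sample(sorted(population), n_primary):
        secondary_clusters = population[primary]
        for secondary in rng.sample(sorted(secondary_clusters), n_secondary):
            # simple random sampling within the chosen secondary cluster
            selected.extend(rng.sample(secondary_clusters[secondary], n_units))
    return selected


# Hypothetical population: 4 primary clusters x 3 secondary clusters x 10 units.
population = {
    f"P{i}": {f"P{i}-S{j}": [f"P{i}-S{j}-U{u}" for u in range(10)] for j in range(3)}
    for i in range(4)
}
sample = multistage_sample(population, n_primary=2, n_secondary=2, n_units=3, seed=0)
print(len(sample))  # 12
```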
Nonprobability Sampling
CONVENIENCE SAMPLING
❑ Convenience sampling involves choosing respondents
at the convenience of the researcher.
❑ Very low cost
❑ Extensively used
❑ Results cannot be generalized to the population.
JUDGEMENTAL SAMPLING
❑ Researcher employs his or her own "expert" judgment
about whom to include in the sample.
❑ There is an assurance of quality responses
❑ Meets the specific objective.
❑ Biased selection of the sample may occur
❑ Time-consuming process.
QUOTA SAMPLING
❑ Nonprobability sampling version of stratified sampling.
❑ Strata exist, but individuals within each group are
selected nonrandomly
❑ Researcher just sets a quota for each group
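To contrast quota sampling with stratified random sampling, here is a small Python sketch in which respondents are accepted in the order they happen to be encountered until each group's quota is full. No random selection is involved. The function name and the respondent data are hypothetical.

```python
def quota_sample(respondents, quotas):
    """Accept respondents in arrival order until every group's quota is met.

    respondents is an iterable of (person, group) pairs; quotas maps
    group -> number of respondents wanted from that group.
    """
    remaining = dict(quotas)
    selected = []
    for person, group in respondents:
        if remaining.get(group, 0) > 0:
            selected.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled
    return selected


# Hypothetical stream of respondents as the researcher meets them.
respondents = [("Ana", "F"), ("Ben", "M"), ("Carl", "M"), ("Dan", "M"), ("Eva", "F")]
print(quota_sample(respondents, {"M": 2, "F": 1}))  # ['Ana', 'Ben', 'Carl']
```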
SNOWBALL SAMPLING
❑ The researcher starts with a key person, who introduces
the next respondent, and so on, forming a chain
❑ Low cost
❑ Useful in specific circumstances & for locating rare
populations
❑ Projecting the data beyond the sample is not justified
Exercises. What method of data collection is most appropriate for the following cases?
Give a brief explanation for your choice.
1. Studying two groups of patients to determine if exercise lowers blood
pressure.
2. The Department of Health monitors and evaluates the benefits of the family planning
methods given to Brgy. Ampayon.
3. A group of medical intern students studies the effects of laughter on patients at
Butuan Medical Center.
4. A nongovernment organization compares the household expenditures of two districts
in Butuan City.
5. A group of Anthropology students studies the culture and norms of two ethnic groups.
6. A social welfare organization gathers information on hospital patients with mental
disorder.
7. A car manufacturer studies consumers' car preferences for its next production run.
1. Formulation of the problem (Research topic with at least 5 problem statements; Research Designs)
2. Crafting of the Survey Instrument (Initial and Final Questionnaire; Pilot Testing; Validation and Reliability)
3. Actual data gathering (Sample Size; Sampling Techniques; Data Collection Methods)
4. Data encoding (Correct Database)
5. Data Analysis (Descriptives: Tabular, Graphical, Numerical Measures, etc.)
6. Data Analysis (Inferential: Comparison of Means, Correlational Analyses; Regression Analysis)
7. Final Write-up (Article-type)
Activity 3. Crafting a Sampling Design
Send output on or before February 21, 2020 (Friday) to bgagua@carsu.edu.ph