
Research Methods & Materials


Research methods

• Methods are the specific techniques, tools or procedures applied to achieve a given objective.

• The techniques researchers use in performing research.

• All the methods used by a researcher during the course of studying a research problem.

Main Components of research methods
 Study area and period
 Study design
 Study Population
 Sample size and sampling procedure
 Variables and Operational definitions
 Data collection Tools and methods
 Data processing and analysis
 Ethical issues in research

• Study area and period:

• Study area – Description of the background information, including population, geography and institution as relevant; use a map if possible.

• Study period – The data collection period.

• Choosing study population:

• Sampling:
• Taking a subgroup from a larger population and using it as the basis for making inferences about that population.
• Reasons for sampling:
• It is difficult to get information from everyone in the population,
• A representative sample of the population can be obtained,
• Feasibility, reduced cost, greater accuracy and greater speed,
• The population is dynamic (so data should be collected in a short time).

• However, sampling needs rigid control; sampling error and small numbers can make a study's findings suspect.
• The whole population is studied:
• For enumeration (census),
• When the population is small, and
• Where there are extensive resources.
• Otherwise, sampling is used.

• A representative sampling technique and an adequate sample size are both important for correctly estimating the population parameter from the sample statistic.

Important terms:
• Target/source/reference population: The population to which inference is made.
• Ideally the source and study population should overlap.
• Study population: The part of the source population that is given the chance to be included in the study.
• Sample: The part of the study population that is actually studied.

• Sampling unit: The unit of selection in the sampling process.
• E.g. household

• Study unit: The unit from which information is collected.
• E.g. study subjects (mothers, under-five children, etc.)
Sampling techniques:

• Probability sampling: Every unit in the population has a known, non-zero probability of being sampled, and the process involves random selection.

• Non-probability sampling: Any sampling method where some elements of the population have no chance of selection or where the probability of selection can't be accurately determined.
Classification of Sampling Techniques
• Nonprobability sampling techniques: convenience sampling, judgmental sampling, quota sampling, snowball sampling
• Probability sampling techniques: simple random sampling, systematic sampling, stratified sampling, cluster sampling, multistage sampling

The Sampling Design Process

i. Define the Target Population

ii. Determine the Sampling Frame

iii. Select a Sampling Technique

iv. Determine the Sample Size

v. Execute the Sampling Process

• Population: the material of the study, whether human subjects, animals or inanimate objects.

• Reference population (target population): the population of interest, to which the investigators would like to generalize the results of the study.
• Study population: the accessible population
  – The subset of the target population from which the sample is actually drawn and about which a conclusion can be made.

• Sampling frame: the list of all the units in the reference population, from which a sample is to be picked
• Sample population: the actual group in which the study is conducted or data is collected
• Sampling unit: the unit of selection of a sample
  – E.g. households, people, etc.
• Study unit: the unit on which information will be collected
  – E.g. individuals

• Probability sampling:

• Simple random sampling (SRS),

• Systematic random sampling,

• Stratified sampling,

• Cluster sampling,

• Multistage sampling.
• Simple random sampling

• Principle: Equal chance/probability of drawing each unit.

• Procedure: List all units (persons) in a population, assign a number to each unit, and randomly select units.

• E.g. Calculate the prevalence of anemia among 1000 pregnant women attending ANC (sample size 100).

• List all pregnant women attending ANC.

• Randomly select 100 numbers between 1 and 1000, using one of:
• Lottery method
• Table of random numbers
• Computer programs
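
The computer-program option can be sketched as follows. This is a minimal illustration, assuming the 1000 women on the ANC list are numbered 1 to 1000; the function name `simple_random_sample` and the fixed seed are illustrative choices, not part of the original procedure.

```python
import random

def simple_random_sample(population_size, sample_size, seed=None):
    """Draw `sample_size` distinct unit numbers from 1..population_size."""
    rng = random.Random(seed)  # a seed only makes the draw reproducible
    # random.sample draws without replacement, so each unit has an equal chance
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

# 100 of the 1000 listed women (seed chosen arbitrarily for illustration)
selected = simple_random_sample(1000, 100, seed=1)
print(len(selected), selected[:5])
```

The women whose list numbers appear in `selected` would then be the ones tested for anemia.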
• Systematic random sampling

• Principle: Select the sample at regular intervals based on the sampling interval.

• Procedure:
• Number the units in the population from 1 to N,
• Decide on the n (sample size) that you need,
• Calculate the sampling interval k (k = N/n, the inverse of the sampling fraction n/N),
• Randomly select an integer between 1 and k,
• Then take every kth unit throughout the sampling frame.
• E.g. 1: N = 12, n = 4. Calculate k, then select an integer between 1 and k and list the sampled units.

• E.g. 2: N = 200, n = 50. Calculate k and list the sampled units.
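
The two exercises above can be worked through with a short sketch. It assumes N is an exact multiple of n, and the starting numbers (2 and 3) are picked here only for illustration; in practice the start is drawn at random between 1 and k. The function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return the unit numbers chosen by systematic sampling from units 1..N."""
    k = N // n  # interval between selected units (assumes N is a multiple of n)
    if start is None:
        start = random.Random(seed).randint(1, k)  # random start between 1 and k
    return [start + i * k for i in range(n)]

# E.g. 1: N = 12, n = 4 gives k = 3; a start of 2 is assumed for illustration
print(systematic_sample(12, 4, start=2))        # [2, 5, 8, 11]
# E.g. 2: N = 200, n = 50 gives k = 4; a start of 3 is assumed for illustration
print(systematic_sample(200, 50, start=3)[:5])  # [3, 7, 11, 15, 19]
```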

• Stratified sampling

• Applied when the source population is heterogeneous on one or more variables of interest.
• The usual intention is to ensure that the sample is representative with respect to the heterogeneous variable(s).
• The population is first divided into classes (strata) based on the heterogeneous variable(s).
• Then separate samples are taken from each stratum using simple or systematic random sampling.
• The number taken from each stratum may be equal (non-proportional stratified sampling) or determined by the proportion of each class in the source population (proportional stratified sampling).
• Samples can be stratified across more than one variable.
• Select random samples from within homogeneous subgroups.
• The sampling frame is divided into groups (age, sex, socioeconomic status).
• Units in each group have the same probability of selection, but the probability differs between groups.

• List all units (persons) in a population.
• Divide the units into groups (called strata).
• Assign a number to each unit within each stratum.
• Select a random sample from each stratum (simple or systematic).
• Combine the strata samples to form the full sample.
• A stratified sampling approach is most effective when:

• Variability within strata is minimized and variability between strata is maximized.

• The stratifying variable is strongly correlated with the desired dependent variable.
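
Proportional stratified sampling can be sketched as below. The rural/urban population, its sizes and the function name `proportional_stratified_sample` are hypothetical, chosen only to show the allocation step followed by simple random sampling within each stratum.

```python
import random

def proportional_stratified_sample(strata, total_n, seed=None):
    """strata maps a stratum name to its list of units; returns a sample per stratum."""
    rng = random.Random(seed)
    population = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # allocate in proportion to the stratum's share of the population
        n_h = round(total_n * len(units) / population)
        sample[name] = rng.sample(units, n_h)  # simple random sample within the stratum
    return sample

# Hypothetical population: 600 rural and 400 urban residents, total sample of 100
strata = {
    "rural": [f"R{i}" for i in range(1, 601)],
    "urban": [f"U{i}" for i in range(1, 401)],
}
sample = proportional_stratified_sample(strata, 100, seed=1)
print({name: len(units) for name, units in sample.items()})  # {'rural': 60, 'urban': 40}
```

With stratum sizes that do not divide evenly, rounding can make the allocations add up to slightly more or less than the intended total, so the allocation may need a manual adjustment.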

Cluster sampling
Principle: Select all units within randomly selected geographic clusters.
• Divide population into geographic groups (clusters).
• Assign a number to each cluster.
• Randomly select clusters.
• Sample all units within selected clusters OR select a random sample of
units within selected clusters.
• Cluster sampling:
• Is a sampling method applied when the source population is composed of "natural" groups.
• Assuming the groups are similar to one another, cluster sampling selects a few groups (clusters) from the population as Primary Sampling Units (PSU).
• Then the required information is collected from all elements, the Secondary Sampling Units (SSU), within each selected group.
• E.g. Researchers usually use pre-existing units such as schools or cities as their clusters.
Multistage sampling
• Is like cluster sampling, but involves selecting a sample within
each chosen cluster, rather than including all units in the cluster.
• Also called multistage cluster sampling.

• Thus, multistage sampling involves selecting a sample in at least two stages.
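
A two-stage version of this idea can be sketched as follows. The school names, student labels and the function name `two_stage_sample` are hypothetical; the sketch only illustrates selecting clusters first and then units within the chosen clusters.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """clusters maps a cluster name to its list of units."""
    rng = random.Random(seed)
    # Stage 1: randomly select clusters (primary sampling units)
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Stage 2: randomly select units within each chosen cluster
    return {c: rng.sample(clusters[c], min(n_per_cluster, len(clusters[c])))
            for c in chosen}

# Hypothetical frame: 10 schools with 50 students each
schools = {f"school_{s}": [f"s{s}_student_{i}" for i in range(1, 51)]
           for s in range(1, 11)}
sample = two_stage_sample(schools, n_clusters=3, n_per_cluster=5, seed=1)
print({school: len(students) for school, students in sample.items()})
```

Setting `n_per_cluster` to the full cluster size reduces this to one-stage cluster sampling, where every unit in a selected cluster is included.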


Limitations of Sampling

• Demands more rigid control in undertaking the sampling operation.

• Sampling error
• Sampling bias
• Minority groups and small numbers in sub-groups often make study findings suspect.
• Sample results are good approximations at best.

• When we take a sample, our results will not exactly equal the correct results for the whole population.

• That is, our results will be subject to errors.

• This error has two components:

Sampling and Non-sampling errors
a) Sampling error (i.e., random error)
• Random error, the opposite of reliability (i.e., precision or repeatability), consists of random deviations from the true value, which can occur in any direction.
• It can be minimized by increasing the sample size.
• Reliability (or precision): the repeatability of a measure, i.e., the degree of closeness between repeated measurements of the same value.

b) Non-sampling error (i.e., bias)
• Bias, the opposite of validity, consists of systematic deviations from the true value, always in the same direction.

• It can be eliminated or reduced by careful design of the sampling procedure.
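
The claim that random error shrinks as the sample size grows can be illustrated with a small simulation. Everything here is hypothetical: the population is 10,000 simulated measurements, and the function name `spread_of_sample_means` and the sample sizes 25 and 400 are arbitrary choices.

```python
import random
import statistics

def spread_of_sample_means(population, n, repeats=200, seed=0):
    """Standard deviation of the sample mean across repeated simple random samples."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

# Hypothetical population of 10,000 measurements
rng = random.Random(42)
population = [rng.gauss(100, 15) for _ in range(10_000)]

small = spread_of_sample_means(population, n=25)
large = spread_of_sample_means(population, n=400)
print(round(small, 2), round(large, 2))  # the second value is the smaller one
```

A larger sample narrows the scatter of the estimates around their average, but it does nothing to remove bias: a systematically flawed sampling procedure stays off target however many units are drawn.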

Study Design

• A study design is the process that guides researchers on how to collect, analyse and interpret observations.
• Research design is a master plan specifying the methods and procedures for collecting and analyzing the needed information.
• It is a logical model that guides the investigator in the various stages of the research.

Study designs could be exploratory, descriptive or analytical
1. Exploratory studies
• An exploratory study is a small-scale study of relatively short duration, carried out when little is known about a situation or a problem.
• It may include description as well as comparison.
• E.g. A national AIDS Control Program wishes to establish counseling services for HIV positive and AIDS patients, but lacks information on the specific needs patients have for support.

• To explore these needs, a number of in-depth interviews are held with various categories of patients (males, females, married and single) and with some counselors working on a program that is already underway.

2. Descriptive studies
• Studies that describe the patterns of disease occurrence and other health-related conditions by person, place and time.
• Mainly concerned with distribution
• Useful for allocation of resources
• Important for hypothesis generation
• Less time-consuming and less expensive
• Most common type of epidemiological design strategy in the medical literature

Uses of Descriptive Studies
 They can be done fairly quickly and easily
 Allow planners and administrators to allocate resources
 Provide the first important clues about possible determinants of
a disease (useful for the formulation of hypotheses)

Types of descriptive studies

a) Case reports
b) Case series
c) Ecological studies
d) Cross-sectional studies
Ecological studies: data from entire populations are used to
compare disease frequencies between different groups during
the same period of time or in the same population at different
points in time

3. Analytic studies

• Studies used to test hypotheses concerning the relationship between a suspected risk factor and an outcome, and to measure the magnitude of the association and its statistical significance.

Characteristics of Analytic Studies

• Focus on the determinants (causes) of diseases.

• Used to test hypotheses
• The major distinguishing feature of analytic studies is the use of a comparison group

• Two broad categories
1. Observational (analytic cross-sectional studies, case-control studies, cohort studies)
2. Interventional study designs (experimental studies and quasi-experimental studies)

Observational studies
 No human intervention involved in assigning study groups

 Simply observe the relationship between exposure and disease.

Things considered when choosing a study design
 The objective/ the research question
 The time you have
 The money you have
 The expertise you have
 The requirements of an organization

Sample Size Determination
 In planning any investigation, we must decide how many people
need to be studied in order to answer the study objectives.
 If the study is too small, we may fail to detect important
effects, or may estimate effects too imprecisely.
 If the study is too large, then we will waste resources.

Sample size determination depends on the:

• Objective of the study
• Design of the study (descriptive or analytic)
• Degree of precision or accuracy: the allowed deviation from the true population parameter (can be within 1% or 5%, etc.)
• Plan for statistical analysis
• Degree of confidence required, usually specified as 95% (the level of confidence that the proportion in the whole population is indeed between (p-d) and (p+d))

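Determination of Sample Size for Estimating a Mean

• To estimate the population mean, $\mu$: $n = Z_{\alpha/2}^{2}\,\sigma^{2} / w^{2}$

Estimating a mean
• The same approach is used, but with $SE = \sigma/\sqrt{n}$.

• The required (minimum) sample size for a very large population is given by:
$n = Z_{\alpha/2}^{2}\,\sigma^{2} / w^{2}$, where $\sigma$ is the standard deviation and $w$ is the tolerated margin of error.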

E.g. A nurse wishes to estimate the mean serum cholesterol in a population of men. From previous similar studies a standard deviation of 40 mg/100ml was reported. If he is willing to tolerate a marginal error of up to 5 mg/100ml in his estimate, how many subjects should be included in his study? (α = 5%, two-sided)

a) If the population size is assumed to be very large, the required sample size would be:

$n = (1.96)^{2}\,(40)^{2} / (5)^{2} = 245.86 \approx 246$ persons
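
The arithmetic in this example can be checked with a short sketch. It assumes the large-population formula n = Z²σ²/w² with Z = 1.96 and rounds up to the next whole person; the function name `sample_size_mean` is illustrative.

```python
import math

def sample_size_mean(sigma, w, z=1.96):
    """Minimum n to estimate a mean within +/- w, for a very large population."""
    return math.ceil((z ** 2) * (sigma ** 2) / (w ** 2))

# Cholesterol example: standard deviation 40 mg/100ml, margin of error 5 mg/100ml
print(sample_size_mean(sigma=40, w=5))  # 246
```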

Determination of Sample Size for Estimating a Proportion

• The minimum sample size (n) required for a very large population (N>10,000) is:

$n = Z_{\alpha/2}^{2}\,p\,q / d^{2}$

• n = required sample size,
• Z = the standard normal value for the chosen confidence level (1.96 for 95%),
• p = proportion of the population having the characteristic,
• q = 1-p,
• d = the degree of precision.
• E.g. Suppose that you are interested in the proportion of infants who are breastfed beyond 18 months of age in a rural area. Suppose that in a similar area, the proportion (p) of breastfed infants was found to be 0.20. What sample size is required to estimate the true proportion within ±3% with 95% confidence? Let p = 0.20 and d = 0.03.
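
The breastfeeding example can be worked through with a short sketch. It assumes the standard large-population formula n = Z²pq/d² with Z = 1.96 for 95% confidence, rounded up to the next whole person; the function name `sample_size_proportion` is illustrative.

```python
import math

def sample_size_proportion(p, d, z=1.96):
    """Minimum n to estimate a proportion p within +/- d, for a very large population."""
    q = 1 - p
    return math.ceil((z ** 2) * p * q / (d ** 2))

# Breastfeeding example: p = 0.20, d = 0.03, 95% confidence
print(sample_size_proportion(p=0.20, d=0.03))  # 683
# When p is unknown, p = 0.50 gives the largest p*q and so the most cautious n
print(sample_size_proportion(p=0.50, d=0.05))  # 385
```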

Finite population correction factor
• The above formulas assume a very large population (N>10,000).
• When the sample represents a significant (e.g. over 5%) proportion of the population, a finite population correction factor can be applied.
• This will reduce the sample size required.
• For a finite population, use the following formula:

$n = n_0 / (1 + n_0/N)$

where n = the adjusted sample size, n0 = the original required sample size and N = population size.
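
The correction can be sketched as below, assuming the commonly used form n = n0 / (1 + n0/N); the source does not show which variant it intends. The inputs are hypothetical: an initial sample size of 683 and a population of 3,000.

```python
import math

def finite_population_correction(n0, N):
    """Adjust an initial sample size n0 for a finite population of size N."""
    return math.ceil(n0 / (1 + n0 / N))

# Hypothetical inputs: initial sample size 683, population of 3,000
print(finite_population_correction(n0=683, N=3000))  # 557
```

As the bullet above states, the adjusted figure is smaller than the initial one, and the reduction grows as the initial sample becomes a larger share of the population.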
Comparison of two proportions
$n \text{ (in each region)} = f(\alpha,\beta)\,(p_1 q_1 + p_2 q_2) / (p_1 - p_2)^{2}$

α = type I error (level of significance)

β = type II error (1-β = power of the study)
power = the probability of getting a significant result

f(α,β) = 10.5, when the power = 90% and the level of significance = 5%
f(α,β) = 9.0, when the power = 85% and the level of significance = 5%
f(α,β) = 7.84, when the power = 80% and the level of significance = 5%

E.g. The proportion of nurses leaving the health service is compared between two regions. In one region 30% of nurses are estimated to leave the service within 3 years of graduation. In the other region it is probably 15%.
The required sample to show, with a 90% likelihood (power), that the percentage of nurses leaving the health service is different in these two regions would be (assuming a confidence level of 95%):

$n_{each} = (1.28 + 1.96)^{2}\,\big((0.30 \times 0.70) + (0.15 \times 0.85)\big) / (0.30 - 0.15)^{2} \approx 158$

• If 10% is added for non-response and other contingencies,

$n_{each} = 158 + 16 = 174$

Therefore, 174 nurses are required in each region.
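
The nurse-attrition example can be reproduced with a short sketch. It assumes f(α,β) = (Z_α + Z_β)² with Z_α = 1.96 and Z_β = 1.28, as used in the worked calculation, and rounds up at each step; the function name `sample_size_two_proportions` is illustrative.

```python
import math

def sample_size_two_proportions(p1, p2, z_alpha=1.96, z_beta=1.28):
    """Sample size per group to detect a difference between two proportions."""
    f = (z_alpha + z_beta) ** 2  # about 10.5 for 5% significance and 90% power
    q1, q2 = 1 - p1, 1 - p2
    return math.ceil(f * (p1 * q1 + p2 * q2) / (p1 - p2) ** 2)

n_each = sample_size_two_proportions(0.30, 0.15)
n_with_allowance = n_each + math.ceil(0.10 * n_each)  # add 10% for non-response
print(n_each, n_with_allowance)  # 158 174
```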

• Take the values of p and the standard deviation from other published research findings.
• If there is more than one p value, take the p value that gives the maximum value of p multiplied by q.
• Other option: do a pilot study.
• If the value of p is unknown, take 50%.
• Allow for non-response (5-10%) in your sample size.

• A variable is a concept which can take on different quantitative values.
• A variable is a quantity which can vary from one individual to another.
• A variable is a property that takes on different values.
• For example: height, weight, income, age, etc.
• The main focus of a scientific study is to analyze the functional relationship between the variables.

Types of study variables
1. Dependent variable: a.k.a. outcome variable
• If one variable depends on or is a consequence of another, it is termed a dependent variable.
2. Independent variable: a.k.a. predictor, factor, determinant, explanatory variable
• The variable that is antecedent to the dependent variable is termed an independent variable.

Operational Definition of Variables
 Operationalizing variables by choosing appropriate indicators is
• Operationalizing variables means that you make them measurable.
• E.g. In a study on VCT acceptance, you want to determine the level of knowledge concerning HIV in order to find out to what extent the factor 'poor knowledge' influences willingness to be tested for HIV. The variable 'level of knowledge' cannot be measured as such. You would need to develop a series of questions to assess a person's knowledge, for example on modes of transmission of HIV and its prevention methods.

• If 10 questions were asked, you might decide that the knowledge of those with:
  – 0 to 3 correct answers is poor,
  – 4 to 6 correct answers is reasonable, and
  – 7 to 10 correct answers is good.
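
The cut-offs above amount to a simple scoring rule, which can be sketched as follows. The function name `knowledge_level` is illustrative, and the sketch assumes each respondent's number of correct answers out of 10 has already been counted.

```python
def knowledge_level(correct_answers):
    """Classify a score out of 10 using the cut-offs 0-3, 4-6 and 7-10."""
    if not 0 <= correct_answers <= 10:
        raise ValueError("score must be between 0 and 10")
    if correct_answers <= 3:
        return "poor"
    if correct_answers <= 6:
        return "reasonable"
    return "good"

print([knowledge_level(s) for s in (2, 5, 9)])  # ['poor', 'reasonable', 'good']
```

Writing the rule down this explicitly before data collection is what makes the variable 'level of knowledge' measurable in the same way for every respondent.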

Data collection and

Plan for Data Collection
Why should you develop a plan for data collection?
A plan for data collection should be developed so that:
 You will have a clear overview of what tasks have to be
carried out, who should perform them, and the duration of
these tasks;
 You can organise both human and material resources
for data collection in the most efficient way; and
 You can minimise errors and delays which may result from
lack of planning (for example, the population not being
available or data forms being misplaced).

Data collection Methods
Methods of Collecting Quantitative Data
 The most commonly used methods of collecting
information (quantitative data) are:
• The use of documentary sources
• Interview administered questionnaire
• Self-administered questionnaire

A. The use of documentary sources
 Clinical records and other personal records, death
certificates, published mortality statistics, census
publications, etc.

 Documents can provide ready-made information relatively easily
 The best means of studying past events.

• Problems of reliability and validity (because the information is collected by a number of different persons who may have used different definitions or methods of obtaining data).
• There is a possibility that errors may occur when the information is extracted from the records. (This may be an important source of unreliability if handwriting is difficult to read.)
• Since the records are maintained not for research purposes, but for clinical, administrative or other ends, the information required may not be recorded at all, or only partly recorded.

B. Self-administered Questionnaire
• The respondent reads the questions and fills in the answers by himself/herself.
• Simpler and cheaper: questionnaires can be administered to many persons simultaneously.
• They can be sent by post.
• Demands a certain level of education on the part of the respondent.

C. Interview Questionnaire
• An interview may be highly structured or relatively unstructured.
• Stimulate and maintain the respondent's interest
• Allay anxiety if it is aroused (e.g., why am I being asked these questions?)
• Repeat unclear questions
• Ask "follow-up" or "probing" questions to clarify a response
• Make observations during the interview

• Expensive and time-consuming
• Leading/guiding questions
• Difficult to address sensitive issues
• Social desirability bias: occurs because subjects are systematically more likely to provide a socially acceptable response.
In general, apart from their expense, interviews are preferable to self-administered questionnaires, provided that they are conducted by skilled interviewers.

The choice of methods of data collection is based on:
• The accuracy of the information they yield

• Practical considerations, such as the need for personnel, time, equipment and other facilities, in relation to what is available.

Tools for data collection
 The construction of a research instrument or tool for data
collection is the most important aspect of a research project
 The famous saying about computers- “garbage in garbage out”-
is also applicable for data collection.
 The research tool provides the input into a
study and therefore the quality and validity of the output (the
findings), are solely dependent on it.

Guidelines to Construct a data collection tool
• Step I: Clearly define and individually list all the specific objectives or research questions for your study.
• Step II: For each objective or research question, list all the associated questions that you want to answer through your study.
• Step III: Take each research question listed in Step II and list the information required to answer it.
• Step IV: Formulate question(s) to obtain this information.

Questionnaire design
• A questionnaire consists of a set of questions presented to a respondent for answers.
• The respondents read the questions, interpret what is expected and then write down the answers themselves.
• It is called an interview schedule when the researcher asks the questions (and, if necessary, explains them) and records the respondent's reply on the interview schedule.

• Questionnaires are a very convenient way of collecting useful comparable data from a large number of respondents.
• However, they can only produce valid and meaningful results if the questions are clear and precise and if they are asked consistently across all respondents.
• A questionnaire should be developed and tested carefully before being used on a large scale.
• Therefore, careful consideration needs to be given to the design of the questionnaire.

Types of questions
1. Closed ended questions: include all possible
answers/prewritten response categories, and respondents
are asked to choose among them.
 -e.g. multiple choice questions, scale questions
2. Open ended questions: allow respondents to answer
in their own words.
 Questionnaire does not contain boxes to tick but instead leaves a
blank section for the respondents to write in an answer.
3. Combination of both: begins with a series of closed-ended questions, with boxes to tick or scales to rank, and then finishes with a section of open-ended questions or more detailed responses.

How to construct questionnaires?

 The type and content of a questionnaire depends much on

your research question and research objectives (be
clear about your dependent and independent variables)
 All questionnaires require a title (short description)
 It needs to be appealing and inviting
 It needs a confidential unique identifier

In questionnaire design remember to:
 Use familiar and appropriate language
 Avoid abbreviations, double negatives, etc.
 Avoid two elements to be collected through one question
 Pre-code the responses to facilitate data processing
 Avoid embarrassing and painful questions
 Watch out for ambiguous wording
 Avoid language that suggests a response
 Start with simpler questions
 Ask the same question to all respondents
• Provide "other" or "don't know" options where appropriate

• Provide the unit of measurement for continuous variables (years, months, kg, etc.)
• For open-ended questions, provide sufficient space for the response
• Arrange questions in a logical sequence
• Group questions by topic, and place a few sentences of transition between topics
• Provide complete training for interviewers
• Pre-test the questionnaire on 5% of respondents in the actual field
• Check all filled questionnaires at field level
• Include "thank you" after the last question

A) Possible Sources of Bias during data collection:
1. Defective instruments
2. Observer bias
3. Effect of the interview on the informant
4. Information bias

Data Quality Assurance
 Assuring data quality is important to get valid
research findings
 It can be assured through:
 Providing training for data collectors
 Supervision
 Pre-testing and pilot study
 Assigning appropriate and skilled personnel

Pre-test and Pilot study
• Before data collection can start, it is necessary to test the methods and to make various practical preparations.
• Pre-tests or pilot studies allow us to identify potential problems in the proposed study.
• They are one way of assuring data quality.
• A pre-test usually refers to a small-scale trial of a particular research component.

• A pilot study is the process of carrying out a preliminary study, going through the entire research procedure with a small sample.

Plan for data processing and analysis
 Data processing and analysis should start in the field
 Checking for completeness of the data
 Performing quality control checks
 Sorting the data by instrument used and by group of
 Data of small samples may even be processed and
analyzed as soon as it is collected.

• The plan for data processing and analysis must be made after careful consideration of the objectives of the study as well as of the tools developed to meet the objectives.

• Preparing a plan for data processing and analysis will give you better insight into the feasibility of the analysis to be performed as well as the resources that are required.

What Should the Plan Include?
When making a plan for data processing and analysis the
following issues should be considered:
1. Sorting data,
2. Performing quality-control checks,
3. Data processing, and
4. Data analysis

Ethical issues in research

Ethical Considerations
Why do we need ethical approval?
• Before you embark on research with human subjects, you are likely to require ethical approval.

• Ethical decisions are based on three main approaches: duty-based, rights-based and goal-based.

A. Goal-based approach: assumes that we should try to produce the greatest possible balance of value over disvalue.
• Discomfort to one individual may be justified by the consequences for society as a whole.
B. Duty-based approach: your duty as a researcher is founded on your own moral principles.
• As a researcher, you will have a duty to yourself and to the individual who is participating in the research.
• The researcher should not lie to or deceive subjects in order to get a good research outcome.
• Doing so is unethical.

C. Rights-based approach: the rights of the individual are assumed to be all-important.

• Thus a subject's right to refuse must be upheld whatever the consequences for the research.

• Research studies should be judged ethically on
three sets of criteria:
1. Ethical principles
2. Ethical rules
3. Scientific criteria.

Dissemination and Utilization of Results

– Feedback to the community

– Feedback to local authorities
– Identify relevant agencies that need to be informed
 Scientific publication
 Presentation in meetings/conferences
 Briefly describe how the study results can be best translated into

Thank you!
