Data Collection Methods
DATA TYPES & THEIR COLLECTION METHODS
COLLECTION OF DATA: PRIMARY, SECONDARY
Primary data are those which are collected for the first time and are original in
character.
Primary data means raw data (data that have not been processed or tailored) which
have just been collected from the source and have not undergone any kind of statistical
treatment such as sorting and tabulation. The term primary data may sometimes be used
to refer to first-hand information.
Data which have already been collected by someone, and which may have been sorted,
tabulated and subjected to statistical treatment, are processed or tailored data. Such
data are known as Secondary Data.
Primary data: Collection Methods
Observation Method
Direct personal interview
Telephone interview
Mail questionnaire
Questionnaire filled in by enumerators (Schedules)
Observation Method
Observation becomes a scientific tool and a method of data collection when
it serves a formulated research purpose,
is systematically planned and recorded, and
is subjected to checks and controls on validity and reliability.
In this method, the investigator gets first-hand data for the study area.
It includes recognizing and noting facts or occurrences, often involving some sort of measurement.
Disadvantages
A time-consuming and expensive method.
Only a limited amount of information may be available.
Unforeseen factors may interfere with the observation task.
Types of Observation Method
Any combination
Direct Personal Interview
• Accurate screening
• Keep focus
• Capture emotions and behaviors
• More information and in greater depth can be obtained compared to other
methods
• Greater flexibility – provides interviewer with opportunity to restructure
questions
• Personal information can be obtained
• Possibility of spontaneous responses and thus more honest responses
Disadvantages
Structured interviews
Use a set of predetermined questions and highly standardized techniques of
recording.
The interviewer in a structured interview follows a rigid procedure, asking questions in
a prescribed form and order.
Unstructured interviews
Allow flexibility of approach to questioning.
Do not follow a system of predetermined questions and standardized techniques of
recording information.
Telephone Interview
Advantages
Can lead to relatively high response rates in specific markets
Interviews can be completed fairly quickly
Can be used to reach samples over a wide geographic area
Cost effective
More control in targeting specific types of samples vs. other methods (e.g.
face-to-face surveys in public)
Provided that the questions are properly formulated and the interview is
professionally administered, the quality of data generated can be high vs.
other methods (e.g. surveys delivered over mobile devices, etc.).
Disadvantages
Typically, questions cannot be of a complex nature.
Unlike a face-to-face interview or focus group, interviewers – no matter
how experienced and skilled – cannot see body language.
When the target audience is available through an online panel, telephone
interviewing often appears as a much more expensive alternative.
Questionnaires
Low cost in terms of time, labour and money, even when the universe is large and
widely spread
Respondents who are not easily approachable can also be reached conveniently
Large samples can be used and the field of inquiry can be wide
Questionnaires: Disadvantages
Closed Ended
Open Ended
Sequence of Questions
Demographics
Subject matter related
difficult questions
Schedules sent through enumerators
Advantages
Information can be obtained even from uneducated persons.
This information is more reliable and correct.
It covers a wide area.
It is unaffected by the personal bias of the investigators.
There are fewer chances of non-response as enumerators visit
personally.
Disadvantages
It is costly because enumerators have to be paid.
It is time consuming as every informant is visited.
It requires trained enumerators, who are not easily available.
The personal bias of enumerator may lead to wrong conclusions.
It can only be used by big organizations.
Difference between a Questionnaire and a Schedule
Point of Difference    Questionnaire    Schedule
Prepare a questionnaire based on your understanding of assignment 1 (case study of a sustainable neighborhood)
and assignment 2 (critical analysis of a master plan).
SECONDARY DATA
Secondary data research should be conducted
early in the problem investigation stage, and
prior to any organized collection of information from primary sources.
Rationale for Secondary data
1. Internal sources: internal records, feedback, internal databases
2. External sources: government, academic literature, practitioner sources, general media,
standardized sources, professional agencies
Evaluate: who, what, why, how, when
Identify information required
Consistency with primary sources
Evaluating Secondary Data
[Flowchart: a series of yes/no checks, each leading towards either "Use the data" or "STOP".
Checks that can be recovered from the figure:
"... population of interest?"
"Does the data cover the time period of interest?"
"Are the definitions, data collection methods and measurement systems known and acceptable?"
"Can the data be verified?"
"Is the risk of bias high?"
"Can data be revised?"
"... exceed the cost of its acquisition?"]
Benefits
Easy accessibility
Relatively inexpensive
Quick sourcing
Sometimes more accurate than primary data
Some types of information are available only through secondary sources
Enhances quality of primary data
Increased familiarity with problem
Better understanding of concepts, data, terminology
Issues with secondary data
Need to evaluate the quality of both the source of the data and the data
itself.
The main problems may be categorized as follows:
Definition and reference sets
Measurement Errors
Source bias
Reliability
Time scale
Uses of Secondary Data
Identify problem
Better define problem
Develop an approach to the problem
Formulate appropriate research design
Answer certain research questions and test some hypotheses
Interpret primary data more meaningfully
Monitor the environment
Other Methods of Data Collection
1. Warranty Cards
2. Distributor or Store Audits
3. Pantry Audits
4. Consumer Panels
5. Mechanical Devices
6. Depth Interviews
7. Content Analysis
8. Projective Tests
Measurement
1. Description
Unique labels (descriptors) that are used to designate each value on the scale
2. Order
Denoted by: greater than, less than, equal to
3. Distance
4. Origin
The scale has a true zero point: a unique and fixed beginning
where the characteristic measured has zero value
Measurement scales (or types of data): ways to categorize different types of variables
Nominal scales
A nominal scale is a system of classification and does not place the entity along a continuum.
Nominal scales provide a convenient way of keeping track of people, objects and
events.
Ordinal scales
Involve the ranking of items along the continuum of the characteristic being scaled.
There is no information about the interval between any two items on the scale.
In this scale, we do not know how much better one product is than the others, only that it is
better.
All of the information a nominal scale would have given is available from an ordinal scale.
Ordinal scales only permit the ranking of items from highest to lowest.
In addition, positional statistics such as the median, quartiles and percentiles can
be determined.
Interval scales
It is possible to interpret not only the order of scale scores but also the distance
between them.
It is possible to add a constant to, or subtract a constant from, all scale values without
affecting the form of the scale, but one cannot multiply or divide the values.
Most of the common statistical methods of analysis can be used on interval scales,
such as the arithmetic mean and standard deviation.
Ratio scales
The nominal scale is the least precise type of scale and the ratio scale is the most precise
type of scale.
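As a rough illustration of how the scale type limits the statistics that can be used, the sketch below applies the mode to nominal data, the median to ordinal data, and the mean and standard deviation to interval data. The sample values and variable names are hypothetical and chosen only for illustration.

```python
from statistics import mean, median, mode, stdev

# Hypothetical sample data, one list per scale type.
nominal = ["bus", "car", "bus", "cycle", "bus"]   # labels only
ordinal = [1, 3, 2, 5, 4, 2, 3]                   # ranks (1 = lowest)
interval = [20.0, 22.0, 24.0, 26.0, 28.0]         # e.g. temperature in degrees Celsius

# Nominal: only counting is meaningful, so the mode is the usual summary.
nominal_mode = mode(nominal)

# Ordinal: order is meaningful, so positional statistics such as the median apply.
ordinal_median = median(ordinal)

# Interval: distances are meaningful, so the mean and standard deviation apply.
interval_mean = mean(interval)
interval_sd = stdev(interval)

print(nominal_mode, ordinal_median, interval_mean, round(interval_sd, 2))
```

Taking the mean of the nominal labels would be meaningless, which is the sense in which the nominal scale is the least precise.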
ANALYSIS OF DATA
Editing
converted.
Sometimes when the volumes of different attributes may be greatly different for making

Year    Sales (Rs.)    Gross Profit (Rs.)    Net Profit (Rs.)
1974    71.43          21.43                 7.14
1975    68.57          22.86                 8.57
1976    65             22.5                  12.5

11.1
r = 24.96
r = 25 (approx)

State             Cultivable area (in hectares)
Andhra Pradesh    663
Karnataka         448
Total             1957

State             Cultivable area
Andhra Pradesh    121.96
Tamil Nadu        102.28
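The figure 121.96 for Andhra Pradesh equals 663 / 1957 × 360. This suggests, as an assumption, that the second cultivable-area table gives pie-chart sector angles in degrees. A minimal sketch of that conversion, using only values that appear in the table (the function name `pie_angle` is made up for this example):

```python
def pie_angle(value, total):
    """Return the pie-chart sector angle in degrees for one category."""
    return value / total * 360

# Values taken from the cultivable-area table.
total_area = 1957
areas = {"Andhra Pradesh": 663, "Karnataka": 448}

# Convert each area to its share of the full circle (360 degrees).
angles = {state: round(pie_angle(area, total_area), 2) for state, area in areas.items()}
print(angles)
```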
Graphical Representation
Graphic representation is a visual way of analyzing numerical data
Graphs serve:
a) As a method of presentation
b) As a tool of analysis.
Graphs are divided into two types:
a) Graphs of frequency distribution.
b) Graphs of time series (line graphs).
Fig. Quadrant
Histogram
• A histogram is a non-cumulative frequency graph drawn on a natural scale,
in which the frequencies of the different classes of values are
represented by vertical rectangles drawn close to each other.
• The mode, a measure of central tendency, can be easily determined with the help of
this graph.
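A minimal sketch of the idea, assuming hypothetical scores and a class width of 10: it counts the frequency of each class, prints a text histogram (bars drawn sideways, one `#` per observation), and reads off the modal class as the class with the tallest bar.

```python
from collections import Counter

# Hypothetical test scores and class width, for illustration only.
scores = [12, 15, 21, 22, 25, 27, 28, 31, 33, 34, 35, 36, 38, 41, 47]
class_width = 10

# Count how many scores fall in each class interval (10-19, 20-29, ...),
# keyed by the lower limit of the class.
frequencies = Counter((s // class_width) * class_width for s in scores)

# Draw each class as a bar of '#' characters, one per observation.
for lower in sorted(frequencies):
    print(f"{lower}-{lower + class_width - 1}: {'#' * frequencies[lower]}")

# The modal class is the class with the tallest bar.
modal_lower = max(frequencies, key=frequencies.get)
modal_class = (modal_lower, modal_lower + class_width - 1)
print("Modal class:", modal_class)
```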
Advantages of histogram:
1. It is easy to draw and simple to understand.
2. It helps us to understand the distribution easily and quickly.
3. It is more precise than the polygon.
Limitations of histogram:
1. It is not possible to plot more than one distribution on the same axes as a histogram.
2. Comparison of more than one frequency distribution on the same axes is not
possible.
3. It is not possible to make it smooth.
Uses of histogram:
1. It represents the data in graphic form.
2. It provides knowledge of how the scores in the group are distributed:
whether the scores are piled up at the lower or higher end of the
distribution or are evenly and regularly distributed throughout the scale.
Frequency Polygon
The frequency polygon is a frequency graph which is
drawn by joining the points plotted from the mid-values of the class
intervals and their corresponding frequencies.
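The definition of the frequency polygon can be sketched as follows. The class intervals and frequencies are hypothetical. Each point is the mid-value of a class paired with its frequency, and joining the points in order with straight lines would give the polygon.

```python
# Hypothetical class intervals (lower, upper) and their frequencies.
classes = [(10, 19), (20, 29), (30, 39), (40, 49)]
frequencies = [2, 5, 6, 2]

# Each point of the polygon is (mid-value of the class, frequency).
points = [((lower + upper) / 2, f) for (lower, upper), f in zip(classes, frequencies)]
print(points)
```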
Table. Data for frequency polygon graph
Fig. Frequency polygon graph
Advantages of frequency polygon:
1. It is easy to draw and simple to understand.
2. It is possible to plot two distributions at a time on the same axes.
3. Comparison of two distributions can be made through the frequency
polygon.
4. It is possible to make it smooth.
Limitations of frequency polygon:
1. It is less precise.
2. It does not accurately represent, in terms of area, the frequency of each
interval.
Uses of frequency polygon:
1. When two or more distributions are to be compared, the
frequency polygon is used.
2. It represents the data in graphic form.
3. It provides knowledge of how the scores in one or more groups are
distributed: whether the scores are piled up at the lower or higher end
of the distribution or are evenly and regularly distributed throughout
the scale.
Frequency Curve or Smoothed Frequency Curve
When the sample is very small and the frequency
distribution is irregular, the polygon is very zig-zag.
The polygon is smoothed in order to wipe out the irregularities.
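The slides do not say how the smoothing is done. One common textbook approach, assumed here, replaces each frequency with the average of itself and its two neighbours, treating the classes beyond either end as having zero frequency. A minimal sketch with hypothetical frequencies (the function name `smooth` is made up for this example):

```python
def smooth(frequencies):
    """Three-point running average; classes outside the range count as 0."""
    padded = [0] + list(frequencies) + [0]
    return [
        (padded[i - 1] + padded[i] + padded[i + 1]) / 3
        for i in range(1, len(padded) - 1)
    ]

# Hypothetical irregular (zig-zag) frequency distribution.
raw = [3, 9, 3, 12, 6]
smoothed = smooth(raw)
print(smoothed)
```

The smoothed values rise and fall more gradually than the raw ones, which is what removes the zig-zag from the plotted curve.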
Uses of Ogive:
1. The ogive is useful to determine the number of students
below and above a particular score.
2. It is used when the median as a measure of central tendency is
wanted.
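Both uses can be sketched with cumulative frequencies, which are what an ogive plots. The class limits and frequencies below are hypothetical. The sketch counts the students at or below a chosen score and locates the class that contains the median.

```python
from itertools import accumulate

# Hypothetical class upper limits and frequencies (number of students).
upper_limits = [19, 29, 39, 49]
frequencies = [2, 5, 6, 2]

# The ogive plots cumulative frequency against the upper limit of each class.
cumulative = list(accumulate(frequencies))
ogive_points = list(zip(upper_limits, cumulative))

total = cumulative[-1]

# Number of students at or below a score of 29, and above it.
below_29 = dict(ogive_points)[29]
above_29 = total - below_29

# The median class is the first class whose cumulative frequency reaches total / 2.
median_upper = next(u for u, c in ogive_points if c >= total / 2)
print(ogive_points, below_29, above_29, median_upper)
```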