Statistical Techniques Notes (Monitoring & Evaluation - BMEC - Level 4)

The document provides an introduction to statistics, covering definitions, types, and the importance of statistical analysis in various fields. It discusses key concepts such as population, sample, and measures of central tendency, including mean, median, and mode, as well as graphical representations of data. Additionally, it highlights the advantages and disadvantages of statistical methods and the significance of understanding data variability.

DOMASI DEVELOPMENT COLLEGE

MONITORING AND EVALUATION (BMEC) - LEVEL 4

STATISTICAL TECHNIQUES

• Prepared by Burnet N. Mulungu
• 0885543125 - burnetmulungu@gmail.com
LECTURE I - INTRODUCTION TO STATISTICS
WHAT IS STATISTICS?

• Statistics is defined as the collection, compilation, analysis and interpretation of numerical data.
• Statistics is the science of data.
WHY STATISTICS?

• To develop an appreciation for variability and how it affects products, processes and systems.
• Statistics is about estimating the present and predicting the future.
• To study methods that can be used to solve problems and build knowledge.
• Statistics turns data into information.
• To develop an understanding of some basic ideas of statistical reliability and stochastic processes (probability concepts).
• Statistics is very important in every aspect of society (government, people and business).
BASIC TERMS

• Measurement: the assignment of numbers to something
• Data: a collection of measurements
• Sample: the collected data
• Population: all possible data
• Variable: a property with respect to which data from a sample differ in some measurable way
TYPES OF STATISTICS

• Descriptive statistics: used to organize and describe a sample or population.
• Inferential statistics: used to extrapolate (estimate) from a sample to a larger population.
Science of chance (uncertainties - what is possible, what is probable; mathematical formulas) versus science of data (collecting, processing, presentation, analysing and interpretation of data; numbers with context).

STATISTI Information
Data CAL
TOOLS
9

POPULATION & SAMPLE

POPULATION

The set of data (numerical or otherwise) corresponding to the entire collection of units about which information is sought.
SAMPLES

Sample definition: a subset of a population.

Representative sample: has the characteristics of the population.

Census: a sample that contains all items in the population.

WHY SAMPLING?

In most studies, it is difficult to obtain information from the entire population for various reasons. We rely on samples to make estimates or inferences about the population.
STAGES INVOLVED IN STATISTICAL PROCESS

1. Organizing the data: Considered one of the benchmark steps in statistics, this primarily deals with organising the information and arranging the data patterns systematically. The main aim is to make statistical analysis more efficient. The data can be organised using tools, for example, Nero.
2. Planning a study module: The information gathered or collected from multiple sources requires planning, which entails the activities of inquiry, interviews and surveys. Furthermore, it controls variables such as whom to speak with and how to analyze findings, and can even benefit research planning.
STAGES INVOLVED IN STATISTICAL PROCESS….

3. Presenting the data: The set of arrangements or combinations used to present the data outlook and data characteristics. Presentation in the form of diagrams, charts and various illustrations can make findings more compelling and persuasive and show relationships between the data.
4. Interpreting the data: One of the most important stages in statistics. Interpretation deals with data rendering and data evaluation, whose outcome is supported by mathematical reasoning and pre-planned standard procedures.
ADVANTAGES

• Because it is secondary data, it is usually cheap and less time-consuming, because someone else has compiled it.
• Patterns and correlations are clear and visible.
• It is taken from large samples, so generalisability is high.
• It can be used and re-used to check different variables.
• Studies can be repeated to check changes, which increases reliability and representativeness.
DISADVANTAGES

• The researcher cannot check validity and cannot find a mechanism for a causation theory; they can only draw patterns and correlations from the data.
• Statistical data is often secondary data, which means it can easily be misinterpreted.
• Statistical data is open to abuse: it can be manipulated and phrased to show the point the researcher wants to show (which affects objectivity).
• Because it is often secondary data, it is hard to access and check.
STATISTICAL INFERENCE

• Drawing conclusions (inferences) about a population based on an examination of a sample taken from that population.
CONCLUSION

• Statistics is the practice of analyzing pieces of information that might seem conflicting or unrelated at first glance and on the surface.
• It can lead to a solid career as a statistician, but it can also be a handy tool in everyday life: perhaps when you're analyzing the odds that your favorite team will win the Super Bowl before you place a bet, gauging the viability of an investment, or determining whether you're being comparatively overcharged for a product or service.
LECTURE II - GRAPHICAL PRESENTATION OF DATA
BAR GRAPH

 A bar graph is a graphical display of data using bars of different heights.


 The bars can be plotted horizontally or vertically.
 A vertical chart is sometimes called a column bar chart.
Example: number of schools by management type (shown as a bar graph)

Management       No. of Schools
Government       4
Local Body       8
Private Aided    10
Private Unaided  2
BAR GRAPH

Example: marks of four students in three subjects (shown as a grouped bar graph)

Student  Science  Maths  History
Anita    25       22     20
Raju     19       19     22
Sunil    28       26     21
Rani     20       23     21
PIE CHART
• Pie diagrams are popularly used to denote percentage breakdowns.
• A pie chart is a circular statistical graphic which is divided into slices to illustrate numerical proportion.
• In a pie chart, the arc length of each slice (and consequently its central angle and area) is proportional to the quantity it represents.
Example: achievement levels (shown as a pie diagram)

Components        Percentage
High achievers    60%
Middle achievers  25%
Low achievers     15%
HISTOGRAM
• A histogram is a set of rectangles whose areas are in proportion to the class frequencies.
• It is a graph in which the frequencies are represented by bars.
• The bars are adjacent to each other, not separated.
• It was first introduced by Karl Pearson.

C/I (Scores)  96-100  91-95  86-90  81-85  76-80  71-75  66-70
Frequency     2       2      4      7      5      3      1
SHAPE

[Figure: left-skewed and right-skewed histogram shapes]
USES FOR A HISTOGRAM

A histogram can be used:
• to display large amounts of data values in a relatively simple chart form.
• to tell the relative frequency of occurrence.
• to easily see the distribution of the data.
• to see if there is variation in the data.
• to make future predictions based on the data.
FREQUENCY POLYGON
• A polygon is a many-angled or many-sided closed figure.
• A frequency polygon is a graphical representation of a frequency distribution in which the midpoints of the class intervals are plotted against the frequencies.
• To close the figure, two extra intervals are taken, one above and one below the given intervals.
• It is useful for comparing two or more distributions by plotting two or more graphs on the same axes.

Example: achievement scores for two groups (shown as frequency polygons)

Ach. Scores  Gp. A f  Gp. B f
135-139      0        0
140-144      1        3
145-149      2        6
155-159      5        7
160-164      5        3
165-169      8        9
170-174      6        4
175-179      4        2
180-184      3        1
185-189      0        0
CUMULATIVE FREQUENCY GRAPH
• In order to draw a cumulative frequency graph, we have to obtain the cumulative frequencies directly from the frequencies.
• Cumulative frequencies are obtained by adding the individual frequencies successively, starting from the bottom.
• These frequencies tell us the total number of cases lying below a given score or class interval.
• The last value of the cumulative frequency will always be equal to the total number of frequencies, i.e. the size of the sample.
Cumulative Frequency Graph

C.I    f   Cum-f
30-32  0   0
33-35  2   2
36-38  4   6
39-41  4   10
42-44  6   16
45-47  10  26
48-50  8   34
51-53  4   38
54-56  2   40
OGIVE

• In order to draw a cumulative percentage frequency graph, or ogive, we have to obtain cumulative percentage frequencies by multiplying each cumulative frequency by 100/N, where N is the total number of frequencies.
• These frequencies tell us the percentage of cases lying below a given score or class interval.

Cum%f Curve or Ogive

C.I    f   Cum-f  Cum%f
30-32  0   0      0
33-35  2   2      5
36-38  4   6      15
39-41  4   10     25
42-44  6   16     40
45-47  10  26     65
48-50  8   34     85
51-53  4   38     95
54-56  2   40     100
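The cumulative and cumulative-percentage frequencies described above can be computed directly from the frequency column; a short Python sketch using the table's values:

```python
from itertools import accumulate

# Class intervals and frequencies from the table above
intervals = ["30-32", "33-35", "36-38", "39-41", "42-44",
             "45-47", "48-50", "51-53", "54-56"]
freqs = [0, 2, 4, 4, 6, 10, 8, 4, 2]

# Cumulative frequencies: running totals from the bottom interval up
cum_f = list(accumulate(freqs))           # [0, 2, 6, 10, 16, 26, 34, 38, 40]

# Cumulative percentage frequencies: Cum-f x 100 / N
n = sum(freqs)                            # N = 40
cum_pct = [cf * 100 / n for cf in cum_f]  # 0, 5, 15, ..., 100

for ci, f, cf, cp in zip(intervals, freqs, cum_f, cum_pct):
    print(f"{ci:>6} {f:>3} {cf:>5} {cp:>6.0f}%")
```

The last cumulative frequency equals the sample size (40), and the last cumulative percentage is 100, as the notes state.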
PICTOGRAPH
• A pictograph is a way of showing data using images.
• Each image stands for a certain number of things; e.g., in the given illustration, one picture of a pencil corresponds to 3 pencils.

Name Number of pencils

John 18

Mike 12

Henry 21

Frank 9

George 15
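The number of pencil pictures to draw for each person follows from integer division by the picture value; a minimal Python sketch using the table above:

```python
# Each pencil picture in the illustration stands for 3 pencils
PENCILS_PER_PICTURE = 3
pencil_counts = {"John": 18, "Mike": 12, "Henry": 21, "Frank": 9, "George": 15}

# Number of pictures to draw for each person (integer division)
pictures = {name: n // PENCILS_PER_PICTURE for name, n in pencil_counts.items()}
print(pictures)  # {'John': 6, 'Mike': 4, 'Henry': 7, 'Frank': 3, 'George': 5}
```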
MEASURES OF CENTRAL TENDENCY
INTRODUCTION TO MEASURES OF CENTRAL TENDENCY

• Definition: Measures of central tendency are statistical metrics used to determine the center or typical value of a dataset.
• Purpose: They provide a summary measure that represents the center point of the data distribution.
MEAN

• Definition: The mean is the arithmetic average of a dataset: the sum of all the values divided by the number of values.
• Characteristics:
• Sensitive to extreme values (outliers).
• The most commonly used measure of central tendency.
MEDIAN

 Definition: The median is a dataset’s middle value when ordered from least to
greatest.
 Formula:
 Odd Number of Observations: Median is the middle value.
 Even Number of Observations: The median is the average of the two middle
values.
 Characteristics:
 Less sensitive to extreme values compared to the mean.
 Provides a better central location in skewed distributions.
MODE

 Definition: The mode is the value that appears most frequently in a


dataset.
 Characteristics:
 Can be used for both numerical and categorical data.
 A dataset may have one mode, more than one mode, or no mode at all.
COMPARING MEASURES OF CENTRAL TENDENCY

When to use the mean: Suitable for datasets without outliers. Provides a balanced measure if the data is symmetrically distributed.

When to use the median: Preferred for skewed distributions or when outliers are present. Represents the middle value of the dataset.

When to use the mode: Useful for categorical data or to identify the most common value. Provides insight into the most frequent observation.
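All three measures are available in Python's built-in `statistics` module; the scores list here is a hypothetical sample for illustration:

```python
import statistics

scores = [4, 7, 7, 2, 9, 7, 4, 5]   # hypothetical sample of eight scores

mean = statistics.mean(scores)      # arithmetic average: 45 / 8
median = statistics.median(scores)  # n is even, so average of the two middle values
mode = statistics.mode(scores)      # most frequent value

print(mean, median, mode)  # 5.625 6.0 7
```

Note how the mean (5.625) and median (6.0) differ: the sorted sample is 2 4 4 5 7 7 7 9, so the median averages 5 and 7, while the mean is pulled by every value.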
APPLICATIONS OF CENTRAL TENDENCY

 In Education: Calculating average grades, scores, and


performance metrics.
 In Business: Analyzing average sales, customer satisfaction
ratings, and market trends.
 In Healthcare: Assessing average patient data, treatment
outcomes, and health indicators.
LIMITATIONS OF MEASURES OF CENTRAL TENDENCY

• Mean: can be distorted by outliers or extreme values.
• Median: may not be representative of the dataset if there are multiple peaks.
• Mode: not always useful if the data is uniformly distributed or has no repeated values.
CONCLUSION

• Recap the key points about each measure of central tendency and their applications.
• Final thoughts: Emphasize the importance of choosing the appropriate measure based on the data distribution and analysis goals.
MEASURES OF SPREAD
THE RANGE

The range of a data set is a measure of spread; that is, it measures how spread out the data are.
The range of a data set is the difference between the largest and the smallest value:

Range = Largest Value – Smallest Value
EXAMPLE

The following table presents the average monthly temperature, in degrees Fahrenheit, for the cities of San Francisco and St. Louis. Compute the range for each city.

City           Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
San Francisco  51   54   55   56   58   60   60   61   63   62   58   52
St. Louis      30   35   44   57   66   75   79   78   70   59   45   35
Source: National Weather Service
SOLUTION

The largest value for San Francisco is 63 and the smallest is 51.
The range for San Francisco is 63 – 51 = 12.

The largest value for St. Louis is 79 and the smallest is 30.
The range for St. Louis is 79 – 30 = 49.
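The computation above is one line per city; a minimal Python sketch using the table's values:

```python
sf = [51, 54, 55, 56, 58, 60, 60, 61, 63, 62, 58, 52]   # San Francisco
stl = [30, 35, 44, 57, 66, 75, 79, 78, 70, 59, 45, 35]  # St. Louis

def data_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)

print(data_range(sf))   # 12
print(data_range(stl))  # 49
```

The much larger range for St. Louis (49 vs 12) reflects its more variable climate, which is exactly the kind of spread the range is meant to capture.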
THE RANGE IS NOT USED IN PRACTICE

Although the range is easy to compute, it is not often used in practice. The
reason is that the range involves only two values from the data set; the
largest and smallest.
The measures of spread that are most often used are the variance and the
standard deviation, which use every value in the data set.
RANGE AND IQR
 Range = maximum – minimum
 Easy, but NOT as good as the…
 Quartiles & Inter-Quartile Range (IQR)
 Quartile 1 (Q1) cuts off bottom 25% of data (“25th percentile”)
 Quartile 2 (Q2) cuts off two-quarters of data
 same as the Median!
 Quartile 3 (Q3) cuts off three-quarters of the data (“75th percentile”)
OBTAINING QUARTILES
 Order data
 Find the median
 Look at the lower half of data set
 Find “median” of this lower half
 This is Q1
 Look at the upper half of the data set.
 Find “median” of this upper half
 This is Q3
EXAMPLE: QUARTILES
Consider these 10 ages:
05 11 21 24 27 28 30 42 50 52

The median lies between 27 and 28: (27 + 28) / 2 = 27.5.

The median of the bottom half (05 11 21 24 27) is Q1 = 21.
The median of the top half (28 30 42 50 52) is Q3 = 42.
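The median-split procedure described above can be sketched in Python; `median` and `quartiles` are hypothetical helpers following the steps in these notes, not standard library functions (other software may use slightly different quartile conventions):

```python
def median(values):
    """Middle value of the sorted data, or the average of the two middle values."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    return s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2

def quartiles(values):
    """Q1 and Q3 as medians of the lower and upper halves
    (the overall median is excluded from both halves when n is odd)."""
    s = sorted(values)
    n = len(s)
    q1 = median(s[: n // 2])
    q3 = median(s[(n + 1) // 2 :])
    return q1, median(s), q3

ages = [5, 11, 21, 24, 27, 28, 30, 42, 50, 52]
q1, q2, q3 = quartiles(ages)
print(q1, q2, q3)  # 21 27.5 42
print(q3 - q1)     # inter-quartile range: 21
```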

EXAMPLE 2: QUARTILES, N = 53
100 124 148 170 185 215
101 125 150 170 185 220
106 127 150 172 186 260
106 128 152 175 187
110 130 155 175 192
110 130 157 180 194
119 133 165 180 195
120 135 165 180 203
120 139 165 180 210
123 140 170 185 212

L(M) = (53 + 1) / 2 = 27, so the median is the 27th ordered value: Median = 165

EXAMPLE 2: QUARTILES, N = 53
Bottom half has n* = 26, so L(Q1) = (26 + 1) / 2 = 13.5 from the bottom
Q1 = avg(127, 128) = 127.5
EXAMPLE 2: QUARTILES, N = 53
Top half has n* = 26, so L(Q3) = 13.5 from the top!
Q3 = avg(185, 185) = 185
EXAMPLE 2: QUARTILES (STEM-AND-LEAF)

Stem | Leaf
10   | 0166
11   | 009
12   | 0034578
13   | 00359
14   | 08
15   | 00257
16   | 555
17   | 000255
18   | 000055567
19   | 245
20   | 3
21   | 025
22   | 0
23   |
24   |
25   |
26   | 0

Q1 = 127.5, Q2 = 165, Q3 = 185

"5 point summary" = {Min, Q1, Median, Q3, Max} = {100, 127.5, 165, 185, 260}
INTER-QUARTILE RANGE (IQR)

Q1 = 127.5, Q3 = 185

IQR = Q3 – Q1 = 185 – 127.5 = 57.5

The IQR is the "spread of the middle 50%" of the data.
2. STANDARD DEVIATION

Because the variance is computed using squared deviations, the units of the variance are the squared units of the data. For example, in the Battery Lifetime example, the units of the data are hours, and the units of the variance are squared hours. In most situations, it is better to use a measure of spread that has the same units as the data.
We do this simply by taking the square root of the variance. This quantity is called the standard deviation. The standard deviation of a sample is denoted s, and the standard deviation of a population is denoted by σ.

s = √s²    σ = √σ²
EXAMPLE

Recall that in the Battery Lifetime example, the sample variance was computed as s² = 2. Find the sample standard deviation.

Battery Lifetime: 3 4 6 5 4 2

Solution: s = √s² = √2 ≈ 1.414

The sample standard deviation, s, is the square root of the sample variance.
STANDARD DEVIATION

Another common measure of spread is the standard deviation: a measure of the "average" deviation of all observations from the mean.

To calculate the standard deviation:
1. Calculate the mean.
2. Determine each observation's deviation (x - x̄).
3. "Average" the squared deviations by dividing the total squared deviation by (n - 1). This quantity is the variance.
4. Take the square root of the result to obtain the standard deviation.
STANDARD DEVIATION

Variance:
var = [(x₁ - x̄)² + (x₂ - x̄)² + ... + (xₙ - x̄)²] / (n - 1)

Standard deviation:
sₓ = √[ Σ(xᵢ - x̄)² / (n - 1) ]

Example: Metabolic Rates  1792 1666 1362 1614 1460 1867 1439
STANDARD DEVIATION

Metabolic rates: 1792 1666 1362 1614 1460 1867 1439 (mean = 1600)

x       (x - x̄)  (x - x̄)²
1792    192      36864
1666    66       4356
1362    -238     56644
1614    14       196
1460    -140     19600
1867    267      71289
1439    -161     25921
Totals: 0        214870

Total squared deviation: 214870
Variance: var = 214870 / 6 = 35811.66
Standard deviation: s = √35811.66 = 189.24 cal

What does this value, s, mean?
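The worked example above can be checked with a short Python sketch that follows the same steps (deviations, squared deviations, divide by n - 1, square root):

```python
from math import sqrt

rates = [1792, 1666, 1362, 1614, 1460, 1867, 1439]  # metabolic rates

mean = sum(rates) / len(rates)             # 11200 / 7 = 1600.0
sq_dev = [(x - mean) ** 2 for x in rates]  # squared deviations
total = sum(sq_dev)                        # 214870.0
variance = total / (len(rates) - 1)        # divide by n - 1 for a sample
sd = sqrt(variance)

print(round(variance, 2), round(sd, 2))  # 35811.67 189.24
```

Note also that the raw (unsquared) deviations sum to 0, which is why they must be squared before averaging.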
STANDARD DEVIATION & RESISTANCE

Recall that a statistic is resistant if its value is not affected much by


extreme data values.

The standard deviation is not resistant.

That is, the standard deviation is affected by extreme data values.


APPROXIMATING THE STANDARD DEVIATION

Sometimes we don’t have access to the raw data in a data


set, but we are given a frequency distribution. In these
cases we can approximate the standard deviation.
APPROXIMATING THE STANDARD DEVIATION
Following is the procedure for approximating the standard deviation:
Step 1: Compute the midpoint of each class and approximate the mean of the frequency distribution.
Step 2: For each class, subtract the mean from the class midpoint to obtain (Midpoint – Mean).
Step 3: For each class, square the differences obtained in Step 2 to obtain (Midpoint – Mean)², and multiply by the frequency to obtain (Midpoint – Mean)² x (Frequency).
Step 4: Add the products (Midpoint – Mean)² x (Frequency) over all classes.
Step 5: To compute the population variance, divide the sum obtained in Step 4 by n. To compute the sample variance, divide the sum obtained in Step 4 by n – 1.
Step 6: Take the square root of the variance obtained in Step 5. The result is the standard deviation.
EXAMPLE
The following table presents the number of text messages sent via cell phone by a sample of 50 high school students. Approximate the sample standard deviation of the number of messages sent.

Number of Text Messages Sent  Frequency
0 – 49                        10
50 – 99                       5
100 – 149                     13
150 – 199                     11
200 – 249                     7
250 – 299                     4
SOLUTION
Step 1: Compute the midpoint of each class. Recall from the last section that the sample mean was computed as 137.

Number of Text Messages Sent  Class Midpoint
0 – 49                        25
50 – 99                       75
100 – 149                     125
150 – 199                     175
200 – 249                     225
250 – 299                     275
SOLUTION
Step 2: For each class, subtract the mean from the class midpoint to obtain (Midpoint – Mean).

Number of Text Messages Sent  Class Midpoint  (Midpoint – Mean)
0 – 49                        25              –112
50 – 99                       75              –62
100 – 149                     125             –12
150 – 199                     175             38
200 – 249                     225             88
250 – 299                     275             138
SOLUTION
Step 3: For each class, square the differences obtained in Step 2 to obtain (Midpoint – Mean)², and multiply by the frequency to obtain (Midpoint – Mean)² x (Frequency).

Number of Text Messages Sent  Frequency  (Midpoint – Mean)  (Midpoint – Mean)² x (Frequency)
0 – 49                        10         –112               125,440
50 – 99                       5          –62                19,220
100 – 149                     13         –12                1,872
150 – 199                     11         38                 15,884
200 – 249                     7          88                 54,208
250 – 299                     4          138                76,176
SOLUTION

Step 4: Add the products (Midpoint – Mean)² x (Frequency) over all classes.

Σ [(Midpoint – Mean)² x (Frequency)]
= 125,440 + 19,220 + 1,872 + 15,884 + 54,208 + 76,176
= 292,800
SOLUTION

Step 5: Since we are computing the sample variance, we divide the sum obtained in Step 4 by n – 1.

s² = Σ [(Midpoint – Mean)² x (Frequency)] / (n – 1) = 292,800 / (50 – 1) = 5975.51020

Step 6: Take the square root of the variance to obtain the standard deviation.

s = √s² = √5975.51020 = 77.30142
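Steps 1–6 can be carried out in a few lines of Python using the midpoints and frequencies from the tables above:

```python
from math import sqrt

midpoints = [25, 75, 125, 175, 225, 275]  # class midpoints (Step 1)
freqs     = [10, 5, 13, 11, 7, 4]         # class frequencies
n = sum(freqs)                            # 50 students

# Step 1: approximate the mean from the grouped data
mean = sum(m * f for m, f in zip(midpoints, freqs)) / n   # 6850 / 50 = 137.0

# Steps 2-4: sum of (Midpoint - Mean)^2 x Frequency over all classes
total = sum((m - mean) ** 2 * f for m, f in zip(midpoints, freqs))  # 292800.0

# Steps 5-6: sample variance (divide by n - 1), then take the square root
s2 = total / (n - 1)
s = sqrt(s2)
print(round(s2, 2), round(s, 5))  # 5975.51 77.30142
```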
DATA COLLECTION
INTRODUCTION

• The underlying need for data collection is to capture quality evidence that seeks to answer all the questions that have been posed. Through data collection, businesses or management can deduce quality information that is a prerequisite for making informed decisions.
• To improve the quality of information, it is expedient that data is collected so that you can draw inferences and make informed decisions on what is considered factual.
WHAT IS DATA COLLECTION?

• Data collection is a methodical process of gathering and analyzing specific information to proffer solutions to relevant questions and evaluate the results. It focuses on finding out all there is to a particular subject matter. Data is collected to be further subjected to hypothesis testing, which seeks to explain a phenomenon.
• Hypothesis testing eliminates assumptions while making a proposition from the basis of reason.
IMPORTANCE OF DATA COLLECTION
There are a bunch of underlying reasons for collecting data, especially for a researcher. Walking you through them, here are a few reasons:
• Integrity of the research
A key reason for collecting data, be it through quantitative or qualitative methods, is to ensure that the integrity of the research question is indeed maintained.
• Reduce the likelihood of errors
The correct use of appropriate data collection methods reduces the likelihood of errors in the results.
IMPORTANCE OF DATA COLLECTION

 Decision Making
To minimize the risk of errors in decision-making, it is important that accurate data is collected so
that the researcher doesn't make uninformed decisions.
 Save Cost and Time
Data collection saves the researcher time and funds that would otherwise be misspent without a
deeper understanding of the topic or subject matter.
 To support a need for a new idea, change, and/or innovation
To prove the need for a change in the norm or the introduction of new information that will be
widely accepted, it is important to collect data as evidence to support these claims.
DATA COLLECTION METHODS
PRIMARY DATA COLLECTION METHODS

• Primary data is collected from first-hand experience and has not been used in the past. The data gathered by primary data collection methods is specific to the research's motive and highly accurate.
• Primary data collection methods can be divided into two categories: quantitative methods and qualitative methods.
PRIMARY DATA COLLECTION METHODS
 Quantitative Methods
Quantitative techniques for market research and demand forecasting usually make use
of statistical tools. In these techniques, demand is forecast based on historical data.
These methods of primary data collection are generally used to make long-term
forecasts. Statistical methods are highly reliable as the element of subjectivity is minimal in these methods.
 Time Series Analysis
 Smoothing Techniques
 Barometric Method
QUANTITATIVE METHODS

Time Series Analysis


 The term time series refers to a sequential order of
values of a variable, known as a trend, at equal time
intervals.
 Using patterns, an organization can predict the demand
for its products and services for the projected time.
QUANTITATIVE METHODS
Smoothing Techniques
 In cases where the time series lacks significant trends,
smoothing techniques can be used. They eliminate a random
variation from the historical demand. It helps in identifying
patterns and demand levels to estimate future demand.
 The most common methods used in smoothing demand
forecasting techniques are the simple moving average method
and the weighted moving average method.
QUANTITATIVE METHODS

Barometric Method
 Also known as the leading indicators approach,
researchers use this method to speculate future trends
based on current developments.
 When the past events are considered to predict future
events, they act as leading indicators.
PRIMARY DATA COLLECTION METHODS
Qualitative Methods
• Qualitative methods are especially useful in situations where historical data is not available, or where there is no need for numbers or mathematical calculations.
• Qualitative research is closely associated with words, sounds, feelings, emotions, colors, and other elements that are non-quantifiable. These techniques are based on experience, judgment, intuition, conjecture, emotion, etc.
 Surveys
 Polls
 Interviews
 Delphi Technique
 Focus Groups
 Questionnaire
QUALITATIVE METHODS

Polls
• Polls consist of a single single-choice or multiple-choice question. When you need a quick pulse of the audience's sentiments, you can go for polls. Because they are short, it is easier to get responses from people.
 Similar to surveys, online polls, too, can be embedded into various
platforms. Once the respondents answer the question, they can also be
shown how they stand compared to others’ responses.
QUALITATIVE METHODS

Surveys
• Surveys are used to collect data from the target audience and gather insights into their preferences, opinions, choices, and feedback related to products and services. Most survey software offers a wide range of question types to select from.
 You can also use a ready-made survey template to save on time and effort. Online
surveys can be customized as per the business’s brand by changing the theme, logo,
etc. They can be distributed through several distribution channels such as email,
website, offline app, QR code, social media, etc. Depending on the type and source of
your audience, you can select the channel.
QUALITATIVE METHODS
Interviews
 In this method, the interviewer asks questions either face-to-face or
through telephone to the respondents. In face-to-face interviews, the
interviewer asks a series of questions to the interviewee in person and
notes down responses. In case it is not feasible to meet the person, the
interviewer can go for a telephonic interview.
 This form of data collection is suitable when there are only a few
respondents. It is too time-consuming and tedious to repeat the same
process if there are many participants.
QUALITATIVE METHODS
Delphi Technique
 In this method, market experts are provided with the estimates
and assumptions of forecasts made by other experts in the
industry. Experts may reconsider and revise their estimates and
assumptions based on the information provided by other
experts.
 The consensus of all experts on demand forecasts constitutes
the final demand forecast.
QUALITATIVE METHODS

Focus Groups
 In a focus group, a small group of people, around 8-10 members,
discuss the common areas of the problem. Each individual
provides his insights on the issue concerned.
 A moderator regulates the discussion among the group members.
At the end of the discussion, the group reaches a consensus.
QUALITATIVE METHODS

Questionnaire
• A questionnaire is a printed set of questions, either open-ended or closed-ended. The respondents are required to answer based on their knowledge and experience with the issue concerned.
• A questionnaire can be part of a survey, whereas a questionnaire's end goal may or may not be a survey.
SECONDARY DATA COLLECTION METHODS

Internal sources of secondary data:


 Organization’s health and safety records
 Mission and vision statements
 Financial Statements
 Magazines
 Sales Report
 CRM Software
 Executive summaries
SECONDARY DATA COLLECTION METHODS

External sources of secondary data:


 Government reports
 Press releases
 Business journals
 Libraries
 Internet
PROBLEMS ASSOCIATED WITH DATA COLLECTION
CONCLUSION

• Data collection is no longer a once-in-a-blue-moon affair. Collecting data has become a necessity for all organisations that want to be able to make better-informed decisions.
 Collecting data lets you know what your customers think about your
brand, points out the areas that can improve, helps generate leads, and
lets you update your products and services as per the latest customer
behaviour and trends.
SAMPLING
SAMPLING
 Sampling is a technique of selecting
individual members or a subset of the
population to make statistical inferences
from them and estimate characteristics of
the whole population. Different sampling
methods are widely used by researchers in
market research so that they do not need
to research the entire population to collect
actionable insights.
EXAMPLE

 If a drug manufacturer would like to research the


adverse side effects of a drug on the country’s
population, it is almost impossible to conduct a research
study that involves everyone.
 In this case, the researcher decides a sample of people
from each demographic and then researches them, giving
him/her indicative feedback on the drug’s behavior.
IMPORTANT STATISTICAL TERMS
Population:
a set which includes all
measurements of interest
to the researcher
(The collection of all
responses, measurements, or
counts that are of interest)
Sample:
A subset of the population
WHY IS SAMPLING NEEDED?

• To get information about large populations
• Lower cost
• Less field time
• More accuracy, i.e. you can do a better job of data collection
• When it is impossible to study the whole population

POPULATION IN RESEARCH

• It does not necessarily mean a number of people; it is a collective term used to describe the total quantity of things (or cases) of the type which are the subject of your study.
• So a population can consist of certain types of objects, organizations, people or even events.
SAMPLING FRAME

• Within this population, there will probably be only certain groups that will be of interest to your study; this selected category is your sampling frame.
POPULATIONS CAN HAVE THE FOLLOWING CHARACTERISTICS:

Characteristic           Explanation                                    Examples
homogeneous              all cases are similar                          bottles of beer on a production line
stratified               contains strata or layers                      people with different levels of income: low, medium, high
proportional stratified  contains strata of known proportions           percentages of different nationalities of students in a university
grouped by type          contains distinctive groups                    types of apartment buildings – towers, slabs, villas, tenement blocks
grouped by location      different groups according to where they are   animals in different habitats – desert, equatorial forest, savannah, tundra
SAMPLING METHODS OR TYPES
SAMPLING TECHNIQUES
Probability sampling techniques give the most reliable
representation of the whole population.
Non-probability techniques, relying on the judgment
of the researcher or on accident, cannot generally be
used to make generalizations about the whole population.
PROBABILITY SAMPLING

• It is a sampling technique in which samples from a larger population are chosen using a method based on the theory of probability.
• For a participant to be considered a probability sample, he/she must be selected using random selection.
• The most important requirement of probability sampling is that everyone in your population has a known and equal chance of getting selected.
• Probability sampling uses statistical theory to randomly select a small group of people (sample) from an existing large population and then predict that all their responses together will match the overall population.
TYPES OF PROBABILITY SAMPLING

Four main techniques are used for a probability sample:
• Simple random
• Stratified random
• Cluster
• Systematic
SIMPLE RANDOM SAMPLING

• As the name suggests, this is a completely random method of selecting the sample. This sampling method is as easy as assigning numbers to the individuals in the population and then randomly choosing from those numbers through an automated process.
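The assign-numbers-then-draw procedure can be sketched with Python's standard `random` module (the population of 100 numbered individuals here is purely illustrative):

```python
import random

# Hypothetical population: 100 individuals, identified by assigned numbers 1-100
population = list(range(1, 101))

# Draw a simple random sample of 10 without replacement;
# every individual has an equal chance of being chosen
sample = random.sample(population, 10)
print(sorted(sample))
```

Because `random.sample` draws without replacement, no individual can appear in the sample twice.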
STRATIFIED RANDOM SAMPLING

• Involves a method where a larger population is divided into smaller groups that usually don't overlap but together represent the entire population.
• While sampling, these groups can be organized and a sample then drawn from each group separately. A common method is to arrange or classify by sex, age, ethnicity or similar attributes.
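As a minimal sketch, a stratified sample drawn separately from each group might look like this (the age-band strata and the 10% sampling fraction are assumptions for illustration):

```python
import random

# Hypothetical population divided into non-overlapping strata by age band
strata = {
    "young":  list(range(0, 40)),    # 40 members
    "middle": list(range(40, 70)),   # 30 members
    "older":  list(range(70, 100)),  # 30 members
}

# Draw 10% from each stratum separately (proportional allocation)
sample = []
for members in strata.values():
    k = round(0.10 * len(members))
    sample.extend(random.sample(members, k))

print(len(sample))  # 4 + 3 + 3 = 10
```

Sampling each stratum separately guarantees every group is represented in proportion to its size, which a plain simple random sample does not promise.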
CLUSTER RANDOM SAMPLING

• It is a way to randomly select participants when they are geographically spread out. Cluster sampling usually analyzes a particular population in which the sample consists of more than a few elements, for example a city, family or university. The clusters are selected by dividing the greater population into various smaller sections.
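A simple two-stage sketch, assuming a toy population of 60 people spread evenly across 6 geographic clusters (all names and sizes here are hypothetical):

```python
import random

# Hypothetical population: 60 people spread across 6 geographic clusters
clusters = {c: [f"person_{c}_{i}" for i in range(10)] for c in range(6)}

# Stage 1: randomly pick 2 whole clusters
chosen = random.sample(list(clusters), 2)

# Stage 2: include every member of each chosen cluster in the sample
sample = [person for c in chosen for person in clusters[c]]
print(len(sample))  # 20
```

Note the contrast with stratified sampling: here whole clusters are selected at random and everyone inside them is surveyed, rather than drawing a few members from every group.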
SYSTEMATIC SAMPLING

• It is when you choose every "nth" individual to be part of the sample. For example, you can choose every 5th person to be in the sample. Systematic sampling is an extended implementation of the same probability technique, in which each member of the group is selected at regular intervals to form a sample. There is an equal opportunity for every member of the population to be selected using this sampling technique.
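The "every 5th person" example above can be sketched as follows; starting from a random point within the first interval is what gives every member an equal chance of selection (the population of 100 is illustrative):

```python
import random

# Hypothetical ordered population of 100 members
population = list(range(1, 101))

k = 5                        # sampling interval: take every 5th member
start = random.randrange(k)  # random starting point within the first interval
sample = population[start::k]
print(len(sample))  # 100 / 5 = 20
```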
TYPES OF NON-PROBABILITY SAMPLING

Four main techniques are used for a non-probability sample:
• Convenience
• Judgmental
• Snowball
• Quota
CONVENIENCE SAMPLING

• It is a non-probability sampling technique used to create a sample based on ease of access, readiness to be part of the sample, availability at a given time slot, or any other practical specifications of a particular element.
• Convenience sampling involves selecting haphazardly those cases that are easiest to obtain for your sample, such as a person interviewed at random in a shopping center for a television program.
JUDGMENTAL SAMPLING

• In judgmental sampling, also called purposive sampling, the sample members are chosen solely on the basis of the researcher's knowledge and judgment.
• It enables you to select cases that will best enable you to answer your research question(s) and to meet your objectives.
SNOWBALL SAMPLING

• The snowball sampling method is purely based on referrals, which is how a researcher is able to generate a sample. Therefore this method is also called the chain-referral sampling method.
• This sampling technique can go on and on, just like a snowball increasing in size (in this case the sample size), until the researcher has enough data to analyze and can draw conclusive results that help an organization make informed decisions.
QUOTA SAMPLING

• Selection of members in this sampling technique happens on the basis of a pre-set standard. Because the sample is formed on the basis of specific attributes, the created sample will have the same attributes that are found in the total population. It is an extremely quick method of collecting samples.
• Quota sampling is therefore a type of stratified sample in which the selection of cases within strata is entirely non-random.
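A minimal sketch of quota filling, assuming hypothetical quotas by sex and a stream of respondents taken in arrival order (non-randomly) until each quota is met:

```python
# Hypothetical pre-set quotas and a stream of respondents; whoever arrives
# first is accepted until their group's quota is full (no random selection)
quotas = {"female": 3, "male": 2}
respondents = [
    {"name": "A", "sex": "female"}, {"name": "B", "sex": "male"},
    {"name": "C", "sex": "female"}, {"name": "D", "sex": "female"},
    {"name": "E", "sex": "female"}, {"name": "F", "sex": "male"},
]

sample, counts = [], {g: 0 for g in quotas}
for r in respondents:
    g = r["sex"]
    if counts[g] < quotas[g]:   # accept only while the quota has room
        sample.append(r)
        counts[g] += 1

print(counts)  # {'female': 3, 'male': 2}; respondent E is turned away
```

The final sample matches the quota proportions exactly, but because cases within each group were taken as they came, the selection within strata is non-random.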
DIFFERENCE BETWEEN PROBABILITY SAMPLING AND NON-PROBABILITY SAMPLING METHODS

• Definition: Probability sampling is a technique in which samples from a larger population are chosen using a method based on the theory of probability; non-probability sampling is a technique in which the researcher selects samples based on subjective judgment rather than random selection.
• Alternatively known as: the random sampling method; the non-random sampling method.
• Population selection: the population is selected randomly; the population is selected arbitrarily.
• Nature: the research is conclusive; the research is exploratory.
• Sample: since there is a method for deciding the sample, the population demographics are conclusively represented; since the sampling method is arbitrary, the representation of the population demographics is almost always skewed.
• Time taken: takes longer to conduct, since the research design defines the selection parameters before the market research study begins; quick, since neither the sample nor its selection criteria are defined in advance.
CONCLUSION

• In conclusion, using a sample in research saves mainly money and time. If a suitable sampling strategy is used, an appropriate sample size is selected and the necessary precautions are taken to reduce sampling and measurement errors, then a sample should yield valid and reliable information.