3business Research UNIT 3

Unit 3
Scaling & measurement techniques: Concept of Measurement; Need of Measurement; Problems in
measurement in management research – Validity and Reliability. Levels of measurement – Nominal, Ordinal,
Interval, Ratio. Attitude Scaling Techniques: Concept of Scale – Rating Scales viz. Likert Scales, Semantic
Differential Scales, Constant Sum Scales, Graphic Rating Scales – Ranking Scales – Paired comparison &
Forced Ranking – Concept and Application.
Concept of Measurement
In our daily life we are said to measure when we use some yardstick to determine weight, height, or some
other feature of a physical object. We also measure when we judge how well we like a song, a painting or the
personalities of our friends. We, thus, measure physical objects as well as abstract concepts. Measurement is
a relatively complex and demanding task, especially so when it concerns qualitative or abstract phenomena.
By measurement we mean the process of assigning numbers to objects or observations, the level of
measurement being a function of the rules under which the numbers are assigned.
In measuring, we devise some form of scale in the range and then map the properties of objects from the
domain onto this scale.
Need of Measurement
Measurement is used in every part of daily life, in research work and in industry. It is how we know distance,
time, and the height and width of any geometrical shape, how a person knows which size of clothes to buy,
and how we differentiate between centimetre, inch, foot, metre, mile and kilometre. We could not study the
universe without measurement. Measurement is therefore an important part of human life.
• In education, measurement is largely needed for the analysis of data from educational assessments or tests.
• It is also needed for diagnosing students’ weak areas of learning.
• Measurement is needed for the external assessment of students and relates to the cognitive areas of
achievement.
Problems in measurement in management research
• Respondent: At times the respondent may be reluctant to express strong negative feelings, or he may
have very little knowledge but be unwilling to admit his ignorance. All this reluctance is
likely to result in an interview of ‘guesses.’ Transient factors like fatigue, boredom, anxiety, etc. may
limit the ability of the respondent to respond accurately and fully.
• Situation: Situational factors may also come in the way of correct measurement. Any condition which
places a strain on the interview can have serious effects on the interviewer-respondent rapport. For instance,
if someone else is present, he can distort responses by joining in or merely by being present. If the
respondent feels that anonymity is not assured, he may be reluctant to express certain feelings.
• Measurer: The interviewer can distort responses by rewording or reordering questions. His behaviour,
style and looks may encourage or discourage certain replies from respondents. Careless mechanical
processing may distort the findings. Errors may also creep in because of incorrect coding, faulty
tabulation and/or statistical calculations, particularly in the data-analysis stage.
• Instrument: Error may arise because of a defective measuring instrument. The use of complex words
beyond the comprehension of the respondent, ambiguous meanings, poor printing, inadequate space for
replies, response choice omissions, etc. are a few things that make the measuring instrument defective
and may result in measurement errors. Another type of instrument deficiency is the poor sampling of the
universe of items of concern.
The researcher must know that correct measurement depends on successfully meeting all of the problems listed
above. He must, to the extent possible, try to eliminate, neutralize or otherwise deal with all the possible
sources of error so that the final results are not contaminated.
Validity and Reliability – Criterion for Good Measurement
Reliability is about the consistency of a measure, and validity is about the accuracy of a measure.
What is reliability?
Reliability refers to how consistently a method measures something. If the same result can be consistently
achieved by using the same methods under the same circumstances, the measurement is considered reliable.
Ex- You measure the temperature of a liquid sample several times under identical conditions. The
thermometer displays the same temperature every time, so the results are reliable.
Reliability is a necessary contributor to validity but is not a sufficient condition for validity, e.g., if a
weighing scale consistently measures correct weight, then it is both reliable and valid. However, if it
consistently overweighs by three kgs, then the instrument is reliable (as it is giving the same result again and
again) but not valid since it is overweighing by three kgs. So, if a measurement is not valid, it hardly matters
if it is reliable – because it does not measure what the designer needs to measure in order to solve the
research problem.
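The weighing-scale example can be put in numbers. The sketch below is only an illustration: the readings and the true weight are invented, and "bias" and "spread" are names chosen here to stand for lack of validity and lack of reliability respectively.

```python
from statistics import mean, pstdev

def describe_scale(readings, true_weight):
    """Return (bias, spread) for repeated readings of one object.

    bias   : average reading minus the true weight (a validity problem if large)
    spread : standard deviation of the readings (a reliability problem if large)
    """
    bias = mean(readings) - true_weight
    spread = pstdev(readings)
    return bias, spread

# Hypothetical data: a scale that consistently overweighs by about three kgs.
true_weight = 60.0
readings = [63.0, 63.1, 62.9, 63.0]

bias, spread = describe_scale(readings, true_weight)
print(f"bias = {bias:.2f} kg, spread = {spread:.2f} kg")
```

A small spread with a large bias is the "reliable but not valid" case described above.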
Two dimensions underlie the concept of reliability:

1. Repeatability
2. Internal Consistency
Types of reliability
1. Test-retest: The consistency of a measure across time: do you get the same results when you repeat
the measurement? Ex- A group of participants complete a questionnaire designed to measure
personality traits. If they repeat the questionnaire days, weeks, or months apart and give the same
answers, this indicates high test-retest reliability.
2. Internal consistency: The consistency of the measurement itself: do you get the same results from
different parts of a test that are designed to measure the same thing? Ex- You design a questionnaire
to measure self-esteem. If you randomly split the results into two halves, there should be a strong
correlation between the two sets of results. If the two results are very different, this indicates low
internal consistency.
3. Interrater: The consistency of a measure across raters or observers: do you get the same results
when different people conduct the same measurement? Ex- Based on an assessment criteria checklist,
five examiners submit substantially different results for the same student project. This indicates that
the assessment checklist has low inter-rater reliability (for example, because the criteria are too
subjective).
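Test-retest and internal consistency are both usually checked with a correlation coefficient. The sketch below is a minimal illustration with invented scores. It uses a hand-written Pearson correlation, and for the split-half check it splits the items into odd and even positions rather than randomly, which is a simplifying assumption.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Test-retest: the same five (hypothetical) participants measured twice.
first_sitting = [12, 18, 25, 30, 22]
second_sitting = [13, 17, 26, 29, 23]
test_retest_r = pearson(first_sitting, second_sitting)

# Split-half: each row is one respondent's answers to four items.
responses = [
    [4, 5, 4, 5],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 1],
]
odd_half = [row[0] + row[2] for row in responses]   # items 1 and 3
even_half = [row[1] + row[3] for row in responses]  # items 2 and 4
split_half_r = pearson(odd_half, even_half)

print(f"test-retest r = {test_retest_r:.2f}")
print(f"split-half r  = {split_half_r:.2f}")
```

A correlation close to 1 suggests high reliability of the corresponding type. What counts as high enough depends on the field and is not fixed by this sketch.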
What is validity?
Validity refers to how accurately a method measures what it is intended to measure. If research has high
validity that means it produces results that correspond to real properties, characteristics, and variations in the
physical or social world.
High reliability is one indicator that a measurement is valid. If a method is not reliable, it probably isn’t
valid.
Ex- If the thermometer shows different temperatures each time, even though you have carefully controlled
conditions to ensure the sample’s temperature stays the same, the thermometer is probably malfunctioning,
and therefore its measurements are not valid.
For example, using a single variable such as the behavior of employees to measure consumer satisfaction in a
big shopping mall raises a validity issue. The behavior of employees is not the only determinant of consumer
satisfaction; other factors such as pricing policies, discount policy and parking facilities may also be
responsible for generating consumer satisfaction. Hence, a tool designed to measure consumer satisfaction from
“employee’s behavior” alone may not be a valid measurement tool. Researchers are always concerned about the
validity of their measuring instruments. Validity is discussed in the context of two terms, viz. internal and
external validity. External validity refers to the generalizability of research findings to the external
environment, such as other populations and variables. In other words, the external validity of research findings
is the data’s ability to be generalized across the universe. Internal validity, on the other hand, is the ability
of a research instrument to measure what it is purported (supposed) to measure.
Types of validity
1. Construct: The adherence of a measure to existing theory and knowledge of the concept being
measured. Ex- A self-esteem questionnaire could be assessed by measuring other traits known or
assumed to be related to the concept of self-esteem (such as social skills and optimism). Strong
correlation between the scores for self-esteem and associated traits would indicate high construct
validity.
2. Content: This category looks at whether the instrument adequately covers all the content that it
should with respect to the variable. In other words, does the instrument cover the entire domain
related to the variable, or construct, it was designed to measure? Ex- A test that aims to measure a
class of students’ level of Spanish contains reading, writing and speaking components, but no
listening component. Experts agree that listening comprehension is an essential aspect of language
ability, so the test lacks content validity for measuring the overall level of ability in Spanish.
3. Criterion: Refers to how well the measurement of one variable can predict the response of another
variable. Ex- A job applicant takes a performance test during the interview process. If this test
accurately predicts how well the employee will perform on the job, the test is said to have criterion
validity.
Test of Sound Measurement/Characteristics of Good Measurement/Goodness of Measures

Any measurement tool should have the ability to measure a particular variable accurately and it must
measure what it is supposed to measure. A good instrument would enhance the quality of research results.
Hence it becomes necessary that we assess the ‘goodness’ of the measures developed. Any instrument that
meets the tests of reliability, validity and practicality is said to possess the ‘goodness’ of measures. These
three are the tests of sound measurement.
Levels of measurement – Nominal, Ordinal, Interval, Ratio.
There are different levels of measurement in statistics and data measured using them can be broadly
classified into qualitative and quantitative data.
Nominal scale is a naming scale, where variables are simply “named” or labeled, with no specific order.
Ordinal scale has all its variables in a specific order, beyond just naming them. Interval scale offers labels,
order and a specific, equal interval between each of its variable options. Ratio scale bears all the
characteristics of an interval scale and, in addition, has a true zero point, so that a value of “zero” means
the absence of the quantity being measured.
Nominal Scale
A nominal scale is the 1st level of measurement. This scale is used to label variables that have no
quantitative values.
Characteristics of Nominal Scale
• A nominal scale variable is classified into two or more categories.
• It is qualitative. The numbers are used here to identify the objects.
• The numbers do not define the object characteristics. The only permissible aspect of numbers in the
nominal scale is “counting.”
Example:
An example of a nominal scale measurement is given below:

What is your gender?
M- Male
F- Female
Here, the variables are used as tags, and the answer to this question should be either M or F.
• Eye color: Blue, green, brown
• Hair color: Blonde, black, brown, grey, other
• Blood type: O-, O+, A-, A+, B-, B+, AB-, AB+
• Political Preference: Republican, Democrat, Independent
• Place you live: City, suburbs, rural
Merits of Nominal Scale
a) Nominal scale provides convenient ways of keeping track of people, objects & events.
b) Nominal scale describes differences between things by assigning them to categories.
c) Nominal data can be counted.
Demerits of Nominal Scale
a) Nominal scale is the least powerful level of measurement because it has no quantitative value.
b) Nominal scale indicates no order or distance relationship.
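Since counting is the only permissible operation on nominal data, the usual summary is a frequency table and the mode. The following is a minimal sketch with invented blood-type responses.

```python
from collections import Counter

# Hypothetical responses to a "Blood type" question (nominal data).
blood_types = ["O+", "A+", "B+", "O+", "AB-", "A+", "O+"]

counts = Counter(blood_types)                        # frequency of each category
mode_category, mode_count = counts.most_common(1)[0]  # most frequent category

for category, count in counts.items():
    share = 100 * count / len(blood_types)
    print(f"{category}: {count} ({share:.0f}%)")
print("Mode:", mode_category)
```

Averaging the categories would be meaningless here, because the labels carry no order or distance.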
Ordinal Scale
The ordinal scale is the 2nd level of measurement that reports the ordering and ranking of data without
establishing the degree of variation between them. Ordinal represents the “order.” Ordinal data is known as
qualitative data or categorical data. It can be grouped, named, and ranked.
Characteristics of the Ordinal Scale
• The ordinal scale shows the relative ranking of the variables
• It identifies and describes the magnitude of a variable
• Along with the information provided by the nominal scale, ordinal scales give the rankings of those
variables
• The interval properties are not known
• The surveyors can quickly analyse the degree of agreement concerning the identified order of
variables
Example:
• Ranking of school students – 1st, 2nd, 3rd, etc.
• Ratings in restaurants
• Evaluating the frequency of occurrences
  o Very often
  o Often
  o Not often
  o Not at all
• Assessing the degree of agreement
  o Totally agree
  o Agree
  o Neutral
  o Disagree
  o Totally disagree
Merits of Ordinal Scale
a) The ordinal scale implies a statement of ‘greater than’ or ‘less than’ without being able to state how
much greater or less.
b) Ordinal scale permits the ranking of items from highest to lowest.
Demerits of Ordinal Scale
a) Ordinal scales have no absolute values.
b) The real differences between adjacent ranks may not be equal.
c) Precise comparisons cannot be made with the help of this scale.
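Because the gaps between adjacent ranks may not be equal, ordinal responses are better summarised by the median than by the mean. The sketch below assumes the five agreement labels listed above and a handful of invented responses.

```python
from statistics import median_low

# Agreement categories in ascending order (labels taken from the example above).
ORDER = ["Totally disagree", "Disagree", "Neutral", "Agree", "Totally agree"]
RANK = {label: position for position, label in enumerate(ORDER, start=1)}

# Hypothetical responses from five respondents.
responses = ["Agree", "Neutral", "Totally agree", "Agree", "Disagree"]

ranks = sorted(RANK[r] for r in responses)
# median_low always returns a value that actually occurs in the data,
# so it maps back to a real category even with an even number of responses.
median_label = ORDER[median_low(ranks) - 1]

print("Median response:", median_label)
```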
Interval Scale
The interval scale is the 3rd level of measurement scale. It is defined as a quantitative measurement scale in
which the difference between two values is meaningful. In other words, the variables are measured in
an exact manner rather than in a relative way, but the zero point of the scale is arbitrary.
Characteristics of Interval Scale:
• The interval scale is quantitative as it can quantify the difference between the values
• It allows calculating the mean and median of the variables
• To understand the difference between the variables, you can subtract the values between the variables
• The interval scale is the preferred scale in Statistics as it helps to assign numerical values to
arbitrary assessments such as feelings, calendar types, etc.
• All the techniques applicable to nominal and ordinal data analysis are applicable to Interval Data as
well.
• Net Promoter Score, Likert Scale, Bipolar Matrix Table are some of the most effective types of
interval scale.
Example:
This scale has all the characteristics of the ordinal scale; in addition, it has the property of equality of
interval, i.e., the distance between I and II is the same as the distance between II and III, so one can
interpret not only the order of scale scores but also the distance between them.
Time elapsed between 1:00 pm and 3:00 pm is the same as time elapsed between 8:00 pm and 10:00 pm.
The Fahrenheit scale is also an example of an interval scale. One can say that an increase in temperature from
30° to 40° involves the same increase in temperature as an increase from 60° to 70°. Suppose the temperatures
of four cities are: Shimla 10°C, Delhi 20°C, Bangalore 22°C and Jaipur 37°C. The difference in temperature
between Delhi and Shimla (10°C) can be compared directly with the difference between Jaipur and Bangalore
(15°C), because differences on an interval scale are meaningful.
However, we cannot say that Delhi is two times warmer than Shimla. This is because an interval scale does not
have an arithmetic origin; it has an arbitrary origin, i.e., 0°C does not mean there is no temperature.
These variables have no “true zero” value. For example, it’s impossible to have a credit score of zero. It’s
also impossible to have an SAT score of zero. And for temperatures, it’s possible to have negative values
(e.g. -10° F) which means there isn’t a true zero value that values can’t go below.
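One way to see why ratios are not meaningful on an interval scale is to change the unit. The sketch below uses the four city temperatures from the example and the standard Celsius-to-Fahrenheit formula. The ratio of two temperatures changes with the unit, while the ratio of two differences does not.

```python
def c_to_f(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

shimla, delhi, bangalore, jaipur = 10.0, 20.0, 22.0, 37.0  # degrees Celsius

# Ratio of two temperatures: depends on the unit, so it carries no meaning.
ratio_c = delhi / shimla
ratio_f = c_to_f(delhi) / c_to_f(shimla)

# Ratio of two differences: the same in both units, so differences are meaningful.
diff_ratio_c = (jaipur - bangalore) / (delhi - shimla)
diff_ratio_f = (c_to_f(jaipur) - c_to_f(bangalore)) / (c_to_f(delhi) - c_to_f(shimla))

print(f"Delhi/Shimla in Celsius:    {ratio_c:.2f}")
print(f"Delhi/Shimla in Fahrenheit: {ratio_f:.2f}")
print(f"Ratio of differences:       {diff_ratio_c:.2f} vs {diff_ratio_f:.2f}")
```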
Merits of Interval Scale
a) Interval scales are a more powerful form of measurement than ordinal scales because they incorporate the
concept of equality of interval.
Demerits of Interval Scale
a) The primary limitation of the interval scale is the lack of a true zero.
Ratio Scale
The ratio scale is the 4th level of measurement and is quantitative. It allows researchers to compare
differences or intervals and, unlike the interval scale, it possesses a true origin or zero point.
Characteristics of Ratio Scale:
• Ratio scale has a feature of absolute zero
• It doesn’t have negative numbers, because of its zero-point feature
• It affords unique opportunities for statistical analysis. The variables can be added, subtracted,
multiplied, and divided. Mean, median, and mode can be calculated using the ratio scale.
• Ratio scale has unique and useful properties. One such feature is that it allows unit conversions, such
as kilograms to grams or kilocalories to calories.
Example:
• Height: Can be measured in centimeters, inches, feet, etc. and cannot have a value below zero.
• Weight: Can be measured in kilograms, pounds, etc. and cannot have a value below zero.
• Length: Can be measured in centimeters, inches, feet, etc. and cannot have a value below zero.
These variables have a “true zero” value. For example, length, weight, and height all have a minimum value
(zero) that they cannot go below. It’s not possible for ratio variables to take on negative values. For this
reason, the ratio between values can be calculated. For example, someone who weighs 200 lbs. can be said to
weigh two times as much as someone who weighs 100 lbs. Likewise, someone who is 6 feet tall is 1.5 times as
tall as someone who is 4 feet tall.
Merits of Ratio Scale
a) Ratio scale facilitates a kind of comparison between variables.
b) Ratio scale is the most precise type of scale.
c) All statistical techniques are usable with ratio scale.
d) Multiplication and division can be used with this scale.
Demerits of Ratio Scale
a) Ratio scales can rarely be used in the behavioral sciences, because attributes such as attitudes and
opinions have no true zero.
Classification of Scales based on the types of data

Type of Data | Type of Scale | Rule for Classification | Statistical Processes
Nominal | Dichotomous | Objects identical or different | Totals, Percentages, Mode, Chi-square
Ordinal | Rank order, Comparative | Objects greater or smaller | Percentile, Median, Rank correlation, ANOVA
Interval | Likert, Thurstone, Stapel, Semantic differential | Differences between adjacent ratings are equal | Mean, Standard Deviation, Correlation coefficients, t-Test, F-test, Factor analysis
Ratio | Special scales | Fixed point of origin or zero | All statistical operations
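The four levels are cumulative: each level keeps the statistics of the levels below it and adds its own. The sketch below illustrates this with one representative statistic per level. The choice of representative and the function name `permissible` are assumptions made for illustration, not a complete list.

```python
# Levels of measurement in ascending order of power.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# One representative statistic that becomes available at each level
# (a simplification chosen for this sketch).
NEW_AT_LEVEL = {
    "nominal": "mode",
    "ordinal": "median",
    "interval": "mean",
    "ratio": "ratio of two values",
}

def permissible(level):
    """Return the statistics usable at `level`, including those of lower levels."""
    position = LEVELS.index(level)
    return [NEW_AT_LEVEL[name] for name in LEVELS[: position + 1]]

for level in LEVELS:
    print(f"{level:8s}: {', '.join(permissible(level))}")
```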
Attitude Scaling Techniques:

Marketers are interested in measuring consumers’ attitudes toward their products. An attitude scale involves
a series of phrases, adjectives, or sentences about the attitude object. If a marketer, for example, is measuring
people’s attitudes toward video compact disc players, respondents may be asked to state the degree to which
they agree or disagree with statements such as “Video compact disc players are complicated to
handle or operate.” There is a wide variety of methods available for measuring consumers’ attitudes.
Concept of Scale – Scaling is a technique used for measuring qualitative responses of respondents such as
those related to their feelings, perception, likes, dislikes, interests and preferences.
Rating Scales
A rating scale is a popular closed-ended question type where you can assign different weights to each answer
option. Survey takers are typically asked to choose from multiple options scaled between two extremes such
as Unsatisfied to Satisfied. The rating scale can help you quantify subjective sentiments such as satisfaction,
experience, perception, loyalty, etc.
Some common themes in rating scale questions are:
• A person’s satisfaction level with something
• Their likelihood of recommending a product/service
• How much they agree with a statement
• How easy they find doing something
Types of Rating Scale

Rating scales can be divided into two categories: Ordinal and Interval Scales.
There are five primary types of rating scales which can be suitably used:
• Graphic Rating Scale
• Slider Rating Scale
• Frequency Rating Scale
• Comparative Rating Scale
• Descriptive Rating Scale
1. Graphic Rating Scale
In these types of rating scale survey questions, the survey participants are required to respond to
graphics/images instead of numbers. For example, you must have seen star ratings (1 to 5) given by existing
customers while shopping online. The same can be seen in movie review platforms such as IMDB, where
you can give star ratings for a movie.
The facial expression/smiley face is another popular example that is used to measure a person’s satisfaction
or discomfort. Such pictorial or graphical scales are helpful, especially when you have to take feedback from
people who are not fluent in your language.
2. Slider Rating Scale
The slider scale allows people to respond by dragging a slider to an answer option that they find the most
appropriate. Such questions save your respondents’ time as they are not required to enter any text or number.
3. Frequency Scale
This type of rating scale question can help you understand how frequently a respondent performs a particular
behavior. This data can prove to be quite helpful for marketing experts who wish to understand customer
interactions, touchpoints, and product developers who wish to decode product usage patterns. Depending on
the nature of your study, you can provide specific answer options such as “every day”, “once a week”, or go
for more general options such as “sometimes”, “rarely”, etc. Overall, this is a great question to understand
consumer behavior towards your product or service.
4. Comparative Scale
This type of rating scale question allows respondents to compare between options and then select the one that
best meets the criteria of the question. For example, you can ask your customers, “Which feature do you find
the most useful in our product?” The customers can compare between two options and go for the feature that
they find the most beneficial.
5. Descriptive rating scale
In certain surveys or research, a numeric scale may not be of much help. A descriptive rating scale explains
each option to the respondent. It contains a thorough explanation of each option for the purpose of gathering
information with deep insights.
Likert Scales
Semantic Differential Scales
Constant Sum Scales
A constant sum scale is a type of question used in a market research survey in which respondents are required
to divide a specific number of points or percentages among several categories so that they add up to a fixed
total. The points allocated to each category show its relative weight.
For example, you may want to ask respondents to allocate 100 points among four different package designs
in a way that reflects their likelihood to purchase.
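The defining rule of a constant sum question is that the points must add up to the fixed total. A minimal sketch of that check follows. The function name `constant_sum_shares` and the design labels are hypothetical.

```python
def constant_sum_shares(allocation, total=100):
    """Validate one respondent's allocation and return each option's share.

    `allocation` maps an option name to the points the respondent gave it.
    Raises ValueError if any value is negative or the points do not add up
    to `total`.
    """
    if any(points < 0 for points in allocation.values()):
        raise ValueError("points cannot be negative")
    if sum(allocation.values()) != total:
        raise ValueError(f"points must add up to {total}")
    return {option: points / total for option, points in allocation.items()}

# Hypothetical answer: 100 points divided among four package designs.
answer = {"Design A": 40, "Design B": 30, "Design C": 20, "Design D": 10}
shares = constant_sum_shares(answer)
print(shares)
```

Answers that fail the check would normally be sent back to the respondent or excluded. Which of the two applies is a survey design decision not covered here.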
Ranking Scales
A ranking question is a type of survey question that asks respondents to compare a list of items with each
other and arrange them in order of preference. It is used by market researchers to understand the order of
importance of several items.
A ranking scale is a closed-ended scale that allows respondents to evaluate multiple row items in relation to
one column item or a question in a ranking survey and then rank the row items. It is the scale used by market
researchers to ask ranking questions.
On a ranking scale, the question may be in terms of product features, needs, wants, etc. It can be used for
both online and offline surveys.
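Rankings from several respondents are often combined by averaging the rank each item received. The sketch below uses invented rankings of three product features. Averaging ranks treats them as if they were equally spaced, which is a common simplification rather than a strict property of ordinal data.

```python
def mean_ranks(rankings):
    """Average the rank given to each item across respondents (1 = most preferred)."""
    items = rankings[0].keys()
    return {item: sum(r[item] for r in rankings) / len(rankings) for item in items}

# Hypothetical rankings of three product features by three respondents.
rankings = [
    {"Price": 1, "Quality": 2, "Design": 3},
    {"Price": 2, "Quality": 1, "Design": 3},
    {"Price": 1, "Quality": 3, "Design": 2},
]

averages = mean_ranks(rankings)
overall_order = sorted(averages, key=averages.get)  # lowest mean rank first

print(averages)
print("Overall order of preference:", overall_order)
```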
Paired comparison & Forced Ranking
Paired comparisons (i.e. comparing two things with each other at a time) are made by each respondent among a
set of items, using a binary scale that indicates which of the two choices is preferred.
This is particularly helpful when priorities are not clear enough, when alternatives are completely different from one
another, or when there is little objective data to base a decision on. It is useful in a wide range of applications, from
selecting the concept design for a new product before it goes into production, to deciding the skills and qualifications
required when hiring people for a new position.
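With n items, a respondent makes n(n-1)/2 comparisons, and counting how often each item is chosen gives a rank order. The sketch below illustrates this for four hypothetical concept designs and one respondent's invented choices.

```python
from collections import Counter
from itertools import combinations

items = ["A", "B", "C", "D"]           # hypothetical concept designs
pairs = list(combinations(items, 2))   # every pair once: n(n-1)/2 = 6 pairs

# One respondent's invented choice for each pair (the preferred item).
choices = {
    ("A", "B"): "A",
    ("A", "C"): "A",
    ("A", "D"): "D",
    ("B", "C"): "C",
    ("B", "D"): "D",
    ("C", "D"): "D",
}

wins = Counter({item: 0 for item in items})
for pair in pairs:
    wins[choices[pair]] += 1

# Items ordered by number of times preferred, highest first.
ranking = sorted(items, key=lambda item: wins[item], reverse=True)

print("Number of pairs:", len(pairs))
print("Wins:", dict(wins))
print("Ranking:", ranking)
```

The number of pairs grows quickly with the number of items (10 items already need 45 comparisons), so the method is most practical for short lists.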
Forced Ranking Method
Forced ranking is a controversial management tool which measures, ranks and grades employees' work
performance based on their comparison with each other instead of against fixed standards.
Forced ranking process

In the forced ranking process, employees are divided into three groups: A, B, or C.
- A group stands for the employees who are most engaged, motivated, passionate, open to collaboration and
committed. They make up the top 20%.
- B group stands for employees who are not as engaged or motivated but are crucial to the company’s success
because they are so abundant. They make up the middle 70%.
- C group stands for employees who are commonly non-producing procrastinators. They make up the bottom
10%.
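The 20/70/10 split can be sketched as a simple sort followed by cut-offs. In the example below the employee names and performance scores are invented, the function name `forced_ranking` is hypothetical, and rounding the group sizes to whole employees is an assumption, since the text does not say how fractions are handled.

```python
def forced_ranking(scores, top=0.20, bottom=0.10):
    """Assign each employee to group A, B or C by relative performance.

    `scores` maps employee name to a performance score (higher is better).
    The top `top` fraction becomes A, the bottom `bottom` fraction becomes C,
    and everyone in between becomes B.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    n_top = round(n * top)        # assumed rounding rule
    n_bottom = round(n * bottom)  # assumed rounding rule
    groups = {}
    for position, name in enumerate(ranked):
        if position < n_top:
            groups[name] = "A"
        elif position >= n - n_bottom:
            groups[name] = "C"
        else:
            groups[name] = "B"
    return groups

# Ten hypothetical employees with scores 100, 90, ..., 10.
scores = {f"E{i}": 110 - 10 * i for i in range(1, 11)}
groups = forced_ranking(scores)
print(groups)
```

Note that the groups depend only on the order of the scores, not on their absolute level, which is exactly the comparison "with each other instead of against fixed standards".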
Advantage
It focuses on making relative comparisons between a company’s best and worst employees using subjective
criteria. Overall forced ranking offers a chance for increased productivity, profitability and shareholder
value.
