3business Research UNIT 3
Concept of Measurement
In our daily life we are said to measure when we use some yardstick to determine weight, height, or some
other feature of a physical object. We also measure when we judge how well we like a song, a painting or the
personalities of our friends. We, thus, measure physical objects as well as abstract concepts. Measurement is
a relatively complex and demanding task, especially so when it concerns qualitative or abstract phenomena.
By measurement we mean the process of assigning numbers to objects or observations, the level of
measurement being a function of the rules under which the numbers are assigned.
In measuring, we devise some form of scale in the range and then map the properties of objects from the
domain onto this scale.
Need of Measurement
Measurement is used in everyday activity, in research work and in industry. It is how one knows distance, time,
and the height and width of any geometrical shape; how one knows one's size when buying clothes; and how one
differentiates between cm, inch, foot, meter, mile and km. How could we study the universe
without measurement? Measurement is therefore an important part of human life.
In education, measurement is largely needed for the analysis of data from educational assessments or tests.
It is also needed for diagnosing the weak areas of students' learning.
Measurement is needed for the external assessment of students and relates to the cognitive areas of
man’s achievement.
Problems in measurement in management research
Respondent: At times the respondent may be reluctant to express strong negative feelings or it is just
possible that he may have very little knowledge but may not admit his ignorance. All this reluctance is
likely to result in an interview of ‘guesses.’ Transient factors like fatigue, boredom, anxiety, etc. may
limit the ability of the respondent to respond accurately and fully.
Situation: Situational factors may also come in the way of correct measurement. Any condition which
places a strain on the interview can have serious effects on the interviewer-respondent rapport. For instance,
if someone else is present, he can distort responses by joining in or merely by being present. If the
respondent feels that anonymity is not assured, he may be reluctant to express certain feelings.
Measurer: The interviewer can distort responses by rewording or reordering questions. His behaviour,
style and looks may encourage or discourage certain replies from respondents. Careless mechanical
processing may distort the findings. Errors may also creep in because of incorrect coding, faulty
tabulation and/or statistical calculations, particularly in the data-analysis stage.
Instrument: Error may arise because of a defective measuring instrument. The use of complex words,
beyond the comprehension of the respondent, ambiguous meanings, poor printing, inadequate space for
replies, response choice omissions, etc. are a few things that make the measuring instrument defective
and may result in measurement errors. Another type of instrument deficiency is the poor sampling of the
universe of items of concern.
The researcher must know that correct measurement depends on successfully dealing with all of the problems listed
above. He must, to the extent possible, try to eliminate, neutralize or otherwise deal with all the possible
sources of error so that the final results are not contaminated.
Reliability is about the consistency of a measure, and validity is about the accuracy of a measure.
What is reliability?
Reliability refers to how consistently a method measures something. If the same result can be consistently
achieved by using the same methods under the same circumstances, the measurement is considered reliable.
Ex- You measure the temperature of a liquid sample several times under identical conditions. The
thermometer displays the same temperature every time, so the results are reliable.
Reliability is a necessary contributor to validity but is not a sufficient condition for validity, e.g., if a
weighing scale consistently measures correct weight, then it is both reliable and valid. However, if it
consistently overweighs by three kgs, then the instrument is reliable (as it is giving the same result again and
again) but not valid since it is overweighing by three kgs. So, if a measurement is not valid, it hardly matters
if it is reliable – because it does not measure what the designer needs to measure in order to solve the
research problem.
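The weighing-scale example can be sketched in a few lines of Python. This is only an illustration: the readings, the assumed true weight of 70 kg and the 0.5 kg tolerances are hypothetical values chosen for the sketch, and the helper name `describe` is not from any library.

```python
from statistics import mean, pstdev

TRUE_WEIGHT_KG = 70.0  # assumed true weight of the object (hypothetical)

# Hypothetical repeated readings from three weighing scales
readings = {
    "accurate scale": [70.0, 70.1, 69.9, 70.0],
    "overweighs by 3 kg": [73.0, 73.1, 72.9, 73.0],
    "erratic scale": [66.0, 74.5, 69.0, 71.5],
}

def describe(values, true_value, max_spread=0.5, max_bias=0.5):
    """Treat a small spread as 'reliable' and a small bias as 'valid'.

    The tolerances are arbitrary illustration values, not standards.
    """
    spread = pstdev(values)            # consistency of repeated readings
    bias = mean(values) - true_value   # systematic error
    reliable = spread <= max_spread
    valid = reliable and abs(bias) <= max_bias  # validity requires reliability
    return {"spread": round(spread, 2), "bias": round(bias, 2),
            "reliable": reliable, "valid": valid}

summary = {name: describe(vals, TRUE_WEIGHT_KG) for name, vals in readings.items()}
for name, result in summary.items():
    print(name, result)
```

The second scale comes out reliable but not valid, which is the case described above: consistent readings that are consistently wrong.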
Types of reliability
1. Test-retest: The consistency of a measure across time: do you get the same results when you repeat
the measurement? Ex- A group of participants complete a questionnaire designed to measure
personality traits. If they repeat the questionnaire days, weeks, or months apart and give the same
answers, this indicates high test-retest reliability.
2. Internal consistency: The consistency of the measurement itself: do you get the same results from
different parts of a test that are designed to measure the same thing? Ex- You design a questionnaire
to measure self-esteem. If you randomly split the results into two halves, there should be a strong
correlation between the two sets of results. If the two results are very different, this indicates low
internal consistency.
3. Interrater: The consistency of a measure across raters or observers: do you get the same results
when different people conduct the same measurement? Ex- Based on an assessment criteria checklist,
five examiners submit substantially different results for the same student project. This indicates that
the assessment checklist has low inter-rater reliability (for example, because the criteria are too
subjective).
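Test-retest and internal consistency are commonly summarised with a correlation coefficient. The sketch below uses made-up scores and a hand-written Pearson correlation. For simplicity it splits the items into odd and even positions rather than randomly, which is an assumption of this sketch and only one of several possible splits.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Test-retest: hypothetical total scores of five participants on two occasions
first_sitting = [12, 18, 25, 30, 22]
second_sitting = [13, 17, 26, 29, 23]
test_retest_r = pearson(first_sitting, second_sitting)

# Internal consistency (split-half): hypothetical answers to six items,
# one row per participant, each item scored 1 to 5
item_scores = [
    [4, 5, 4, 5, 4, 4],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [5, 4, 5, 5, 5, 4],
    [1, 2, 1, 1, 2, 2],
]
odd_half = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5
even_half = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6
split_half_r = pearson(odd_half, even_half)

print(f"test-retest r = {test_retest_r:.2f}")
print(f"split-half r  = {split_half_r:.2f}")
```

A coefficient close to 1 indicates that the two sets of results agree; a value near 0 would indicate low reliability of that type.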
What is validity?
Validity refers to how accurately a method measures what it is intended to measure. If research has high
validity that means it produces results that correspond to real properties, characteristics, and variations in the
physical or social world.
High reliability is one indicator that a measurement is valid. If a method is not reliable, it probably isn’t
valid.
Ex- If the thermometer shows different temperatures each time, even though you have carefully controlled
conditions to ensure the sample’s temperature stays the same, the thermometer is probably malfunctioning,
and therefore its measurements are not valid.
For example, using a variable such as the behavior of employees to measure consumer satisfaction in a big shopping
mall raises a validity issue. Employee behavior is not the only determinant of consumer satisfaction; various
other factors such as pricing policies, discount policy, parking facility, and others may also be responsible for
generating consumer satisfaction. Hence, a tool designed to measure consumer satisfaction from
“employee’s behavior” alone may not be a valid measurement tool. Researchers are always concerned about the
validity of their measuring instrument. Validity is discussed in terms of two concepts, viz., internal and external
validity. External validity refers to the generalizability of research findings to the external environment, such as
other populations, variables, etc. In other words, the external validity of research findings is the data’s ability
to be generalized across the universe. Internal validity, on the other hand, is the ability of a research instrument
to measure what it is purported (supposed) to measure.
Types of validity
Content: This category looks at whether the instrument adequately covers all the content that it should with
respect to the variable. In other words, does the instrument cover the entire domain related to the variable, or
construct, it was designed to measure? Ex- A test that aims to measure a class of students’ level of Spanish
contains reading, writing and speaking components, but no listening component. Experts agree that listening
comprehension is an essential aspect of language ability, so the test lacks content validity for measuring the
overall level of ability in Spanish.
Criterion: Refers to how well the measurement of one variable can predict the response of another variable.
Ex- A job applicant takes a performance test during the interview process. If this test accurately predicts how
well the employee will perform on the job, the test is said to have criterion validity.
There are different levels of measurement in statistics, and data measured using them can be broadly
classified into qualitative and quantitative data.
A nominal scale is a naming scale, where variables are simply “named” or labeled, with no specific order.
An ordinal scale has all its variables in a specific order, beyond just naming them. An interval scale offers
labels, order, and a specific interval between each of its variable options. A ratio scale bears all the
characteristics of an interval scale and, in addition, has a true “zero” value on its variables.
Nominal Scale
A nominal scale is the 1st level of measurement. This scale is used to label variables that have no quantitative
values.
Ex- What is your gender?
M- Male
F- Female
Here, the variables are used as tags, and the answer to this question should be either M or F.
Eye color: Blue, green, brown
Hair color: Blonde, black, brown, grey, other
Blood type: O-, O+, A-, A+, B-, B+, AB-, AB+
Political Preference: Republican, Democrat, Independent
Place you live: City, suburbs, rural
Merits of Nominal Scale
a) Nominal scale provides convenient ways of keeping track of people, objects & events.
b) Nominal scale describes differences between things by assigning them to categories.
c) Nominal scale data can be counted, e.g., the number of cases falling in each category.
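Because nominal categories have no order, the natural summaries are counts and the most frequent category. A minimal sketch, using a hypothetical list of blood-type responses:

```python
from collections import Counter

# Hypothetical nominal responses (blood types of eight respondents)
blood_types = ["O+", "A+", "B+", "O+", "AB-", "A+", "O+", "O-"]

counts = Counter(blood_types)                        # frequency of each category
mode_category, mode_count = counts.most_common(1)[0]  # most frequent category

print(dict(counts))
print("Most frequent category:", mode_category, "with", mode_count, "cases")
```

Arithmetic such as averaging the labels would be meaningless here; only counting is.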
Ordinal Scale
The ordinal scale is the 2nd level of measurement that reports the ordering and ranking of data without
establishing the degree of variation between them. Ordinal represents the “order.” Ordinal data is known as
qualitative data or categorical data. It can be grouped, named, and ranked.
o Very often
o Often
o Not often
o Not at all
Assessing the degree of agreement
o Totally agree
o Agree
o Neutral
o Disagree
o Totally disagree
Merits of Ordinal Scale
a) The ordinal scale implies a statement of ‘greater than’ or ‘less than’ without being able to state how
much greater or less.
b) Ordinal scale permits the ranking of items from highest to lowest.
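Since ordinal data can be ranked but the gaps between ranks are unknown, a rank-based summary such as the median is appropriate. The sketch below uses the agreement scale shown above with hypothetical responses. It assumes an odd number of responses so that the median is a single category.

```python
# Categories of the agreement scale, from lowest to highest
LEVELS = ["Totally disagree", "Disagree", "Neutral", "Agree", "Totally agree"]
rank = {label: position for position, label in enumerate(LEVELS, start=1)}

# Hypothetical responses from seven respondents (odd count assumed)
responses = ["Agree", "Neutral", "Totally agree", "Agree",
             "Disagree", "Agree", "Neutral"]

ranked = sorted(responses, key=lambda label: rank[label])  # order by rank
median_response = ranked[len(ranked) // 2]                 # middle response

print("Ordered responses:", ranked)
print("Median response:", median_response)
```

The ranks 1 to 5 are used only for ordering; the sketch does not treat the distance between "Agree" and "Totally agree" as known.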
Interval Scale
The interval scale is the 3rd level of measurement scale. It is defined as a quantitative measurement scale in
which the difference between two values is meaningful. In other words, the variables are measured in
an exact manner, not in a relative way, although the position of zero is arbitrary.
Characteristics of Interval Scale:
The interval scale is quantitative as it can quantify the difference between the values
It allows calculating the mean and median of the variables
To understand the difference between the variables, you can subtract the values between the variables
The interval scale is the preferred scale in Statistics as it helps to assign any numerical values to
arbitrary assessment such as feelings, calendar types, etc.
All the techniques applicable to nominal and ordinal data analysis are applicable to Interval Data as
well.
Net Promoter Score, Likert Scale, Bipolar Matrix Table are some of the most effective types of
interval scale.
Example:
This scale has all characteristics of the ordinal scale; in addition it has the property of equality of interval i.e.,
the distance between I and II is the same as the distance between II and III, so one can interpret not only the
order of scale scores but also the distance between them.
Time elapsed between 1:00 pm and 3:00 pm is the same as time elapsed between 8:00 pm and 10:00 pm.
The Fahrenheit scale is also an example of an interval scale. One can say that an increase in temperature from
30° to 40° involves the same increase in temperature as an increase from 60° to 70°. Suppose the temperature of
four cities is: Shimla 10°C, Delhi 20°C, Bangalore 22°C and Jaipur 37°C. It can be said that Delhi is 10°C
warmer than Shimla and Jaipur is 15°C warmer than Bangalore, and these differences can be compared with each other.
However, we cannot say that Delhi is two times warmer than Shimla. This is because an interval scale does not
have an arithmetic origin; it has an arbitrary origin, i.e., 0°C does not mean there is no temperature.
These variables have no “true zero” value. For example, it’s impossible to have a credit score of zero. It’s
also impossible to have an SAT score of zero. And for temperatures, it’s possible to have negative values
(e.g. -10° F) which means there isn’t a true zero value that values can’t go below.
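A short calculation shows why ratios are not meaningful on an interval scale. Taking Shimla at 10°C and Delhi at 20°C, the sketch below converts both to Fahrenheit using the standard formula F = C × 9/5 + 32 and compares ratios and differences. The function name `c_to_f` is just a label for this sketch.

```python
def c_to_f(celsius):
    """Convert a temperature from Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32

shimla_c, delhi_c = 10.0, 20.0
shimla_f, delhi_f = c_to_f(shimla_c), c_to_f(delhi_c)  # 50.0 and 68.0

ratio_c = delhi_c / shimla_c   # 2.0 in Celsius
ratio_f = delhi_f / shimla_f   # 1.36 in Fahrenheit: the "ratio" changed

diff_c = delhi_c - shimla_c    # 10 Celsius degrees
diff_f = delhi_f - shimla_f    # 18 Fahrenheit degrees = 10 * 9/5

print("Ratio in Celsius:", ratio_c)
print("Ratio in Fahrenheit:", round(ratio_f, 2))
print("Difference:", diff_c, "C corresponds to", diff_f, "F")
```

The ratio depends on where the arbitrary zero sits, so "twice as warm" has no fixed meaning. The difference, by contrast, converts consistently between the two units.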
Ratio Scale
The ratio scale is the 4th level of measurement scale and is quantitative. It allows researchers to compare
differences or intervals and, unlike the interval scale, it has a unique feature: a true origin or zero
point.
Example:
Height: Can be measured in centimeters, inches, feet, etc. and cannot have a value below zero.
Weight: Can be measured in kilograms, pounds, etc. and cannot have a value below zero.
Length: Can be measured in centimeters, inches, feet, etc. and cannot have a value below zero.
These variables have a “true zero” value. For example, length, weight, and height all have a minimum value
(zero) below which they cannot go. It’s not possible for ratio variables to take on negative values. For this
reason, the ratio between values can be calculated. For example, someone who weighs 200 lbs. can be said to weigh
two times as much as someone who weighs 100 lbs. Likewise, someone who is 6 feet tall is 1.5 times as tall
as someone who is 4 feet tall.
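On a ratio scale the ratio between two values does not depend on the unit used. A minimal check with the weights above, converting pounds to kilograms (1 lb = 0.45359237 kg):

```python
LB_TO_KG = 0.45359237  # kilograms per pound

heavy_lb, light_lb = 200.0, 100.0
heavy_kg, light_kg = heavy_lb * LB_TO_KG, light_lb * LB_TO_KG

ratio_lb = heavy_lb / light_lb   # 2.0
ratio_kg = heavy_kg / light_kg   # still 2.0 (up to floating-point rounding)

print("Ratio in pounds:", ratio_lb)
print("Ratio in kilograms:", round(ratio_kg, 6))
```

Changing the unit rescales both values by the same factor and leaves zero at zero, which is why "twice as heavy" is a meaningful statement.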
Ratio: special scales; fixed point of origin or zero; all statistical operations.
Concept of Scale – Scaling is a technique used for measuring qualitative responses of respondents such as
those related to their feelings, perception, likes, dislikes, interests and preferences.
Rating Scales viz. Likert Scales
A rating scale is a popular closed-ended question type where you can assign different weights to each answer
option. Survey takers are typically asked to choose from multiple options scaled between two extremes such
as Unsatisfied to Satisfied. The rating scale can help you quantify subjective sentiments such as satisfaction,
experience, perception, loyalty, etc.
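Assigning weights to answer options can be illustrated as follows. The five labels, the weights 1 to 5 and the responses are assumptions made for this sketch; a real survey would define its own options and weights.

```python
# Hypothetical weights for a five-point satisfaction scale
WEIGHTS = {
    "Very unsatisfied": 1,
    "Unsatisfied": 2,
    "Neutral": 3,
    "Satisfied": 4,
    "Very satisfied": 5,
}

# Hypothetical answers from five survey takers
responses = ["Satisfied", "Very satisfied", "Neutral", "Satisfied", "Unsatisfied"]

scores = [WEIGHTS[answer] for answer in responses]  # map labels to weights
average_score = sum(scores) / len(scores)

print("Scores:", scores)
print("Average satisfaction score:", average_score)
```

Averaging such scores treats the gaps between options as equal, which is a modelling choice the researcher should state.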
1. Graphic Rating Scale
In this type of rating scale survey question, the survey participants are required to respond to
graphics/images instead of numbers. For example, while shopping online you must have seen star ratings (1 to 5)
given by existing customers. The same can be seen in movie review platforms such as IMDB, where
you can give star ratings for a movie.
The facial expression/smiley face is another popular example that is used to measure a person’s satisfaction
or discomfort. Such pictorial or graphical scales are helpful, especially when you have to take feedback from
people who are not fluent in your language.
2. Slider Rating Scale
The slider scale allows people to respond by dragging a slider to an answer option that they find the most
appropriate. Such questions save your respondents’ time as they are not required to enter any text or number.
3. Frequency Scale
This type of rating scale question can help you understand how frequently a respondent performs a particular
behavior. This data can prove to be quite helpful for marketing experts who wish to understand customer
interactions, touchpoints, and product developers who wish to decode product usage patterns. Depending on
the nature of your study, you can provide specific answer options such as “every day”, “once a week”, or go
for more general options such as “sometimes”, “rarely”, etc. Overall, this is a great question to understand
consumer behavior towards your product or service.
4. Comparative Scale
This type of rating scale question allows respondents to compare between options and then select the one that
best meets the criteria of the question. For example, you can ask your customers, “Which feature do you find
the most useful in our product?” The customers can compare between two options and go for the feature that
they find the most beneficial.
5. Descriptive Scale
In certain surveys or research, a numeric scale may not be of much help. A descriptive rating scale explains
each option for the respondent. It contains a thorough explanation of each option for the purpose of gathering
information with deep insights.
Likert Scales
Semantic Differential Scales
Constant Sum Scales
A constant sum scale is a type of question used in a market research survey in which respondents are required
to divide a specific number of points or percentage points among several options so that they add up to a fixed
total. The way the points are allocated shows the relative weight respondents give to each category.
For example, you may want to ask respondents to allocate 100 points among four different package designs
in a way that reflects their likelihood to purchase.
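The 100-point example can be sketched as a small validation and averaging step. The design names, the three responses and the helper `validate_allocation` are hypothetical and serve only to illustrate the rule that every allocation must add up to the fixed total.

```python
TOTAL_POINTS = 100  # the constant sum each respondent must allocate

def validate_allocation(allocation, total=TOTAL_POINTS):
    """Return the allocation if it is non-negative and sums to the total."""
    if any(points < 0 for points in allocation.values()):
        raise ValueError("points cannot be negative")
    if sum(allocation.values()) != total:
        raise ValueError(f"points must add up to {total}")
    return allocation

# Hypothetical allocations from three respondents across four package designs
responses = [
    {"Design A": 40, "Design B": 30, "Design C": 20, "Design D": 10},
    {"Design A": 25, "Design B": 25, "Design C": 40, "Design D": 10},
    {"Design A": 50, "Design B": 20, "Design C": 20, "Design D": 10},
]

valid = [validate_allocation(r) for r in responses]

# Average points per design across respondents
mean_points = {
    design: sum(r[design] for r in valid) / len(valid)
    for design in valid[0]
}

for design, points in mean_points.items():
    print(design, round(points, 1))
```

Because every respondent distributes the same total, the averages also add up to 100 and can be read as relative purchase-likelihood shares.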
Ranking Scales
A ranking question is a type of survey question that asks respondents to compare a list of items with each
other and arrange them in order of preference. It is used by market researchers to understand the order of
importance of items from multiple items.
A ranking scale is a close-ended scale that allows respondents to evaluate multiple row items in relation to
one column item or a question in a ranking survey and then rank the row items. It is the scale used by market
researchers to ask ranking questions.
On a ranking scale, the question may be in terms of product features, needs, wants, etc. It can be used for
both online and offline surveys.
Paired comparison & Forced Ranking
Paired comparisons (i.e. simultaneously comparing two things with each other) are made by each respondent among a set of
items using a binary scale that indicates which of the two choices is preferred.
This is particularly helpful when priorities are not clear enough, when alternatives are completely different from one
another, or when there is little objective data to base our decision on. It is useful in a wide range of applications, from
selecting the concept design for a new product before it goes into production, to deciding the skills and qualifications
when hiring people for a new position.
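A paired comparison over n items requires n(n-1)/2 pairs, and counting how often each item is preferred gives a simple ranking. The sketch below uses four hypothetical product concepts and one respondent's made-up choices. Ranking by number of wins is one common way to summarise such data, not the only one.

```python
from collections import Counter
from itertools import combinations

items = ["Concept A", "Concept B", "Concept C", "Concept D"]
pairs = list(combinations(items, 2))  # every unordered pair: n(n-1)/2 = 6

# Hypothetical choices of one respondent: the preferred item in each pair
preferred = {
    ("Concept A", "Concept B"): "Concept A",
    ("Concept A", "Concept C"): "Concept C",
    ("Concept A", "Concept D"): "Concept A",
    ("Concept B", "Concept C"): "Concept C",
    ("Concept B", "Concept D"): "Concept B",
    ("Concept C", "Concept D"): "Concept C",
}

wins = Counter({item: 0 for item in items})  # start every item at zero wins
for pair in pairs:
    wins[preferred[pair]] += 1               # binary outcome: one winner per pair

ranking = sorted(items, key=lambda item: wins[item], reverse=True)

print("Number of pairs:", len(pairs))
print("Wins:", dict(wins))
print("Ranking:", ranking)
```

With many items the number of pairs grows quickly (10 items already need 45 comparisons), which is a practical limit of the method.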
Forced Ranking Method
Forced ranking is a controversial management tool which measures, ranks and grades employees' work
performance based on their comparison with each other instead of against fixed standards.