Untitled
Scaling is the branch of measurement that involves the construction of an instrument that associates
qualitative constructs with quantitative metric units. Scaling evolved out of efforts in psychology and
education to measure "unmeasurable" constructs like authoritarianism and self esteem. In many ways,
scaling remains one of the most arcane and misunderstood aspects of social research measurement.
And, it attempts to do one of the most difficult of research tasks -- measure abstract concepts.
Scaling is the assignment of objects to numbers according to a rule. In scaling, the objects are text
statements, usually statements of attitude or belief. It's how we get numbers that can be meaningfully
assigned to objects -- it's a set of procedures.
Scales are generally divided into two broad categories: unidimensional and multidimensional.
Ex. Unidimensional
Intelligence (9 factors), Personality
The unidimensional scaling methods were developed in the first half of the twentieth century and are
generally named after their inventor.
Purposes of Scaling
Why do we do scaling? Why not just create text statements or questions and use response formats to
collect the answers? First, sometimes we do scaling to test a hypothesis. We might want to know
whether the construct or concept is a single dimensional or multidimensional one (more about
dimensionality later). Sometimes, we do scaling as part of exploratory research. We want to know what
dimensions underlie a set of ratings. For instance, if you create a set of questions, you can use scaling to
determine how well they "hang together" and whether they measure one concept or multiple concepts.
But probably the most common reason for doing scaling is for scoring purposes. When a participant gives
their responses to a set of items, we often would like to assign a single number that represents that
person's overall attitude or belief.
People often confuse the idea of a scale and a response scale. A response scale is the way you collect
responses from people on an instrument. You might use a dichotomous response scale like
Agree/Disagree, True/False, or Yes/No. Or, you might use an interval response scale like a 1-to-5 or
1-to-7 rating. But, if all you are doing is attaching a response scale to an object or statement, you can't
call that scaling. As you will see, scaling involves procedures that you do independent of the respondent
so that you can come up with a numerical value for the object. In true scaling research, you use a scaling
procedure to develop your instrument (scale) and you also use a response scale to collect the responses
from participants. But just assigning a 1-to-5 response scale for an item is not scaling! The differences are
illustrated in the table below.
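Scale                                        Response scale
results from a process                       is used to collect the response for an item
each item on the scale has a scale value     item is not associated with a scale value
refers to a set of items                     is used for a single item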
Comparative scales involve the direct comparison of stimulus objects. Comparative scale data must be
interpreted in relative terms and have only ordinal or rank order properties.
Paired comparison scale – a respondent is presented with two items at a time and asked to
select one (example: Do you prefer Pepsi or Coke?). This is an ordinal level technique when a
measurement model is not applied. Krus and Kennedy (1977) elaborated the paired comparison scaling
within their domain-referenced model. The Bradley–Terry–Luce (BTL) model (Bradley and Terry, 1952;
Luce, 1959) can be applied in order to derive measurements provided the data derived from paired
comparisons possess an appropriate structure. Thurstone's Law of comparative judgment can also be
applied in such contexts.
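As a rough illustration of how a BTL-type model can turn paired-comparison counts into scale values, the sketch below fits item "strengths" with a simple iterative update. The function name fit_bradley_terry, the layout of the win-count matrix, and the example counts are assumptions made for illustration and do not come from the sources cited above. The sketch also assumes every item has been compared at least once and has won at least one comparison.

```python
def fit_bradley_terry(wins, n_iter=200):
    """Estimate BTL item strengths from paired-comparison counts.

    wins[i][j] is the number of times item i was preferred over item j
    (hypothetical layout chosen for this sketch). Returns strengths that
    sum to 1; the model's preference probability for i over j is
    p[i] / (p[i] + p[j]).
    """
    n = len(wins)
    p = [1.0] * n  # start with equal strengths
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Each pair contributes (number of comparisons) / (p_i + p_j).
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            updated.append(total_wins / denom)
        total = sum(updated)
        p = [value / total for value in updated]  # normalise to sum to 1
    return p


# Illustrative counts for three products A, B, C (made-up data).
example_wins = [
    [0, 8, 9],  # A preferred over B 8 times, over C 9 times
    [2, 0, 7],  # B preferred over A 2 times, over C 7 times
    [1, 3, 0],  # C preferred over A once, over B 3 times
]
strengths = fit_bradley_terry(example_wins)
print(strengths)
```

The fitted strengths order the items on a single scale, which goes beyond the purely ordinal reading of the raw choices, provided the data fit the model reasonably well.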
Rasch model scaling – respondents interact with items and comparisons are inferred between items
from the responses to obtain scale values. Respondents are subsequently also scaled based on their
responses to items given the item scale values. The Rasch model has a close relation to the BTL model.
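The relation between the two models can be made concrete with a small sketch. It uses the standard dichotomous Rasch formula, in which the probability of a positive response depends only on the difference between a person parameter and an item parameter. The function and argument names are hypothetical and chosen for illustration only.

```python
import math


def rasch_probability(person_ability, item_difficulty):
    """Probability of a positive response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(person_ability - item_difficulty)))


def btl_preference_probability(strength_a, strength_b):
    """Probability that A is chosen over B under the BTL model."""
    return strength_a / (strength_a + strength_b)


# With strengths exp(ability) and exp(difficulty), both formulas give the
# same probability: a person "beats" an item in the BTL sense.
ability, difficulty = 1.2, 0.4  # made-up values
print(rasch_probability(ability, difficulty))
print(btl_preference_probability(math.exp(ability), math.exp(difficulty)))
```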
Bogardus social distance scale – measures the degree to which a person is willing to
associate with a class or type of people. It asks how willing the respondent is to make
various associations. The results are reduced to a single score on a scale. There are also non-comparative
versions of this scale.
Q-Sort –
Guttman scale – This is a procedure to determine whether a set of items can be rank-ordered on a
unidimensional scale. It utilizes the intensity structure among several indicators of a given variable.
Statements are listed in order of importance. The rating is scaled by summing all responses until the
first negative response in the list. The Guttman scale is related to Rasch measurement; specifically,
Rasch models bring the Guttman approach within a probabilistic framework.
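A minimal sketch of the scoring rule just described, assuming the statements are already listed in their scale order and each response is recorded as True (agree) or False (disagree). The function name guttman_score is a hypothetical label for this example.

```python
def guttman_score(ordered_responses):
    """Count positive responses up to the first negative response.

    ordered_responses: booleans in the order the statements appear on
    the scale (assumed layout for this sketch).
    """
    score = 0
    for agreed in ordered_responses:
        if not agreed:
            break  # stop at the first negative response
        score += 1
    return score


# A respondent who agrees with the first two statements, rejects the
# third, and agrees with the fourth is scored on the first two only.
print(guttman_score([True, True, False, True]))
```

A positive response after the first negative one, as in the example, does not change the score. It is a departure from the perfectly cumulative pattern that the procedure tests for.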
Constant sum scale – a respondent is given a constant sum of money, scrip, credits, or points and
asked to allocate these to various items (example: If you had 100 Yen to spend on food products, how
much would you spend on product A, on product B, on product C, etc.). This is an ordinal level technique.
Magnitude estimation scale – in a procedure introduced by S. S. Stevens, people simply assign numbers
to the dimension of judgment. The geometric mean of those numbers usually produces a power law with
a characteristic exponent. In cross-modality matching, instead of assigning numbers, people manipulate
another dimension, such as loudness or brightness, to match the items. Typically the exponent of the
psychometric function can be predicted from the magnitude estimation exponents of each dimension.
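To make the power-law idea concrete, the sketch below takes the geometric mean of the numbers assigned at each stimulus intensity and estimates the exponent as the slope of a straight line in log-log coordinates. The function names, the data layout, and the numbers are illustrative assumptions, and the ordinary least-squares fit is one simple way to estimate the exponent, not a procedure prescribed by the text.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def power_law_exponent(intensities, judgments_per_intensity):
    """Slope of log(geometric mean judgment) against log(intensity).

    intensities: physical stimulus levels (positive numbers).
    judgments_per_intensity: one list of assigned numbers per level
    (hypothetical layout for this sketch).
    """
    xs = [math.log(i) for i in intensities]
    ys = [math.log(geometric_mean(j)) for j in judgments_per_intensity]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Made-up judgments from three respondents at four intensity levels.
levels = [1.0, 2.0, 4.0, 8.0]
judgments = [[10, 12, 9], [14, 15, 13], [20, 22, 19], [28, 30, 27]]
print(power_law_exponent(levels, judgments))
```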
In noncomparative scales, each object is scaled independently of the others in the stimulus set.
The resulting data are generally assumed to be interval or ratio scaled.
Continuous rating scale – respondents rate items by placing a mark on a line. The line is usually
labeled at each end. There are sometimes a series of numbers, called scale points (say, from zero to
100), under the line. Scoring and codification are difficult.
Likert scale – respondents are asked to indicate the amount of agreement or disagreement (from
strongly agree to strongly disagree) on a five- to nine-point scale. The same format is used for multiple
questions. This categorical scaling procedure can easily be extended to a magnitude estimation
procedure that uses the full scale of numbers rather than verbal categories.
Phrase completion scale – respondents are asked to complete a phrase on an 11-point
response scale in which 0 represents the absence of the theoretical construct and 10 represents the
theorized maximum amount of the construct being measured. The same basic format is used for multiple
questions.
Semantic differential scale – respondents are asked to rate an item on various attributes on a 7-point
scale. Each attribute requires a scale with bipolar terminal labels.
Stapel scale –
Thurstone scale
There are three major types of unidimensional scaling methods. They are similar in that they each
measure the concept of interest on a number line. But they differ considerably in how they arrive at
scale values for different items. The three methods are Thurstone or Equal-Appearing Interval Scaling,
Likert Scaling, and Guttman or "Cumulative" Scaling.
A. Thurstone Scale
In social science studies, while measuring attitudes of the people we generally follow
the technique of preparing the opinionnaire (or attitude scale) in such a way that the score of
the individual responses assigns him a place on a scale. Under this approach, the respondent
expresses his agreement or disagreement with a number of statements relevant to the issue.
While developing such statements, the researcher must note the following two points:
(i) That the statements must elicit responses which are psychologically related to the attitude
being measured;
(ii) That the statements must be such that they discriminate not merely between extremes of
attitude but also among individuals who differ slightly. Researchers must also be aware
that inferring attitude from what has been recorded in an opinionnaire has several limitations.
People may conceal their attitudes and express socially acceptable opinions. They may not
really know how they feel about a social issue. People may be unaware of their attitude about
an abstract situation; until confronted with a real situation, they may be unable to predict their
reaction. Even behaviour itself is at times not a true indication of attitude. For instance, when
politicians kiss babies, their behaviour may not be a true expression of affection toward
infants. Thus, there is no sure method of measuring attitude; we only try to measure the
expressed opinion and then draw inferences from it about people’s real feelings or attitudes.
With all these limitations in mind, psychologists and sociologists have developed several
scale construction techniques for the purpose. The researcher should know these techniques
so as to develop an appropriate scale for his own study. Some of the important approaches,
along with the corresponding scales developed under each approach to measure attitude are
The name of L. L. Thurstone is associated with differential scales which have been
developed using the consensus scale approach. Under such an approach the selection of items is
made by a panel of judges who evaluate the items in terms of whether they are relevant to the
(a) The researcher gathers a large number of statements, usually twenty or more, that express
various points of view toward a group, institution, idea, or practice (i.e., statements belonging
to the topic area).
(b) These statements are then submitted to a panel of judges, each of whom arranges them in
eleven groups or piles ranging from one extreme to another in position. Each of the judges is
requested to place generally in the first pile the statements which he thinks are most
unfavourable to the issue, in the second pile to place those statements which he thinks are
next most unfavourable and he goes on doing so in this manner till in the eleventh pile he
places the statements which he considers most favourable.
(c) This sorting by each judge yields a composite position for each of the items. In case of
marked disagreement between the judges in assigning a position to an item, that item is discarded.
(d) For items that are retained, each is given its median scale value between one and eleven as
established by the panel. In other words, the scale value of any one statement is computed as
the median position to which it is assigned by the group of judges.
(e) A final selection of statements is then made. For this purpose a sample of statements
whose median scores are spread evenly from one extreme to the other is taken. The
position of each statement on the scale is the same as determined by the judges.
After developing the scale as stated above, the respondents are asked during the administration of
the scale to check the statements with which they agree. The median value of the statements
that they check is worked out and this establishes their score or quantifies their opinion. It
may be noted that in the actual instrument the statements are arranged in random order of
scale value. If the values are valid and if the opinionnaire deals with only one attitude
dimension, the typical respondent will choose one or several contiguous items (in terms of
scale values) to reflect his views. However, at times divergence may occur when a statement
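The arithmetic behind the steps above can be sketched briefly: each retained statement receives the median of the judges' pile numbers as its scale value, and a respondent's score is the median scale value of the statements they check. The text does not say how "marked disagreement" is measured, so the interquartile-range cut-off used here (max_iqr) is an assumption made for illustration, as are the function names and the example data.

```python
from statistics import median, quantiles


def item_scale_values(judge_piles, max_iqr=3.0):
    """Median pile number per statement, dropping contested statements.

    judge_piles: dict mapping a statement label to the list of pile
    numbers (1 to 11) assigned by the judges. Statements whose
    interquartile range exceeds max_iqr (an assumed threshold) are
    treated as showing marked disagreement and are discarded.
    """
    values = {}
    for statement, piles in judge_piles.items():
        q1, _, q3 = quantiles(piles, n=4)
        if q3 - q1 <= max_iqr:
            values[statement] = median(piles)
    return values


def respondent_score(scale_values, checked_statements):
    """Median scale value of the statements the respondent agrees with."""
    return median(scale_values[s] for s in checked_statements)


# Made-up pile assignments from five judges for three statements.
piles = {
    "s1": [2, 2, 3, 3, 2],
    "s2": [1, 11, 1, 11, 6],  # judges disagree sharply
    "s3": [9, 10, 9, 8, 9],
}
values = item_scale_values(piles)
print(values)
print(respondent_score(values, ["s1", "s3"]))
```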
The Thurstone method has been widely used for developing differential scales which are
utilised to measure attitudes towards varied issues like war, religion, etc. Such scales are
considered most appropriate and reliable when used for measuring a single attitude.
But an important deterrent to their use is the cost and effort required to develop them.
Another weakness of such scales is that the values assigned to various statements by the
The method is not completely objective; it ultimately involves a subjective decision process.
Critics of this method also opine that some other scale designs give more information about
the respondent’s attitude in comparison to differential scales.