Construction and Standardisation of Attitude Scale
The following points were kept in mind in the selection of the statements.
Item Analysis
After administering the attitude scale to a preliminary group of respondents, the
researcher conducts an item analysis to identify the best-functioning items. The item
analysis typically yields three statistics for each item: (1) an item discrimination
index, (2) the percentage of respondents choosing each response option, and
(3) the item mean and standard deviation.
The item discrimination index shows the extent to which each item discriminates
among the respondents in the same way as the total score discriminates. The item
discrimination index is calculated by correlating item scores with total scale
scores. If high scorers on an individual item have high total scores and if low
scorers on this item have low total scores, then the item is discriminating in the
same way as the total score. To be useful, an item should correlate at least .25
with the total score. Items that have very low correlation or negative correlation
with the total score should be eliminated because they are not measuring the same
thing as the total scale and hence are not contributing to the measurement of the
attitude. The researcher will want to examine those items that are found to be
nondiscriminating. The items may be ambiguous or double-barreled (containing two
beliefs or opinions in one statement), or they may be factual statements that do not
really express feelings about the object. Revising these items may make them usable.
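The item-total correlation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed procedure: the response matrix is invented, the function names are hypothetical, and the total score here includes the item itself, as in the description above. A common variant, not discussed in the text, removes the item from the total before correlating.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # an item with no variance cannot discriminate
    return sxy / sqrt(sxx * syy)

def item_discrimination(responses, threshold=0.25):
    """Return (item index, item-total r, keep?) for every item.

    responses: one row per respondent, one column per item.
    threshold: the minimum correlation of .25 suggested in the text.
    """
    totals = [sum(row) for row in responses]
    results = []
    for j in range(len(responses[0])):
        item = [row[j] for row in responses]
        r = pearson(item, totals)
        results.append((j, r, r >= threshold))
    return results

# Hypothetical data: 6 respondents, 4 five-point items.
responses = [
    [5, 4, 5, 2],
    [4, 5, 4, 3],
    [4, 4, 3, 3],
    [2, 3, 2, 3],
    [2, 1, 2, 4],
    [1, 2, 1, 3],
]

results = item_discrimination(responses)
for index, r, keep in results:
    print(f"item {index + 1}: r = {r:.2f} -> {'keep' if keep else 'review'}")
```

In this invented data set the fourth item correlates negatively with the total score, so it would be flagged for revision or removal.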
The item analysis also shows the percentage of respondents choosing each of the
five options and the mean and standard deviation for each item. Items on which
respondents are spread out among the options are preferred. Thus, if most
respondents choose only one or two of the options, the item should be rewritten
or eliminated. After selecting the most useful items as indicated by the item
analysis, the researcher should then try out the revised scale with a different
group of subjects and again check the items for discrimination and variability.
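The option percentages, mean, and standard deviation for a single item might be computed as follows. This is only a sketch: the scores are invented, the function name is hypothetical, and the 80 percent cutoff used to flag an item whose responses pile up on two options is an assumption made for illustration, since the text says only "most respondents".

```python
from collections import Counter
from statistics import mean, stdev

def item_summary(scores, options=(1, 2, 3, 4, 5), cutoff=80.0):
    """Summarise one item's responses.

    scores:  list of chosen options, one per respondent.
    cutoff:  hypothetical percentage above which the two most
             popular options are taken to dominate the item.
    """
    counts = Counter(scores)
    n = len(scores)
    percent = {opt: 100.0 * counts.get(opt, 0) / n for opt in options}
    top_two = sum(sorted(percent.values(), reverse=True)[:2])
    return {
        "percent": percent,
        "mean": mean(scores),
        "sd": stdev(scores),  # sample standard deviation
        "low_spread": top_two > cutoff,
    }

# Hypothetical items answered by 10 respondents each.
clustered = item_summary([4, 4, 4, 5, 4, 4, 5, 4, 4, 4])
spread = item_summary([1, 2, 3, 4, 5, 2, 3, 4, 3, 3])

print(clustered["percent"], clustered["low_spread"])
print(spread["percent"], spread["low_spread"])
```

The first item, on which every respondent chose option 4 or 5, is flagged as a candidate for rewriting; the second, with responses spread over all five options, is not.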
Validity
Validity concerns the extent to which the scale really measures the attitude
construct of interest. It is often difficult to locate criteria to be used in obtaining
evidence for the validity of attitude scales. Some researchers have used
observations of actual behavior as the criterion for the attitude being measured.
This procedure is seldom used because it is hard to determine which
behavior would be the best criterion for the attitude and because it is
expensive.
One of the easiest ways to gather validity evidence is to determine the extent to
which the scale is capable of discriminating between two groups whose members
are known to have different attitudes. To validate a scale that measures attitudes
toward organized religion, a researcher would determine if the scale
discriminated between active church members and people who do not attend
church or have no church affiliation. A scale measuring attitudes toward abortion
should discriminate between members of pro-life groups and members of
pro-choice groups. By “discriminate,” we mean that the two groups would be
expected to have significantly different mean scores on the scale. Another method
of assessing validity is to correlate scores on the attitude scale with those obtained
on another attitude scale measuring the same construct and whose validity is well
established.
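The known-groups comparison described above amounts to testing whether two group means differ. The sketch below computes Welch's t statistic and its approximate degrees of freedom using only the standard library. The choice of Welch's test is an assumption (the text does not name a specific test), the scores are invented, and the function name is hypothetical. A statistics package would normally be used to turn the statistic into a p-value.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard error of each group mean.
    va = variance(group_a) / len(group_a)
    vb = variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (
        va ** 2 / (len(group_a) - 1) + vb ** 2 / (len(group_b) - 1)
    )
    return t, df

# Hypothetical total scale scores for two groups expected to differ.
church_members = [78, 82, 75, 88, 80, 85]
non_attenders = [55, 60, 52, 66, 58, 61]

t, df = welch_t(church_members, non_attenders)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A large t value relative to its degrees of freedom is the kind of evidence the text has in mind: the scale separates groups already known to hold different attitudes.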
Reliability
The reliability of the new scale must also be determined. Reliability is concerned
with the extent to which the measure would yield consistent results each time it is
used. The first step in ensuring reliability is to make sure that the scale is long
enough, that is, that it includes enough items to provide a representative sampling of the
whole domain of opinions about the attitudinal object. Other things being equal,
the size of the reliability coefficient is directly related to the length of the scale.
Research shows, however, that if the items are well constructed, scales having as
few as 20 to 22 items will have satisfactory reliability (often above .80). The
number of items needed depends partly on how specific the attitudinal object is;
the more abstract the object, the more items are needed. The researcher should also
calculate an index of reliability. The best index to use for an attitude scale is
coefficient alpha, which provides a measure of the extent to which all the items
are positively intercorrelated and working together to measure one trait or
characteristic (the attitude). Many statistical computer programs routinely
calculate coefficient alpha as a measure of reliability.
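For readers who want to see what such programs compute, the following sketch calculates coefficient alpha from a response matrix using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). It also includes the Spearman-Brown formula, which the text does not name but which is the usual way to express how reliability changes with scale length. The data and function names are hypothetical.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Coefficient alpha for rows of respondents by columns of items."""
    k = len(responses[0])
    item_vars = [variance([row[j] for row in responses]) for j in range(k)]
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def spearman_brown(reliability, length_factor):
    """Predicted reliability when the scale is lengthened by length_factor."""
    return (length_factor * reliability) / (
        1 + (length_factor - 1) * reliability
    )

# Hypothetical data: 6 respondents, 3 five-point items.
responses = [
    [5, 4, 5],
    [4, 5, 4],
    [4, 4, 3],
    [2, 3, 2],
    [2, 1, 2],
    [1, 2, 1],
]

alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")

# Doubling a scale whose reliability is .60, other things being equal.
doubled = spearman_brown(0.60, 2)
print(f"predicted reliability after doubling = {doubled:.2f}")
```

The Spearman-Brown prediction assumes the added items are comparable in quality to the existing ones, which matches the "other things being equal" qualification above.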