Psychology (7th Ed
As a review of the previous discussion, the instructor will ask the students the following:
Inter-Rater Reliability
It is the extent to which different observers are consistent in their judgments. For example, if you
were interested in measuring university students’ social skills, you could make video recordings of
them as they interacted with another student whom they are meeting for the first time. Then you
could have two or more observers watch the videos and rate each student’s level of social skills.
To the extent that each participant does in fact have some level of social skills that can be
detected by an attentive observer, different observers’ ratings should be highly correlated with
each other. Inter-rater reliability would also have been measured in Bandura’s Bobo doll study. In
this case, the observers’ ratings of how many acts of aggression a particular child committed while
playing with the Bobo doll should have been highly positively correlated.
Inter-rater reliability is often assessed using Cronbach’s α when the judgments are quantitative or
an analogous statistic called Cohen’s κ (the Greek letter kappa) when they are categorical.
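As a rough illustration of the categorical case, the following sketch computes Cohen’s κ for two observers by comparing their observed agreement with the agreement expected by chance. The function name `cohens_kappa` and the ratings are hypothetical and are not taken from any study mentioned here.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who label the same cases categorically."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must rate the same non-empty set of cases.")
    n = len(rater_a)
    # Observed agreement: proportion of cases given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: based on how often each rater uses each label overall.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    p_chance = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("Kappa is undefined when chance agreement equals 1.")
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical data: two observers code 10 video clips as
# aggressive ("A") or not aggressive ("N").
observer_1 = ["A", "A", "N", "N", "A", "N", "A", "N", "N", "A"]
observer_2 = ["A", "A", "N", "N", "A", "N", "N", "N", "N", "A"]

kappa = cohens_kappa(observer_1, observer_2)
print(f"Cohen's kappa = {kappa:.2f}")
```

Here the observers agree on 9 of 10 clips, and kappa adjusts that raw agreement for the agreement two raters would reach by chance alone.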
Test-Retest Reliability
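Consistency between an individual’s scores on the same test taken at two or more different times.
When a construct is assumed to be consistent across time, scores on a measure of that construct
should also be consistent across time. Test-retest reliability is the extent to which this is
actually the case. For example, intelligence is generally thought to be consistent across time.
A person who is highly intelligent today will be highly intelligent next week. This means that any
good measure of intelligence should produce roughly the same scores for this person next week as
it does today.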
Assessing test-retest reliability requires using the measure on a group of people at one time,
using it again on the same group of people at a later time, and then looking at the test-retest
correlation between the two sets of scores. This is typically done by graphing the data in a
scatterplot and computing Pearson’s r. The figure below shows the correlation between two sets of
scores of several university students on the Rosenberg Self-Esteem Scale, administered two
times, a week apart. Pearson’s r for these data is +.95. In general, a test-retest correlation of +.80
or greater is considered to indicate good reliability.
Test-Retest Correlation Between Two Sets of Scores of Several College Students on the
Rosenberg Self-Esteem Scale, Given Two Times a Week Apart
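The test-retest procedure described above can be sketched in a few lines of Python. This is only a minimal illustration: the scores are invented (they are not the data behind the figure), and `pearson_r` is a hand-written helper rather than a library function.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("Need two lists of equal length with at least 2 scores.")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical self-esteem scores for 8 students, one week apart.
time_1 = [22, 25, 18, 30, 27, 15, 24, 20]
time_2 = [23, 24, 19, 29, 28, 16, 22, 21]

r = pearson_r(time_1, time_2)
print(f"test-retest r = {r:.2f}")
# Apply the +.80 guideline mentioned in the text.
print("good reliability" if r >= 0.80 else "below the +.80 guideline")
```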
Internal Consistency
Another kind of reliability is internal consistency, which is the consistency of people’s responses
across the items on a multiple-item measure. In general, all the items on such measures are
supposed to reflect the same underlying construct, so people’s scores on those items should be
correlated with each other.
Internal consistency can only be assessed by collecting and analyzing data. One approach is to
look at a split-half correlation. This involves splitting the items into two sets, such as the first and
second halves of the items or the even- and odd-numbered items. Then a score is computed for
each set of items, and the relationship between the two sets of scores is examined.
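The split-half approach just described can be sketched as follows, using the odd- and even-numbered items as the two sets. The responses are hypothetical, and `split_half_correlation` and `pearson_r` are illustrative helper names, not part of any standard package.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)


def split_half_correlation(responses):
    """Correlate each person's odd-item total with their even-item total.

    responses: one list of item scores per participant.
    """
    odd_totals = [sum(person[0::2]) for person in responses]   # items 1, 3, 5, ...
    even_totals = [sum(person[1::2]) for person in responses]  # items 2, 4, 6, ...
    return pearson_r(odd_totals, even_totals)


# Hypothetical data: 6 participants answering 6 items rated 1 to 4.
responses = [
    [4, 4, 3, 4, 4, 3],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 3, 2, 3, 3],
    [1, 2, 1, 1, 2, 1],
    [4, 3, 4, 4, 3, 4],
    [2, 2, 3, 2, 2, 2],
]

r_half = split_half_correlation(responses)
print(f"split-half r = {r_half:.2f}")
```

A high correlation between the two halves suggests that the items are measuring the same underlying construct.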
Inter-item Reliability
The degree to which different items measuring the same variable attain consistent results.
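One common way to summarize consistency across items is Cronbach’s α. The sketch below assumes α is the statistic of interest (the text above does not prescribe one) and uses invented responses; `cronbach_alpha` is a hypothetical helper built on the standard library’s `statistics.variance`.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a participants-by-items table of scores."""
    k = len(responses[0])  # number of items
    if k < 2:
        raise ValueError("Alpha needs at least two items.")
    # Sample variance of each item (column) across participants.
    item_variances = [variance([person[i] for person in responses]) for i in range(k)]
    # Sample variance of each participant's total score.
    total_variance = variance([sum(person) for person in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: 6 participants answering 6 items rated 1 to 4.
responses = [
    [4, 4, 3, 4, 4, 3],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 3, 2, 3, 3],
    [1, 2, 1, 1, 2, 1],
    [4, 3, 4, 4, 3, 4],
    [2, 2, 3, 2, 2, 2],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
```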
Validity is the extent to which the scores from a measure represent the variable they are intended to.
Face Validity
The degree to which a manipulation or measurement technique is self-evident.
The extent to which a measurement method appears “on its face” to measure the construct of
interest. Most people would expect a self-esteem questionnaire to include items about whether
they see themselves as a person of worth and whether they think they have good qualities. So a
questionnaire that included these kinds of items would have good face validity.
Content Validity
The degree to which the content of a measure reflects the content of what is being measured.
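For example, attitudes are usually defined as involving thoughts, feelings, and actions toward something.
By this conceptual definition, a person has a positive attitude toward exercise to the extent that he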
or she thinks positive thoughts about exercising, feels good about exercising, and actually
exercises. So to have good content validity, a measure of people’s attitudes toward exercise
would have to reflect all three of these aspects.
Like face validity, content validity is not usually assessed quantitatively. Instead, it is assessed by
carefully checking the measurement method against the conceptual definition of the construct.
Predictive Validity
It tells how well a certain measure can predict future behavior.
Predictive validity indicates the ability of a measure to predict performance on some outcome
variable. For instance, an autism screening measure utilized for infants and toddlers (Matson,
Boisjoli, & Wilkins, 2007) should have good predictive validity for future autism diagnoses based
on full evaluations. That is, infants and toddlers deemed “at risk” should be more likely to receive
an autism diagnosis later when they receive a full diagnostic evaluation.
For example, a prediction may be made on the basis of a new intelligence test that high scorers
at age 12 will be more likely to obtain university degrees several years later. If the prediction is
borne out, then the test has predictive validity.
Construct Validity
The degree to which an operational definition accurately represents the construct it is intended to
manipulate or measure.
This type of validity refers to the extent to which a test captures a specific theoretical construct or
trait, and it overlaps with some of the other aspects of validity. Construct validity does not concern
the simple, factual question of whether a test measures an attribute.
Instead, it concerns the complex question of whether test score interpretations are consistent with a
network of theoretical and observational terms.
Concurrent Validity
The degree to which scores on the measuring instrument correlate with another known standard
measuring the variable being studied.
It pertains to the extent to which the measurement tool relates to other scales measuring the
same construct and that have already been validated.
Internal Validity
The certainty that the changes in behavior observed across treatment conditions in the
experiment were actually caused by the independent variable.
It refers to whether the effects observed in a study are due to the manipulation of the independent
variable and not some other factor.
In other words, there is a causal relationship between the independent and dependent variables.
Internal validity can be improved by controlling extraneous variables, using standardized
instructions, counterbalancing, and eliminating demand characteristics and investigator effects.
External Validity
It refers to the extent to which the results of a study can be generalized to other settings
(ecological validity), other people (population validity) and over time (historical validity).
Extraneous Variables and Confounding. When we conduct experiments, other variables can affect our
results if we do not control them.
Extraneous variable. A variable other than an independent or dependent variable; a variable that is not
the focus of an experiment but can produce effects on the dependent variable if not controlled.
Confounding. An error that occurs when the value of an extraneous variable changes systematically
along with the independent variable in an experiment; an alternative explanation for the findings that
threatens internal validity.
1. History. Occurs when an outside event or occurrence might have produced effects on the dependent variable.
5. Statistical Regression. Can occur when subjects are assigned to conditions on the basis of extreme scores on a test;
upon retest, the scores of extreme scorers tend to regress toward the mean even without any treatment.
6. Selection. Can occur when non-random procedures are used to assign subjects to conditions or when random
assignment fails to balance out differences among subjects across the different conditions of the experiment.
7. Subject Mortality. Produced by differences in dropout rates across the conditions of the experiment.
8. Selection Interaction. A family of threats to internal validity produced when a selection threat combines with one or
more of the other threats to internal validity; when a selection threat is already present, other threats can affect some
experimental groups but not others.
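Statistical regression, listed among the threats above, can be made concrete with a small simulation. This is a sketch under simple assumptions that are not from the text: each person has a stable true score, each test adds independent random error, and no treatment is given between the two tests. All names and numbers are hypothetical.

```python
import random


def simulate_regression_to_mean(n=10_000, top_fraction=0.10, seed=0):
    """Return the mean test-1 and test-2 scores of the top scorers on test 1."""
    rng = random.Random(seed)
    # Stable true scores plus independent measurement error on each testing.
    true_scores = [rng.gauss(100, 10) for _ in range(n)]
    test_1 = [t + rng.gauss(0, 10) for t in true_scores]
    test_2 = [t + rng.gauss(0, 10) for t in true_scores]
    # Select subjects on the basis of extreme (highest) scores on the first test.
    n_top = int(n * top_fraction)
    top = sorted(range(n), key=lambda i: test_1[i], reverse=True)[:n_top]
    mean_1 = sum(test_1[i] for i in top) / n_top
    mean_2 = sum(test_2[i] for i in top) / n_top
    return mean_1, mean_2


first, second = simulate_regression_to_mean()
print(f"top scorers, first test: {first:.1f}")
print(f"same people, retest:     {second:.1f}")
```

Even though nothing happens between the two tests, the retest mean of the selected group falls back toward the overall mean of 100, which is why a change in an extreme-scoring group cannot automatically be credited to a treatment.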
1. A measurement produces results that correspond to real properties, characteristics, and variations in the physical or
social world. This means it has:
a. High reliability
b. Low Reliability
c. High Validity
d. Low Validity
ANSWER: ________
RATIO:___________________________________________________________________________________________
_________________________________________________________________________________________________
_________________________________________________________________________________________________
2. If the thermometer shows different temperatures each time, even though you have carefully controlled conditions
to ensure the sample’s temperature stays the same, the thermometer is probably malfunctioning, and therefore its
measurements are:
a. Not valid
b. Valid
c. Not Reliable
d. Reliable
ANSWER: ________
RATIO:___________________________________________________________________________________________
_________________________________________________________________________________________________
_________________________________________________________________________________________________
3. You measure the temperature of a liquid sample several times under identical conditions. The thermometer
displays the same temperature every time, so the results are:
a. Not valid
b. Valid
c. Not Reliable
d. Reliable
ANSWER: ________
RATIO:___________________________________________________________________________________________
_________________________________________________________________________________________________
_________________________________________________________________________________________________
4. A doctor uses a symptom questionnaire to diagnose a patient with a long-term medical condition. Several different
doctors use the same questionnaire with the same patient but give different diagnoses. This indicates that the
questionnaire is:
a. Not valid
b. Valid
c. Not Reliable
d. Reliable
ANSWER: ________
RATIO:___________________________________________________________________________________________
_________________________________________________________________________________________________
_________________________________________________________________________________________________
5. Based on an assessment criteria checklist, five examiners submit substantially different results for the same
student project. This indicates that the assessment checklist has:
a. High inter-rater reliability
b. Low inter-rater reliability
6. A group of participants completes a questionnaire designed to measure personality traits. If they repeat the
questionnaire days, weeks or months apart and give the same answers, this indicates:
a. High test-retest reliability
b. Low test-retest reliability
c. High inter-rater reliability
d. Low inter-rater reliability
ANSWER: ________
RATIO:___________________________________________________________________________________________
_________________________________________________________________________________________________
_________________________________________________________________________________________________
7. “A score of high self-efficacy related to performing a task should predict the likelihood of a participant completing the
task.” This is an example of:
a. Face validity
b. Predictive validity
c. Construct validity
d. Concurrent validity
ANSWER: ________
RATIO:___________________________________________________________________________________________
_________________________________________________________________________________________________
_________________________________________________________________________________________________
9. It seeks agreement between a theoretical concept and a specific measuring device or procedure. For example, a
researcher inventing a new IQ test might spend a great deal of time attempting to "define" intelligence in order to
reach an acceptable level of:
a. Face validity
b. Predictive validity
c. Construct validity
d. Concurrent validity
ANSWER: ________
RATIO:___________________________________________________________________________________________
_________________________________________________________________________________________________
_________________________________________________________________________________________________
10. It refers to the degree to which a study accurately reflects or assesses the specific concept that the researcher is
attempting to measure.
a. Reliability
b. Validity
c. Selection
d. Testing
ANSWER: ________
RATIONALIZATION ACTIVITY (THIS WILL BE DONE DURING THE FACE TO FACE INTERACTION)
The instructor will now explain the rationale for each answer. You can now ask questions and debate among yourselves.
Write the correct answer and correct/additional ratio in the space provided.
1. ANSWER: ________
RATIO:___________________________________________________________________________________________
_________________________________________________________________________________________________
_________________________________________________________________________________________________
2. ANSWER: ________
RATIO:___________________________________________________________________________________________
_________________________________________________________________________________________________
_________________________________________________________________________________________________
3. ANSWER: ________
RATIO:___________________________________________________________________________________________
_________________________________________________________________________________________________
_________________________________________________________________________________________________
4. ANSWER: ________
RATIO:___________________________________________________________________________________________
_________________________________________________________________________________________________
_________________________________________________________________________________________________
5. ANSWER: ________
RATIO:___________________________________________________________________________________________
_________________________________________________________________________________________________
_________________________________________________________________________________________________
6. ANSWER: ________
RATIO:___________________________________________________________________________________________
_________________________________________________________________________________________________
_________________________________________________________________________________________________
7. ANSWER: ________
RATIO:___________________________________________________________________________________________
_________________________________________________________________________________________________
_________________________________________________________________________________________________
8. ANSWER: ________
RATIO:___________________________________________________________________________________________
_________________________________________________________________________________________________
_________________________________________________________________________________________________
9. ANSWER: ________
RATIO:___________________________________________________________________________________________
_________________________________________________________________________________________________
_________________________________________________________________________________________________
10. ANSWER: ________
RATIO:___________________________________________________________________________________________
_________________________________________________________________________________________________
_________________________________________________________________________________________________
You will now mark (encircle) the session you have finished today in the tracker below. This is simply a visual to help you
track how much work you have accomplished and how much work there is left to do.
You are done with the session! Let’s track your progress.