
Subtest Origins


Running head: FIRST PUBLICATIONS OF SB & WECHSLER TESTS 1

First Publication of Subtests in the Stanford-Binet 5, WAIS-IV, WISC-V, and WPPSI-IV

Aisa Gibbons and Russell T. Warne

Utah Valley University

This manuscript has been accepted for publication in its current form. Please cite as:

Gibbons, A., & Warne, R. T. (2019). First publication of subtests in the Stanford-Binet 5,

WAIS-IV, WISC-V, and WPPSI-IV. Intelligence, 75, 9-18.

doi:10.1016/j.intell.2019.02.005

Aisa Gibbons, Department of Behavioral Science, Utah Valley University; Russell T.

Warne, Department of Behavioral Science, Utah Valley University.

Correspondence concerning this article should be addressed to 1243 Hillside Drive,

Pleasant Grove, UT 84602. Email: aisagibbons@gmail.com



Abstract

In this article we describe the origins of the subtests that appear on the modern Stanford-Binet

Intelligence Scales (SB5), Wechsler Preschool and Primary Scale of Intelligence (WPPSI-IV),

Wechsler Intelligence Scale for Children (WISC-V), and Wechsler Adult Intelligence Scale

(WAIS-IV). We found that the majority of these subtest formats were first created in 1908 or

earlier and that only three have been created since 1980. We discuss the implications of these

findings, which are that (1) many subtests have lengthy research histories that support their use

in measuring intelligence; (2) many subtests have formats that predate modern theories of test

creation, cognitive psychology, and intelligence; and (3) the history of many subtests is more

complex than psychologists probably realize.



First Publication of Subtests in the Stanford-Binet 5, WAIS-IV, WISC-V, and WPPSI-IV

One of the first successes in applied psychology was the development of intelligence

tests. Early tests in the 1910s and 1920s found rapid, widespread acceptance, with millions of

American examinees tested every year (Cronbach, 1975; Thorndike, 1975; Yerkes, 1921). The

use of these tests persists today, and in the 21st century the most popular individually

administered intelligence tests are the Stanford-Binet Intelligence Scale (SB5) and the Wechsler

Intelligence Scales, the latter of which are the Wechsler Adult Intelligence Scale (WAIS-IV), the

Wechsler Intelligence Scale for Children (WISC-V), and the Wechsler Preschool and Primary

Scale of Intelligence (WPPSI-IV). These instruments have dominated intelligence testing for

decades. The original version of the Stanford-Binet scale was first published over 100 years ago

(Terman, 1916), though many of the items were direct translations or close adaptations of items

from Binet’s 1905, 1908, and 1911 intelligence scales. Ironically, Binet and Terman had

opposite goals in their work on intelligence testing. Binet aimed to identify children who were

struggling academically (Wolf, 1973), while Terman had an interest in identifying gifted

children—an interest which started with his dissertation (Terman, 1905) and lasted until his

death. Indeed, Terman’s research on gifted children is the work for which he is best remembered

today (Warne, 2019). The Stanford-Binet has been revised several times since 1916, with the

fifth edition, published in 2003, being the most recent.

The first Wechsler scale appeared in 1939 as the Wechsler-Bellevue, an intelligence test

designed for adult examinees (see description in Wechsler, 1944), as opposed to the child

examinees that Terman designed the Stanford-Binet for. Wechsler disapproved of the heavily

verbal content of the early versions of the Stanford-Binet and of the test’s ability to produce a

global IQ as the only measure of a person’s intellectual level (Wechsler, 1944). Therefore, he

designed his test to produce a verbal and performance (i.e., non-verbal) IQ score. To create the

Wechsler-Bellevue, Wechsler evaluated item formats that appeared on prior scales and selected

the ones which he thought were the best measures of intelligence, based on his research (Boake,

2002; Wechsler, 1944) and his experience administering the Army Alpha and Army Beta in

Texas during World War I (Yerkes, 1921, pp. 40, 80). As he wrote, “Our aim was not to produce

a set of brand new tests but to select, from whatever source available, such a combination of

them as would best meet the requirements of an effective adult scale” (Wechsler, 1944, p. 76).

Wechsler favored test formats and items that (a) showed high discrimination in intelligence

across much of the continuum of ability, (b) produced scores with high reliability, (c) correlated

strongly with other widely accepted measures of intelligence, and (d) correlated with

“pragmatic” subjective ratings of intelligence from people who knew the examinee—such as a

work supervisor (Wechsler, 1944). These criteria led Wechsler to believe that, for example, an

information subtest was effective but that the Army Beta’s cube analysis subtest was not

(because the latter was incapable of discriminating among people with intellectual disabilities).

The success of the scale led Wechsler to create a separate test for children (the WISC) in 1949

and another for preschool children (the WPPSI) in 1967. All Wechsler tests have been revised

several times since their creation (Kaplan & Saccuzzo, 2018).

Throughout the years, psychologists have updated these tests with new analyses

and norm samples, while also adding or removing subtests. Despite the revisions that have

occurred over the decades, the revisers of the Wechsler scales or the Stanford-Binet have never

completely replaced every subtest when updating an intelligence scale. The result is that

contemporary versions of these tests are an amalgamation of old subtest formats and modern test

construction methods.

It is the legacy of these old subtests on modern tests that intrigued us. Knowing that many

subtests on the Stanford-Binet or the Wechsler scales long predate the current versions of these

tests, we investigated the origins of these subtests, hoping to find the earliest publication of the

subtest format in the scholarly literature. Throughout the history of the changes to the subtests,

there has never been a compilation of the origins of the subtests on popular intelligence scales.

Considering many of the subtests that have long been part of the SB or the Wechsler scales are

still in use today, it is important to understand where they came from. The origin of these

subtests provides valuable information about the creation of the SB and Wechsler scales and may

shed light on test theory and test score interpretation. We believed that understanding the history

of subtests would lead intelligence test users to have a greater appreciation of these subtests.

Moreover, we have engaged in this historical research with the goal of correcting

misconceptions that psychologists have about the origin of frequently used intelligence subtests.

For example, in one article the authors claimed that Corsi invented the block tapping task in

1972 (Wongupparaj, Wongupparaj, Kumari, & Morris, 2017, p. 72). In reality, we show below

that the task was invented in 1913. Likewise, we found multiple sources (e.g., Boake, 2002;

Frank, 2013) that stated that the picture completion subtest (found on the WAIS-IV) originated

with Healy (1914), but we discovered that Healy’s task is different from the modern subtest,

which originated with Binet (see below). We believe that such misconceptions are probably

common. An incorrect understanding of the origin of a subtest may limit the thoroughness of

literature searches about psychometric validity or the Flynn effect. Finally, research about the

subtests’ psychometric properties outside of the context of the SB or Wechsler scales can

strengthen scientists’ interpretations of what these tests measure.

Search Procedures

The task of identifying the origin of subtests may seem easy at first glance, but there are

circumstances that make the task difficult. When the SB or Wechsler scales were first created or

later updated, the test creators or revisers often did not state any origins of the subtests on their

scales, let alone provide any citations for the first description of subtests. Modern test manuals

for these tests are silent on the issue of the origin of their subtests, probably because many

readers do not find information on the origins as important as technical data (e.g., validity of

interpretations and score reliability), administration instructions, or interpretive guidelines.

Lastly, throughout their history, many of these subtests have been known by different names or

were changed slightly (e.g., from written format to oral). These changes sometimes made it hard

to track down a particular subtest’s origin.

Our search procedures for these tests started with a careful reading of lengthy accounts of

the early history of intelligence testing (Boake, 2002; Matarazzo, 1972; Peterson, 1926/1969;

Wolf, 1973; Young, 1924). When these works discussed a particular subtest that resembled a

subtest on a modern Wechsler scale or the Stanford-Binet 5, we investigated literature that the

author cited so that we could track down the original source of the subtest. We did the same for

two sources about the history of a specific subtest (Richardson, 2005, 2011). Additionally, we

consulted the manuals for the original Stanford-Binet (Terman, 1916), the Army Alpha and

Army Beta (Yerkes, 1921), and we read English translations of Binet’s reports of his original

scales (Binet, 1911/1916; Binet & Simon, 1905a/1916, 1908/1916) to understand which subtests

appeared on these influential instruments and to try to link them with modern subtests. Finally,

we conducted searches of each subtest’s names in Google Scholar and PsycInfo in an effort to

search for any earlier mentions of the subtests than what we had found.

We conducted a separate search for each subtest, and we never searched for

multiple subtests’ origins at the same time. Once we identified an early use of a subtest, we

verified that the description did indeed correspond to modern intelligence subtests. (This was

important because sometimes a verbal description, such as “picture completion,” did not

correspond to a modern subtest.) In an effort to verify that we had indeed found the earliest

publication or description of a test, we would then search for earlier sources than what we had

found. To do this, we first searched the source’s citations in an effort to find any earlier

indications of the subtest’s use in the scholarly literature. We also conducted searches of

scholarly databases using terminology we found in the article to look for earlier sources. When

we exhausted these avenues and failed to find any earlier sources, we stopped the search for an

earlier publication of a subtest.

Subtests

Table 1 lists all of the subtests found in the Stanford-Binet 5, the WAIS-IV, the WISC-V,

and the WPPSI-IV. Subtests with very similar formats are combined into a single row. For

example, the WISC-V Digit Span, WAIS-IV Digit Span, WISC-V Picture Span, and WISC-V

Letter-Number Sequencing all require examinees to repeat in order a sequence of stimuli that

have been presented. Although the stimuli and/or difficulty differ, the required tasks are all

sufficiently similar that we saw the later subtests (e.g., WISC-V Letter-Number Sequencing) as

an adaptation of the original test (i.e., the Digit Span subtest). Thus, for each row in Table 1 we

only searched for a single subtest origin. Finally, readers should note that the subtests in Table 1

and in the rest of this section are listed in alphabetical order; when subtests are combined, we

listed the most widely known name for the subtest first.

INSERT TABLE 1 ABOUT HERE.

Arithmetic/Verbal & Nonverbal Quantitative Reasoning

Arithmetic items consist of questions relating to mathematics such as addition,

subtraction, multiplication, and division and have been on intelligence tests for a long time.

According to Wechsler, arithmetic items were used as a “rough and ready measure of

intelligence” as early as the late 1800s (1944, p. 82). Arithmetic items were common on

academic achievement tests before Binet, but these questions would not have been standardized

among different tests. Not only were arithmetic tests found on the original Wechsler scales, but

these types of items can also be found on the Army Alpha (Yoakum & Yerkes, 1920), Binet’s

scales (e.g., “Counting 4 single sous” on the 1908 scale; Binet &

Simon, 1908/1916), and on a test of reasoning ability created by Bonser (1910), though these do

not predate Binet and Simon’s (1908) use. Stone (1908) also created a standardized arithmetic

test that resembles items found on early intelligence tests, though it is unclear whether his work

had any influence on the creators or revisers of the Stanford-Binet or Wechsler tests.

The verbal quantitative reasoning subtest found on the Stanford-Binet 5 consists of items

where subjects are asked to count, perform addition and subtraction problems, and name

numbers. This test is extremely similar to arithmetic, so we believe that it has the same origin as

the arithmetic subtest.

Due to the recent demand for more nonverbal items on intelligence scales, a

nonverbal quantitative reasoning subtest was formed for the SB5. The main difference between

nonverbal and verbal quantitative reasoning is that the verbal version of the test has the questions

written in words and numbers, while the nonverbal version uses pictures to ask arithmetic

questions. Even though the nonverbal quantitative reasoning subtest uses pictures instead of

words, its origins can be traced back to the same place as arithmetic items (which use words to

ask math questions). Verbal arithmetic-like items have not only been found on the early Binet

scales (Binet & Simon, 1908/1916), but have also been used to quickly measure intelligence

even before psychometrics was developed (Wechsler, 1944). It is likely that these types of items

were also used on exams such as academic achievement tests. Nonverbal forms of arithmetic

questions may have been used for other academic purposes as well, but the first time nonverbal

quantitative reasoning items appeared on an intelligence scale was on the SB5, the latest version

of the Stanford-Binet.

Block Design

In the block design subtest on the Wechsler tests, the examinee recreates a picture or

model they have seen with blocks. This subtest was first published in Kohs’s 1920 article, “The

Block-Design Tests.” Kohs stated in the opening paragraphs of his article that his goal was to

create a performance task that could measure intelligence without using language in the

instructions or executing the task. According to Boake (2002), Kohs based his cube task on a

game of the time named Color Cubes, which were being used already in classrooms to teach

children to imitate visual designs and learn colors (e.g., The Special Class Teachers’ Club,

1917).

Block Span

In the block span subtest, an examinee is shown an array of blocks, which the examiner

taps in a predetermined order. The examinee must then repeat the sequence of taps. (In some

intelligence test batteries, this test is called the Corsi block test; we call it “block span” because

this is its name on the SB5.) Although the block span subtest resembles the more common digit

span subtest (see below), the two subtests have different origins. The block span task traces its roots

back to Knox’s (1913) Cube Imitation Test, which was designed as part of a non-verbal

test battery to identify immigrants at Ellis Island who had intellectual disabilities (Richardson,

2011). In Knox’s original version, the examiner would tap a series of larger cubes in a

predetermined order with a smaller cube; the examinee was to then use the smaller cube to repeat

the sequence. In later versions, the smaller cube was replaced with another object, generically

called a “pawn” (Richardson, 2005). In the modern SB5 block span subtest, the pawn has been

removed from the test, with the examinee instead using their fingers to tap the blocks in the order

they are shown. It is interesting to note that because screening immigrants was a task for

physicians, Knox had a distinctly medical viewpoint of intellectual disabilities and recommended

that only physicians administer his test battery, including the cube imitation test.

Cancellation

In the cancellation subtest, examinees are given a paper with a wide variety of random

symbols on it (e.g., jumbled letters of the alphabet). The examinee is then told to cross out every

example of a particular target symbol that they can find (e.g., every “B”). The earliest mention

we can find of this test comes from Peterson (1926/1969, pp. 79-80), who stated that Oehrn

reported in his 1889 dissertation a cancellation test of sorts that would ask the subjects to find

certain letters. Oehrn’s cancellation test was one of three that he used to measure “perception,”

the others requiring examinees to count the number of letters printed randomly on a page and to

notice errors as they proofread a passage (Peterson, 1926/1969; Spearman, 1904).

Coding/Animal Coding

In the coding subtest, examinees are given a key for symbols and are asked to encode a

message based on that same key. For example, the key could be as simple as 1 = A, 2 = B, etc.

With this sort of code, the examinee would then be asked to encode a message, like converting

“cat” to “3-1-20”. An influential use of the coding subtest was the “digit symbol subtest” found

on the Army Beta (Yerkes, 1921; Yoakum & Yerkes, 1920), which required examinees to

convert numbers to geometric symbols. The Army Beta creators credited the first appearance of

a coding subtest to Pyle’s (1913) book, where he called it the “substitution test.” However,

Dearborn (1910) seems to be the first researcher to use this type of subtest. In his study,

Dearborn gave different tasks to college students in their classroom. One of them was “The

Practice Experiment,” which strongly resembles the modern-day coding subtest. Dearborn

(1910) believed that this task could measure the speed at which a person could master a new

piece of information (i.e., the code) and reproduce it. However, it is important to note that

Dearborn administered the task across multiple days, whereas modern tests only administer

coding tasks on a single day.
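
The letter-to-number example above can be sketched in a few lines of Python. This is only an illustration of the substitution logic under an assumed key (A = 1 through Z = 26); it is not the scoring procedure of any published test, and the names KEY and encode are hypothetical.

```python
import string

# Assumed key from the example in the text: A = 1, B = 2, ..., Z = 26.
KEY = {letter: str(i) for i, letter in enumerate(string.ascii_uppercase, start=1)}


def encode(message, key=KEY):
    """Replace each letter with its code and join the codes with hyphens.

    Characters that are not in the key (spaces, punctuation) are skipped.
    """
    return "-".join(key[ch] for ch in message.upper() if ch in key)


print(encode("cat"))  # 3-1-20
```

Swapping the values in KEY for arbitrary symbols would give a digit-symbol style variant of the same task.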

Comprehension

During a comprehension subtest, the subject is asked to produce the answer to a question

that is not considered a “fact,” but which can be answered using previously learned informal

knowledge. Wechsler (1944) acknowledged that comprehension questions predate the creation of

his instruments. The original questions seem to appear on Binet’s original scale (Binet & Simon,

1905a/1916) under the section titled “Reply to an Abstract Question,” as well as the 1908 scale

(under the subtest titled “Comprehension Questions”; see Binet & Simon, 1908/1916). One

example from the third version of Binet’s test is, “When one breaks something belonging to

another what must one do?” (Binet, 1911/1916, p. 224). In his original scale, Binet stated that,

“This test is one of the most important of all, for the diagnosis of mental debility” (Binet &

Simon, 1905a/1916, p. 65). Wechsler (1944) also mentions that comprehension tests are those

that involve common sense, knowledge of practical information, and the ability to use past

experience (p. 81).

Delayed Response

In the delayed response subtest on the SB5 there are three cups, with a toy hidden under one of them.

After the three cups are mixed around, the examinee tries to select the cup that has the toy underneath it.

According to Roid and Barram (2004), this subtest is based on the “classic shell game” (p. 39),

and is a measure of short-term memory. The “classic shell game” has been used in criminal

activity (e.g., three-card monte) and as a magic trick. (However, it is important to note that the

delayed response subtest found on the SB5 lacks the deception of the classic shell game or a

sleight-of-hand trick.) Though the SB5 seems to be the first time that the shell game has

appeared on an intelligence scale, this task long predates intelligence tests. Apparently, this game

came to America in the 18th century from England as a variant called “thimble-rig” (Maurer,

1947). Though thimble-rig was played with thimbles instead of cups, it still had the basic

concept of the subject determining which cover had the object underneath it. Different versions

of thimble-rig are still played today and have been for centuries.

Digit Span/Picture Span/Letter-Number Sequencing

In the digit span subtest, the examiner verbally gives a series of one-digit numbers which

the subject must repeat. In some variants (often called backward digit span or reverse digit span),

the subject must repeat the sequence backwards. The picture span subtest is extremely similar;

during the task, the examinee is shown a set of pictures and then the subject must select the

pictures (preferably in a specified order) from a different array of images. The letter-number

sequencing subtest consists of verbally giving the examinee a set of numbers and letters and then

asking the examinee to repeat them back in alphabetical and numerical order.
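
As a rough sketch, the expected response on a letter-number sequencing item could be computed as follows. The description above does not say whether numbers or letters are given first, so this sketch assumes numbers come first (ascending) followed by letters (alphabetical); the function names are hypothetical.

```python
def expected_response(stimuli):
    """Return the reordered sequence for a letter-number sequencing item.

    Assumption (not stated in the text): digits first in ascending order,
    then letters in alphabetical order.
    """
    digits = sorted((s for s in stimuli if s.isdigit()), key=int)
    letters = sorted(s.upper() for s in stimuli if s.isalpha())
    return digits + letters


def is_correct(stimuli, response):
    """Check an examinee's response against the expected reordering."""
    return [r.upper() for r in response] == expected_response(stimuli)


print(expected_response(["T", "9", "A", "3"]))  # ['3', '9', 'A', 'T']
```

A plain forward digit span item would skip the reordering and compare the response with the stimuli as presented; a backward item would compare it with the reversed list.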

The origin of the digit span subtest was an article by Jacobs (1887) in which he described

studies on school-age children in which the examiner read numerals out loud twice and then

required the subjects to repeat the numbers (either aloud or written on paper). Jacobs was

inspired by Ebbinghaus’s research on memorizing nonsense syllables, and he believed that the digit

span task would have more accessible stimuli as a test of short-term memory capacity. This task

found its way to the 1905 version of Binet’s scale, where it was named “Repetition of Three

Figures” (Binet & Simon, 1905a/1916) because Binet had his examinees repeat back three

numbers (figures) that the examiner gave orally. It is not clear whether Binet was aware of

Jacobs’s (1887) article, but Jacobs showed that performance on digit span was better for

successively older children. This age progression in performance was a characteristic that Binet

saw as desirable in a task because he believed that intellectual ability increased (on average) with

age in children.

Digit span is a perennially popular subtest on intelligence tests, and innovations are not

unusual. Terman (1916, p. 207) credited Bobertag with inventing the backward digit span test in

1911. Blair (1957) suggested a nonverbal task which we see as the precursor to the modern

picture span subtest on the WAIS-IV and WISC-V. Blair’s task was designed to measure

memory span in deaf and hearing children by showing young examinees a series of cards with

visual stimuli; the child must then point to the stimuli in the same order on a set of identical

response cards. Modern users of the WISC-V and WAIS-IV are familiar with the letter-number

sequencing subtest, which requires the series of stimuli to be repeated in either ascending order

(for numbers) or alphabetical order (for letters). This subtest appeared on the WAIS-III

(Wechsler, 1997) and the version of the Wechsler Memory Scale that appeared the same year.

We see all of these tasks as adaptations of the original digit span task that Jacobs (1887)

proposed.

Early Reasoning

The SB5 includes the early reasoning subtest, which requires the young examinee to use

pictorial stimuli and tell a story about the image based on visual cues. The second Binet scale

(Binet & Simon, 1908/1916) is a clear predecessor for this test; the 1908 scale has three images,

each containing at least one human figure. The child then was asked to describe the picture, and

more complex responses based on interpretation (rather than simply naming objects in the

image) were viewed as indicative of greater intellectual ability. Binet found this subtest so useful

when diagnosing intellectual disabilities that he wrote, “Very few tests yield so much

information as this one. . .. We place it above all the others, and if we were obliged to retain only

one, we should not hesitate to select this one” (Binet & Simon, 1908/1916, p. 189).

Figure Weights

In the figure weights subtest found on the WISC-V and WAIS-IV, the examinee is shown

an image of scales with different weights on both sides. The examinee then chooses what type of

weights would balance a third scale. The figure weights task seems to be most similar to the

original Piagetian balance beam task, which Inhelder and Piaget introduced in 1955 as a task that

can indicate whether a child has reached the formal operational stage of reasoning (de

Ribaupierre & Lecerf, 2006). The main difference between the two is that figure weights is two-

dimensional (on paper) and is only based on different colors and shapes while the Piagetian

balance beam task uses actual weights in the item administration process. Even so, the Piagetian

balance beam task is a clear precursor to the figure weights subtest.



Form Board/Form Patterns/Visual Puzzles/Object Assembly

The form board and form patterns subtests contain tasks that ask the examinee to match

geometric shapes to other geometric shapes. Early versions of the form board resembled modern

puzzles for young children, with wooden pieces that had to be placed into matching shapes that

were cut into a wooden board. Form boards are among the oldest subtests still in use today; Jean

Marc Gaspard Itard was the first to use a form board-like task when he studied and educated a

young boy found in the wild (named the “wild boy of Aveyron”) in 1798. Itard’s successor and

colleague, Édouard Séguin, made more permanent versions of the same test, and Séguin’s widely

read descriptions of form boards resulted in their popular usage among psychologists and

physicians studying and training children and individuals with intellectual disabilities

(Richardson, 2011; Sylvester, 1913).

The very similar visual puzzles and object assembly subtests have an origin in the puzzles

used for entertainment and geography education, which were first created in the 1750s in

England and were in widespread use in the early 20th century when the first intelligence tests

were being created (Norgate, 2007). Both form boards and visual puzzles/object assembly were

incorporated into nonverbal testing settings (Richardson, 2011), and a paper-and-pencil version

of object assembly—in which the examinee must divide a square to show how a set of two or

three shapes can form the entire square—was present on the Army Beta (Yoakum & Yerkes,

1920).

Information

According to Wechsler (1944), “Questions formulated to tap the subject’s range of

information have, for a long time, been the stock in trade of psychiatric examinations, and prior

to the introduction of standardized intelligence tests they were widely used by psychiatrists in

estimating the intellectual level of patients” (p. 77). The original Binet scale also had information

items, such as, “Giving the name of four common coins,” and asking the examinee to give their

age (Binet & Simon, 1905a/1916). Binet and Simon (1905b/1916) gave explicit credit to the

French physicians Blin (1902) and Damaye (1903) for these items about topics that a person

exposed to a given culture could reasonably be expected to know. The original items that

inspired Binet covered a variety of topics, including questions about the body and age in general;

they are clear sources for the information subtest on modern intelligence tests. However, Hall

(1893, pp. 16-22) published a series of items administered to children in two cities. While most

of these were vocabulary items or asked whether the child had seen certain items or events (e.g., seen

a watchmaker at work, or seen an axe), some of them include information-type items. Examples

of these include whether they know “That leathern things come from animals,” “What bricks are

made of,” and “Origin of butter” (all examples from Hall, 1893, p. 20). Hall (1893) believed that

effective teaching required relating new information to what the child already knew; therefore,

an understanding of children’s vocabulary and information about the world around them would

be pedagogically useful. While the information items Hall (1893) used do not seem to have

influenced Binet, Terman did acknowledge their influence on early Stanford-Binet information

subtests (Terman, 1924, p. 111).

Many later psychologists created their own information items that were culturally

appropriate for their examinees. An example of this is on the Army Alpha, which has the

following information item: “The pitcher has an important place in a) tennis b) football c)

baseball d) handball” (Yoakum & Yerkes, 1920, p. 274).

Last Word

In the last word subtest on the SB5 the examinees are (1) asked a question, (2) prompted

to answer, and then (3) asked to remember the last word of the question. Roid and Barram (2004,

p. 49) say that the last word subtest is based on a task reported in Daneman and Carpenter’s

(1980) study, making it one of the newest subtests found on the SB5 and the modern versions of

the Wechsler scales. Daneman and Carpenter (1980) created this task to measure working

memory capacity in college students.

It is important to note the differences between the modern SB5 last word subtest and

Daneman and Carpenter’s (1980) task. The earlier authors had their subjects read different

sentences aloud and then recall the last word of each sentence (in the order that the sentences

were presented) after reading the final sentence. Moreover, Daneman and Carpenter (1980) did

not require examinees to answer a question. Despite the differences, the connection with

Daneman and Carpenter’s test is undeniable.

Matrices (Object Series/Matrix Reasoning)

While taking the matrices test, the subject is shown a pattern of geometric figures and is

asked to complete the pattern. Although the SB5 and all the modern Wechsler tests contain

matrix items, the best-known tests to use matrix items are the Raven’s tests (the Raven’s Coloured

Progressive Matrices, Raven’s Progressive Matrices, and Raven’s Advanced Progressive

Matrices), a series of nonverbal matrices tests that are extremely good measures of fluid

intelligence. Because the Raven name has always been associated with matrix items, the origin

of this subtest is not obscure. Penrose and Raven (1936) were the first to describe a matrix,

though Raven’s (1939) article is a more widely known early report of matrix items. Matrix tasks

were designed to be a measure of “innate mental capacity” (Penrose & Raven, 1936, p. 7) that

did not rely heavily on language or educational experiences.



Memory for Sentences

In the memory for sentences subtest, the examinee must repeat back a sentence that is

read to them. The 1905 Binet Scale included a subtest titled “Repetition of 15 Word Sentences,” in which

the examinee was also supposed to repeat back a sentence (Binet & Simon, 1905a/1916). But

according to Wolf (1973, pp. 86-87), Binet and Victor Henri used a memory for sentences test as

early as 1892, and this earlier work presaged Binet’s use of the subtest on his first intelligence

scale. Consistent with Binet’s emphasis on studying complex mental capacities instead of simple

abilities, Binet believed that memory of entire sentences was more useful as a measure of

cognitive development than memory for isolated words. Memory for sentences had a sharper age

performance gradient than memory for isolated words, which Binet saw as a useful

characteristic in mental tasks (Wolf, 1973, p. 87).

Picture Absurdities

In a picture absurdities subtest, an examinee is shown a picture that has something wrong

or absurd in it. The examinee then must explain what is “absurd” about the picture. For example,

a picture might depict a firefighter holding a hose with flowers, instead of water, emerging from

it. Although conceptually similar to the verbal absurdities subtest (see below), the

picture absurdities subtest emerged independently years later. The earliest description of this

subtest in English is Terman and Chamberlain’s (1918) description of an “absurd pictures”

subtest containing images such as “A man with three legs” and a man smoking an upside-down pipe.

Terman claimed in this article that he was inspired in 1914 by a “picture puzzle” in a children’s

magazine in which many objects within a picture were “. . . so drawn as to contain an absurdity”

(Terman & Chamberlain, 1918, pp. 347-348). However, in this description, Terman also

mentioned that he was unaware of Rossolimo’s “test of this kind.” No exact citation to

Rossolimo’s test is given, but we found two references in 1911 to works by Rossolimo that could

have been Terman’s sources. One article in English contains an offhand mention of a test of

“pictures containing absurdities” to measure “comprehension” (Rossolimo, 1911a, p. 212). A

second, more detailed article in German describes a picture absurdities subtest containing 30

images (10 for children, 10 for uneducated adults, and 10 for educated adults) as part of a mental

test battery (Rossolimo, 1911b, pp. 278-279). This article does not reproduce any of the images,

but the descriptions clearly describe images that would be on a picture absurdities subtest. For

example, one item for children consisted of, “A lady is reading a book with her eyes blindfolded,

with glasses put over the bandage” (Rossolimo, 1911b, p. 278, translated via Google Translate).

These two articles by Rossolimo are the earliest record of a picture absurdities subtest we have

found.

Picture Completion

During the picture completion subtest, the examinee is shown pictures that have

something missing. The examinee is then asked to fill in whatever is missing in the picture. The

picture completion subtest’s origin can be traced to Binet’s 1908 subtest called “Unfinished

Pictures” (Binet & Simon, 1908/1916). Both the original subtest and its modern equivalent on

the WAIS-IV have similar instructions, though the artwork is much more sophisticated in the

modern subtest. Figure 1 shows some examples of the “Unfinished Pictures” task found on the

1908 Binet-Simon scale. Picture completion items were also found on the Army Beta (Yoakum

& Yerkes, 1920).

INSERT FIGURE 1 ABOUT HERE.

Picture Memory/Picture Naming



During the picture memory subtest, examinees are briefly shown some images and

are afterwards asked to choose from a new group of pictures the images they had previously

seen. Binet’s original scale has a subtest titled “Exercise of Memory of Pictures” (Binet &

Simon, 1905a/1916), which he called “. . . a test of attention and visual memory” (p. 60). The

original version of this task shares the same format of presenting a pictorial stimulus and then

asking the examinee to recall it later.

Position and Direction

The position and direction subtest (found on the SB5) asks the examinee to move items

based on commands that include words related to spatial position (e.g., “on,” “inside”). More

difficult items designed for older age groups ask the examinee to imagine rotating in various

directions sequentially (e.g., “left,” “right,” “north,” “south”) and then to state what direction the

examinee would face after the hypothetical sequence is completed. Both types of items appear on

the 1937 Stanford-Binet (Terman & Merrill, 1937), but had been reported nearly twenty years

before (Terman & Chamberlain, 1918, pp. 344-345) as part of a pilot procedure for 23 subtests,

many of which appeared on later intelligence tests.

Procedural Knowledge

In the procedural knowledge subtest, the subject is shown a series of cards and is asked to

describe how they would either use the object or perform the task shown on the card. In Wolf’s 1973

biography of Binet, she mentioned that Damaye’s (1903) study of normal cognitive development

included several questions that resemble information items. One specific type of item they

mentioned was asking the child about an object, especially asking them to describe the use of it

(p. 173). This description highly resembles procedural knowledge. Although the modern

procedural knowledge subtest differs slightly in that it uses cards and pictures,

Blin and Damaye’s question is the oldest description of an item resembling procedural

knowledge.

Similarities

While taking the modern-day similarities tests, the examinee is asked to qualitatively

define the relationship between a pair of words provided to them. The 1905 version of Binet’s

scale mentions a subtest called, “Resemblances of Several Known Objects Given from Memory”

(Binet & Simon, 1905a/1916). This subtest examined the subject’s ability to compare common

objects and state how they are similar. Earlier than this test, though, Binet and Henri suggested

using a similarities task to measure comprehension in children as early as 1895 (Peterson,

1926/1969, p. 89).

In the picture concepts subtest, the examinee chooses which images among those

provided all share a common characteristic. This focus on commonalities makes this subtest

greatly resemble the similarities subtest, which requires examinees to identify how objects

presented verbally are similar. We see the picture concepts subtest as having the same origin,

which is Binet’s 1905 subtest, “Resemblances of Several Known Objects Given from Memory” (Binet

& Simon, 1905a/1916). As far as we could find, the first time this picture form of a similarities

subtest was published was as part of the WPPSI-III (Wechsler, 2002).

Symbol Search/Bug Search

In the symbol search subtest, the examinee is shown one or two target images and is then

supposed to determine whether there is a matching symbol from another set of images. The

WPPSI’s bug search is a similar test, but with target images that are cartoon bugs (as age-

appropriate stimuli). These subtests are similar to matching activities designed for children. As

formatted for an intelligence test, though, it was first seen on the WISC-III (Wechsler, 1991),

and it is seen as a measure of “perceptual organization, fluid intelligence, and planning and

learning ability” (Weiss, Saklofske, Holdnack, & Prifitera, 2016, pp. 13-14).

Verbal Absurdities

The verbal absurdities subtest consists of statements which have something false in them

that the examinee is supposed to identify. In Binet’s second test, he had a subtest named

“Criticism of Sentences” (Binet & Simon, 1908/1916, pp. 227-229). A memorable (though

gruesome) example can be found on the 1908 Binet scale: “Yesterday they found on the

fortification the body of an unfortunate girl, cut into eighteen pieces. It is believed that she killed

herself” (Binet & Simon, 1908/1916, p. 228). The same subtest (retaining some of Binet’s

sentences, including the one we quote) was called “Absurdities” in the 1916 Stanford-Binet

(Terman, 1916). According to Wolf (1973, pp. 147-148), Binet suggested this type of test much

earlier in 1896 as one of his measures of “comprehension.”

Verbal Analogies

The current verbal analogies test first gives the examinee two words whose relationship

the examinee must understand. The examinee is then given a third word and is

supposed to generate a fourth word that has the same relationship with the third word that the

first two words have with one another. The earliest mention we have been able to find of verbal

analogy items is in an 1894 article by Binet when he suggested items that asked examinees to

define the relationship between two words (Wolf, 1973). Using this item format, Binet believed

that “. . . one would certainly arrive at a test of judgment and of other complex functions” (as

quoted in Wolf, 1973, p. 93). For example, in Binet’s version of analogies, the examinee would

be asked to explain the relationship between the word spoon and soup. Though Binet’s 1894

suggestion for a test isn’t exactly the same as the current version of verbal analogies, it still asks

the examinee to discover the relationship between two words. We see this as a clear precursor to

modern verbal analogy items.

Vocabulary/Receptive Vocabulary

Vocabulary subtest and receptive vocabulary subtest items require the examinee to define

a word from a standardized list, whether in a multiple choice or free-response format. These

types of items are on the current version of the Stanford-Binet and all three of the Wechsler

intelligence scales, but predate the original versions of these modern tests. The first time that

vocabulary items were seen on an intelligence test specifically was in Binet’s original 1905 test

(Binet & Simon, 1905a/1916). Both Wolf (1973, p. 84) and Matarazzo (1972, p. 32) stated that

Binet published a vocabulary test in 1890, a full 15 years before his first intelligence scale.

However, vocabulary tests were part of educational testing in the 19th century and were not

unique to the realm of intelligence testing (e.g., see Hall’s, 1893, summary of an 1869 German

report of the performance of 10,000 children on a German vocabulary test). These vocabulary

tests often functioned as academic achievement tests, though Hall (1893) saw understanding a

child’s vocabulary as serving as a foundation for future teaching (see the subsection on the origin

of the information subtest).

Zoo Locations

A new subtest on the WPPSI-IV is the zoo locations subtest. For this task, the examinees

are shown zoo animals at a specific location on a simple two-dimensional map. Later, the

examiner asks the child to place the zoo animals in the locations where they were previously shown. A

precursor of this subtest is the 7/24 test (originally created by Barbizet & Cany, 1968), which

required examinees to reproduce from recall a random pattern of 7 dots by placing round tokens

on a 24-square grid.

Discussion

Tracing the origins of all subtests found on the SB5, WAIS-IV, WISC-V, and WPPSI-IV

is a project that has not been undertaken before. We wrote this article to give intelligence test

users an appreciation for the history of these subtests and also explain more about the creation of

the original scales. Perhaps with a knowledge that most subtests have been in use for many years,

practitioners can have more confidence in their use of these item formats because they can know

that these subtests have been subjected to repeated investigation for several decades.

Many of the origins of the modern subtests date back over a century. Indeed, the median

year of publication for the subtests was 1908, and only three subtests originated after 1980. Most

subtests on the Wechsler tests and SB5 have withstood the test of time and have accumulated a

large body of validity research, proving their utility in measuring intelligence. Moreover,

continuity in subtests gives researchers tools for longitudinal testing, research into development

and aging, and the investigation of population-level trends (e.g., the Flynn effect). Using the

same subtest formats over the decades also permits research and knowledge about how these

items function to accumulate. A timeline showing our proposed candidates for the first

publication of each test is displayed in Figure 2.

INSERT FIGURE 2 ABOUT HERE.

One discovery that we found striking was the diverse sources of inspiration for subtests.

While the majority did have roots in the creation of cognitive tests, others have their origin in

games (the delayed response subtest, the object assembly subtest), classroom lessons (the block

design subtest), the study of a feral child (form boards and related subtests), school assessments

(vocabulary subtest) and more. To us, this means that items on intelligence tests often have a

connection with the real world—even when they are presented in a standardized, acontextual

testing setting. Additionally, this undercuts the suggestion that critics of intelligence testing often

make that intelligence test items are meaningless tasks that are divorced from any relationship to

an examinee’s environment (e.g., Gould, 1981).

On the other hand, one criticism of intelligence tests seems justified from our study:

subtests that appear on popular intelligence tests have changed little in the past century (Linn,

1986). While one could argue that the enduring appeal of these subtests is due to their high

performance in measuring intelligence, the fact remains that many of these subtests were often

created with little guiding theory or understanding of how the brain and mind work to solve

problems (Naglieri, 2007). While sophisticated theories regarding test construction and the

interrelationships of cognitive abilities have developed in recent decades (e.g., Carroll, 1993), it

is often not clear exactly how the tasks on modern intelligence tests prompt examinees to use their

mental abilities to respond to test items.

It is apparent from our research that as the creators and revisers of the Stanford-Binet and

Wechsler tests have considered new item formats, they have taken inspiration (in a very direct

way) from pre-existing item formats. From a pragmatic perspective, this makes sense; using an

existing item format is easier than inventing a new one. Moreover, these item formats often had

research supporting their use in intelligence testing, whereas a new subtest would not.

Additionally, copyright laws only protect the exact item content—not the general format of a

subtest. Therefore, we believe that most psychologists creating or revising the Wechsler or

Stanford-Binet tests found it expedient to reuse existing item and subtest formats, which is why

most subtest formats on modern intelligence tests are over 100 years old.

We also wish to draw the reader’s attention to the fact that there have been many subtests

developed that do not appear on current intelligence scales. An example of this is Porteus’s

(1915) maze task, which appeared on the Army Beta and in some iterations of the WISC.

Though the test functioned well as a non-verbal test of intelligence (see Porteus, 1965, for a

review), other non-verbal tasks have surpassed it, and it does not appear on any Wechsler test

today or the SB5. This demonstrates that intelligence testing is not a static technology. Rather,

test creators and revisers frequently re-examine subtests to determine the best available tasks to

include on an intelligence test. Likewise, some tests, such as the Woodcock-Johnson IV,

Differential Ability Scales II, and the Kaufman Assessment Battery for Children II,

are all more popular than the SB5. While there is overlap between these tests’ subtests and

the subtests we explored in this article, an exploration of the origin of the item formats used on

these more popular tests would be beneficial.

Our work highlights the influence of a small number of intelligence tests: Binet’s scales,

the Army Alpha and Army Beta, the original Stanford-Binet, and the early Wechsler tests. Many

of the subtests we examined either originated in these scales or reached a large audience of

psychologists via their inclusion on these tests. Other tasks are often capable of measuring

intelligence, such as executive functioning tasks (Brydges, Reid, Fox, & Anderson, 2012), but

these rarely find a place on the Wechsler or Stanford-Binet tests. Although we admire the work

of the early pioneers of intelligence testing (Warne, 2019; Warne, Burton, Gibbons, & Melendez,

2019), we believe that modern test revisers and creators would benefit from examining the work

of modern cognitive psychologists for inspiration in creating new test formats.

We also wish to emphasize that even though subtest or item formats in some cases have

been consistent through the decades, this does not imply that item content has remained

unchanged. For example, words on vocabulary or information subtests have often changed as

tests have been revised. Modern examinees who take these subtests do not receive the same

questions as examinees did a hundred years ago; rather, the general format of the test item and

what the examinee is asked to do remain the same. Test creators and revisers regularly use the

framework of a subtest’s item format to create new stimuli to probe examinees’ mental abilities.

Indeed, test publishers now commonly revise intelligence tests at regular intervals to combat

breaches in item confidentiality, the Flynn effect, and item drift that may develop over time.

It is important to recognize that many subtests on other intelligence scales can also be

traced back to the early 20th century. For example, Binet and Simon’s (1905a/1916) original scale contained

a paper cutting task (suggested by Henri in 1898, according to Wolf, 1973, p. 150) in which the

examiner folded a paper and then cut a portion out. The examinee then had to draw what the

paper would look like unfolded. The SB5 and current Wechsler scales do not contain this subtest,

but the current version of the Cognitive Abilities Test (Lohman & Lakin, 2017) does. Lumosity,

a company that creates “brain training” computer programs, even calls this task “Thurstone’s

punched holes,” despite the fact that Thurstone did not invent the task (Simons et al., 2016).

Likewise, Ramful, Lowrie, and Logan (2016) stated that this test had a 1970s origin—misstating

the true creation of the test by over seven decades. Thus, it is likely that many other subtests have

origins that are often misunderstood by psychologists.

Our historical research showed that some of the item formats found on the original Binet-

Simon scale predate their inclusion on the famous 1905 Binet intelligence scale. Binet’s first

article on the cognitive development of children was published in 1890 (Wolf, 1973, p. 81), and

he pursued this line of work in almost total seclusion for over a decade (Wolf, 1973). Binet’s

lengthy practice in testing children’s cognition provided knowledge and experience that he drew

upon when he created his scales—a fact that has been noted by others scrutinizing the historical

record (e.g., Nicolas, Andrieu, Croizet, Sanitioso, & Burman, 2013). This contrasts with the

overly simplistic version of the history of testing that psychology students are often exposed to in

textbooks, which portray Binet’s 1905 scale as a sudden revolution in psychological

measurement (e.g., Coaley, 2014; Kaplan & Saccuzzo, 2018). This version of history also

ignores the contributions of individuals like Itard, Séguin, Henri, Damaye, and others who

provided Binet with item formats that he would use in his scales.

Despite our best efforts to identify the origins of different intelligence scale subtests, we

do not claim to have read every scholarly article about intelligence or cognitive testing that was

published in the early decades of this field. Although we did successfully trace most subtests to

publications that predate their first appearance on the SB or a Wechsler test, there is still the

possibility that some tests appeared even earlier than we realized.

Regardless of specific changes to item content or the exact mix of subtests on an

intelligence test, the majority of the subtests on the current versions of the Wechsler and

Stanford-Binet intelligence scales have origins dating back more than a century. We encourage

psychologists who use, study, or revise these tests—and other intelligence tests—to be aware of

this lengthy history. For many of these subtests the psychometric literature extends far beyond

the test manuals for the Wechsler and Stanford-Binet tests. Additionally, understanding the

origins of the subtests on modern intelligence tests can also help psychologists appreciate the

quality of these subtests, while also recognizing their shortcomings.



References

Barbizet, J., & Cany, E. (1968). Clinical and psychometrical study of a patient with memory

disturbances. International Journal of Neurology, 7, 44-54.

Bell, J. C. (1921). Group tests of intelligence. An annotated list. Journal of Educational

Psychology, 12, 103-108. doi:10.1037/h0073682

Binet, A. (1911/1916). New investigations upon the measure of the intellectual level among

school children (E. S. Kite, trans.). In A. Binet & T. Simon, The development of

intelligence in children (the Binet-Simon Scale) (pp. 274-329). Baltimore, MD: Williams

& Wilkins.

Binet, A., & Simon, T. (1905a/1916). New methods for the diagnosis of the intellectual level of

subnormals (E. S. Kite, trans.). In A. Binet & T. Simon, The development of intelligence

in children (the Binet-Simon Scale) (pp. 9-36). Baltimore, MD: Williams & Wilkins.

Binet, A., & Simon, T. (1905b/1916). Upon the necessity of establishing a scientific diagnosis of

inferior states of intelligence (E. S. Kite, trans.). In A. Binet & T. Simon, The

development of intelligence in children (the Binet-Simon Scale) (pp. 37-90). Baltimore,

MD: Williams & Wilkins.

Binet, A., & Simon, T. (1908/1916). The development of intelligence in the child (E. S. Kite,

trans.). In A. Binet & T. Simon, The development of intelligence in children (the Binet-

Simon Scale) (pp. 182-273). Baltimore, MD: Williams & Wilkins.

Blair, F. X. (1957). A study of the visual memory of deaf and hearing children. American Annals

of the Deaf, 102, 254-263.

Blin, E. (1902). Les débilités mentales. Revue de Psychiatrie, 8, 337-356.



Boake, C. (2002). From the Binet-Simon to the Wechsler-Bellevue: Tracing the history of

intelligence testing. Journal of Clinical and Experimental Neuropsychology, 24, 383-405.

doi:10.1076/jcen.24.3.383.981

Bonser, F. G. (1910). The reasoning ability of children of the fourth, fifth, and sixth school

grades (Teachers College Contributions to Education, no. 37). New York, NY: Teachers

College, Columbia University.

Brydges, C. R., Reid, C. L., Fox, A. M., & Anderson, M. (2012). A unitary executive function

predicts intelligence in children. Intelligence, 40, 458-469.

doi:10.1016/j.intell.2012.05.006

Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. New York,

NY: Cambridge University Press.

Coaley, K. (2014). An introduction to psychological assessment & psychometrics. Thousand

Oaks, CA: Sage.

Cronbach, L. J. (1975). Five decades of public controversy over mental testing. American

Psychologist, 30, 1-14. doi:10.1037/0003-066X.30.1.1

Damaye, H. (1903). Essai de diagnostic entre les états de débilités mentales. Paris, France:

Steinheil.

Daneman, M., & Carpenter, P. A. (1980). Individual differences in working memory and

reading. Journal of Verbal Learning and Verbal Behavior, 19, 450-466.

doi:10.1016/S0022-5371(80)90312-6

Dearborn, W. F. (1910). Experiments in learning. Journal of Educational Psychology, 1, 373-

388. doi:10.1037/h0073531

de Ribaupierre, A., & Lecerf, T. (2006). Relationships between working memory and

intelligence from a developmental perspective: Convergent evidence from a neo-

Piagetian and a psychometric approach. European Journal of Cognitive Psychology, 18,

109–137. doi:10.1080/09541440500216127

Frank, G. (1983). The Wechsler enterprise: An assessment of the development, structure, and use

of the Wechsler tests of intelligence. New York, NY: Pergamon Press.

Gould, S. J. (1981). The mismeasure of man. New York, NY: W. W. Norton.

Hall, G. S. (1893). The contents of children’s minds on entering school. New York, NY: E. L.

Kellogg & Co.

Healy, W. (1914). A pictorial completion test. Psychological Review, 21, 189-203.

doi:10.1037/h0075712

Jacobs, J. (1887). Experiments on “prehension.” Mind, 12(45), 75-79. doi:10.2307/1411258

Kaplan, R. M., & Saccuzzo, D. P. (2018). Psychological testing: Principles, applications, and

issues (9th ed.). Boston, MA: Cengage Learning.

Knox, H. A. (1913). The differentiation between moronism and ignorance. New York Medical

Journal, 98, 564-566.

Kohs, S. C. (1920). The block-design tests. Journal of Experimental Psychology, 3, 357-376.

doi:10.1037/h0074466

Linn, R. L. (1986). Educational testing and assessment: Research needs and policy issues.

American Psychologist, 41, 1153-1160. doi:10.1037/0003-066X.41.10.1153

Lohman, D. F., & Lakin, J. M. (2017). Cognitive Abilities Test (Form 8). Boston, MA:

Houghton-Mifflin Harcourt.

Matarazzo, J. D. (1972). Wechsler’s measurement and appraisal of adult intelligence (5th ed.).

Baltimore, MD: Williams & Wilkins Company.

Maurer, D. W. (1947). The argot of the three-shell game. American Speech, 22, 161-170.

doi:10.2307/3181790

Naglieri, J. A. (2007). Traditional IQ: 100 years of misconception and its relationship to minority

representation in gifted programs. In J. VanTassel-Baska (Ed.), Alternative assessments

with gifted and talented students (pp. 67-88). Waco, TX: Prufrock Press.

Nicolas, S., Andrieu, B., Croizet, J.-C., Sanitioso, R. B., & Burman, J. T. (2013). Sick? Or slow?

On the origins of intelligence as a psychological object. Intelligence, 41, 699-711.

doi:10.1016/j.intell.2013.08.006

Norgate, M. (2007). Cutting borders: Dissected maps and the origins of the jigsaw puzzle.

Cartographic Journal, 44, 342-350. doi:10.1179/000870407X241908

Penrose, L. S., & Raven, J. C. (1936). A new series of perceptual tests: Preliminary

communication. British Journal of Medical Psychology, 16, 97-104. doi:10.1111/j.2044-

8341.1936.tb00690.x

Peterson, J. (1969). Early conceptions and tests of intelligence. Westport, CT: Greenwood Press,

Publishers. (Originally printed in 1926).

Porteus, S. D. (1915). Mental tests for feeble-minded: A new series. Journal of Psycho-

Asthenics, 19, 200-213.

Porteus, S. D. (1965). Porteus maze test: Fifty years' application. Palo Alto, CA: Pacific Books.

Pyle, W. H. (1913). The examination of school children: A manual of directions and norms. New

York, NY: The Macmillan Company.



Ramful, A., Lowrie, T., & Logan, T. (2016). Measurement of spatial ability: Construction and

validation of the spatial reasoning instrument for middle school students. Journal of

Psychoeducational Assessment, 35, 709-727. doi:10.1177/0734282916659207

Raven, J. C. (1939). The R.E.C.I. series of perceptual tests: An experimental survey. British

Journal of Medical Psychology, 18, 16-34. doi:10.1111/j.2044-8341.1939.tb00705.x

Richardson, J. T. E. (2005). Knox’s cube imitation test: A historical review and an experimental

analysis. Brain and Cognition, 59, 183-213. doi:10.1016/j.bandc.2005.06.001

Richardson, J. T. E. (2011). Howard Andrew Knox: Pioneer of intelligence testing at Ellis

Island. New York, NY: Columbia University Press.

Roid, G. H., & Barram, R. A. (2004). Essentials of Stanford-Binet intelligence scales (SB5)

assessment. Hoboken, NJ: John Wiley & Sons.

Rossolimo, G. (1911a). Mental profiles: A quantitative method of expressing psychological

processes in normal and pathological cases. The Journal of Experimental Pedagogy, 1,

211-214.

Rossolimo, G. (1911b). Die psychologischen profil. Klinik für psychische und nervöse

Krankheiten, 6, 249-326.

Simons, D. J., Boot, W. R., Charness, N., Gathercole, S. E., Chabris, C. F., Hambrick, D. Z., &

Stine-Morrow, E. A. L. (2016). Do “brain-training” programs work? Psychological

Science in the Public Interest, 17, 103-186. doi:10.1177/1529100616661983

Spearman, C. (1904). "General intelligence," objectively determined and measured. American

Journal of Psychology, 15, 201-293. doi:10.2307/1412107

The Special Class Teacher’s Club. (1917). The Boston way: Plans for the development of the

individual child. Concord, NH: The Rumford Press.



Stone, C. W. (1908). Arithmetical abilities and some factors determining them (Contributions to

education Teachers College series No. 19). New York, NY: Teachers College, Columbia

University.

Sylvester, R. H. (1913). The form board test. Princeton, NJ: Psychological Review Company.

Terman, L. M. (1905). A study in precocity and prematuration. The American Journal of

Psychology, 16, 145-183. doi:10.2307/1412123

Terman, L. M. (1916). The measurement of intelligence: An explanation of and a complete guide

for the use of the standard revision and extension of The Binet-Simon Intelligence Scale.

Cambridge, MA: Houghton Mifflin Company.

Terman, L. M. (1924). The mental test as a psychological method. Psychological Review, 31, 93-

117. doi:10.1037/h0070938

Terman, L. M., & Chamberlain, M. B. (1918). Twenty three serial tests of intelligence and their

intercorrelations. Journal of Applied Psychology, 2, 341-354. doi:10.1037/h0072077

Terman, L. M., & Merrill, M. A. (1937). Measuring intelligence: A guide to the administration

of the new revised Stanford-Binet tests of intelligence.

Thorndike, R. L. (1975). Mr. Binet's test 70 years later. Educational Researcher, 4(5), 3-7.

doi:10.2307/1174855

Warne, R. T. (2019). An evaluation (and vindication?) of Lewis Terman: What the father of

gifted education can teach the 21st century. Gifted Child Quarterly, 63, 3-21.

doi:10.1177/0016986218799433

Warne, R. T., Burton, J. Q., Gibbons, A., & Melendez, D. A. (2019). Stephen Jay Gould’s

analysis of the Army Beta test in The Mismeasure of Man: Distortions and

misconceptions regarding a pioneering mental test. Journal of Intelligence, 7, 6.

doi:10.3390/jintelligence7010006

Wechsler, D. (1944). The measurement of adult intelligence (3rd ed.). Baltimore, MD: Williams

& Wilkins.

Wechsler, D. (1991). The Wechsler Intelligence Scale for Children (3rd ed.). San Antonio, TX:

The Psychological Corporation.

Wechsler, D. (2002). The Wechsler Preschool and Primary Scale of Intelligence (3rd ed.). San

Antonio, TX: The Psychological Corporation.

Wechsler, D. (2012). The Wechsler Preschool and Primary Scale of Intelligence (4th ed.). San

Antonio, TX: The Psychological Corporation.

Weiss, L. G., Saklofske, D. H., Holdnack, J. A., & Prifitera, A. (2016). WISC-V assessment and

interpretation: Scientist-practitioner perspectives. London, UK: Academic Press.

Wolf, T. H. (1973). Alfred Binet. Chicago, IL: The University of Chicago Press.

Wongupparaj, P., Wongupparaj, R., Kumari, V., & Morris, R. G. (2017). The Flynn effect for

verbal and visuospatial short-term and working memory: A cross-temporal meta-analysis.

Intelligence, 64, 71-80. doi:10.1016/j.intell.2017.07.006

Yerkes, R. M. (1921). Psychological examining in the United States army. Washington, DC:

Government Printing Office.

Yoakum, C. S., & Yerkes, R. M. (1920). Army mental tests. New York, NY: Henry Holt and

Company.

Young, K. (1924). The history of mental testing. The Pedagogical Seminary, 31, 1-48.

doi:10.1080/08919402.1924.10532922

Table 1
Subtests on the Current Versions of the Stanford-Binet and Wechsler Tests
Subtest Namesᵃ SB5 WPPSI-IV WISC-V WAIS-IV
Arithmetic/Verbal and Nonverbal Quantitative Reasoning X X X
Block Design X X X
Block Span X
Cancellation X X X
Coding/Animal Coding X X X
Comprehension X X X
Delayed Response X
Digit Span/Picture Span/Letter-Number Sequencing X X
Early Reasoning X
Figure Weights X X
Form Board and Form Patterns/Visual Puzzles/Object Assembly X X X X
Information X X X
Last Word X
Matrices/Object Series/Matrix Reasoning X X X X
Memory for Sentences X
Picture Absurdities X
Picture Completion X
Picture Concepts X X
Picture Memory/Picture Naming X
Position and Direction X
Procedural Knowledge X
Similarities X X X
Symbol Search/Bug Search X X X
Verbal Analogies X
Verbal Absurdities X
Vocabulary/Receptive Vocabulary X X X X
Zoo Locations X
ᵃ Some subtests are listed as having multiple names because subtests from different
instruments may have different names but formats that are so similar that we
considered the subtests to be the same.
ᵇ Subtests are listed alphabetically.

Figure 1. Examples of the “Unfinished Pictures” task, the forerunner of the modern WAIS-IV

Picture Completion subtest (Binet & Simon, 1908/1916, p. 208).



Figure 2. Timeline of our proposed candidates for the first known publication of Stanford-Binet 5 and modern Wechsler subtests.
