The Impact of Language Experience on Language and Reading

Topics in Language Disorders, 2018
Top Lang Disorders Vol. 38, No. 1, pp. 66–83
Copyright © 2018 Wolters Kluwer Health, Inc. All rights reserved. Unauthorized reproduction of this article is prohibited.

The Impact of Language Experience on Language and Reading
A Statistical Learning Approach

Mark S. Seidenberg and Maryellen C. MacDonald

This article reviews the important role of statistical learning for language and reading development. Although statistical learning (the unconscious encoding of patterns in language input) has become widely known as a force in infants’ early interpretation of speech, the role of this kind of learning for language and reading comprehension in children has received less attention. In fact, the implicit learning of co-occurrences of words, sentence structures, and other components of language forms a critical part of children’s language comprehension and fluent reading. Beyond introducing basic information about language statistics, the article offers a discussion of how variability in the amount and nature of language experience can affect language development and literacy, including variation owing to the amount of language input in the child’s linguistic environment and the variable nature of input for children who are exposed to multiple languages or multiple dialects.

Key words: bilingualism, dialect, individual differences, language variation, statistical learning, vocabulary

Authors’ Affiliation: Department of Psychology, University of Wisconsin-Madison.
Preparation of this work was supported by a University of Wisconsin Vilas Research Professorship to MSS and a Wisconsin Alumni Research Foundation award to MCM.
The authors declare no conflicts of interest.
Corresponding Author: Mark S. Seidenberg, PhD, Department of Psychology, University of Wisconsin-Madison, Brogden Hall, 1202 West Johnson St, Madison, WI 53706 (seidenberg@wisc.edu).
DOI: 10.1097/TLD.0000000000000144

This article provides an overview of a theoretical framework for examining the impact of experiential variability on children’s spoken language comprehension and learning to read. Researchers now know that children’s early language experience is more variable than had previously been recognized, and that these differences have substantial effects on children’s progress in learning to read, findings that have also become widely known to the general public. Other factors aside, children whose spoken language is more advanced learn to read more easily (McCardle, Scarborough, & Catts, 2001); children with lesser skills are at greater risk for reading difficulties; and children who do not attain sufficient reading skill have difficulty acquiring the many other types of knowledge that depend on print. These include math, given the heavy linguistic emphasis in modern curricula (e.g., explaining your work, story problems, math problems involving real-world situations described in words). They also include language itself, because reading is the primary means by which linguistic knowledge expands through the life span.

In this article, we examine variability in language experience from the perspective of statistical learning, an important type of learning that people engage in from birth. Statistical learning is the largely unconscious process of learning the patterns of one’s environment: the probabilities that events will occur, or occur together, and in which sequences (Lany & Saffran, 2013; Seidenberg, 2017). The range of “events” that can be learned is broad, such as the sequence of green, yellow, and red traffic lights; the pairing of the word “tiger” and a picture of a tiger in a child’s book; and the fact that IN is a more common letter sequence in English than IE, even though the letter E is overall more common than N in written English.
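The contrast between single-letter and letter-sequence frequencies can be made concrete with a small counting sketch. The sample sentence below is a hypothetical toy input chosen for illustration, not a corpus of English, and the function name `letter_counts` is an assumption of this example; the point is only that the frequency of a letter pair need not follow from the frequencies of its parts.

```python
from collections import Counter


def letter_counts(text):
    """Count single letters and adjacent letter pairs within words."""
    # Keep alphabetic characters only, so punctuation never forms a pair.
    words = ["".join(ch for ch in w if ch.isalpha()) for w in text.upper().split()]
    unigrams = Counter(ch for w in words for ch in w)
    # Pairs are counted inside words only, never across a word boundary.
    bigrams = Counter(w[i:i + 2] for w in words for i in range(len(w) - 1))
    return unigrams, bigrams


# Hypothetical toy sample (an assumption of this sketch, not real corpus data).
sample = "The green tree in the kitchen is behind the pie"
unigrams, bigrams = letter_counts(sample)

print("E:", unigrams["E"], "N:", unigrams["N"])
print("IN:", bigrams["IN"], "IE:", bigrams["IE"])
```

In this toy sample E is more frequent than N, yet the pair IN is more frequent than IE, which mirrors the pattern described for written English. Counts from a real corpus would of course be needed to support the claim about English itself.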
Because statistical learning occurs over the entirety of an individual’s experience, the amount and variety of that experience greatly affects what is learned. Statistical learning provides a broad theoretical framework for evaluating how differences in spoken language and reading experience affect children’s development.

The structure of the article is as follows: We first provide an overview of the statistical learning approach to language, including speech and reading. Statistical patterns exist within and between multiple levels of linguistic structure. This information plays a crucial role in learning and using spoken language. It is implicated in reading in three respects. First, variability in the child’s exposure to spoken language, associated with socioeconomic status and other variables, has downstream effects on linking print and speech. Second, learning about the written code (orthography) is itself a statistical learning problem. Children learn about the frequencies and distributions of letters, which patterns form words, and how those patterns relate to spoken language. This topic has been discussed extensively elsewhere (Castles & Nation, 2006; Grainger, 2008; Seidenberg, 2011). Third, the statistical properties of texts and speech differ. Texts employ vocabulary and sentence structures that rarely occur in speech. These characteristics have been termed “academic” or “school” language, and they are often seen as a major hurdle for many children (Bunch, Abram, Lotan, & Valdés, 2001).

With this overview in hand, we then consider the impact of individual differences in experience: the amount and variety of the child’s spoken language experience and variability in the type or types of linguistic code(s) to which the child is exposed. Statistical learning provides a perspective on how these varied experiences affect language development and learning to read.
The approach is also relevant to the important question of how experience can be structured to promote success in learning to read and educational achievement in general, especially for children whose language experience is bilingual or bidialectal.

STATISTICAL LEARNING

Until recently, research on how languages are acquired and used was conducted within the framework established by Noam Chomsky. This framework emphasized the structure of language (grammatical competence) rather than facts about how language is used (performance). Researchers sought to identify the universal and language-specific properties of language, focusing on grammaticality (the well-formedness of utterances) rather than their use in the varied circumstances of people’s lives. The apparent fact that children converge on knowledge of a language despite substantial differences in experience was taken as evidence that essential aspects of grammar are innate rather than learned. With this understanding of grammar in hand, one could then ask how children acquire knowledge of a particular language, how this knowledge is used in comprehending and producing language, and how language is represented in the brain.

Although there had previously been objections to this view (e.g., Bates & MacWhinney, 1987), in the 1990s some researchers began to advance an alternative. The competence approach excluded information about performance; however, the explanations for how language is acquired and used may have been thrown out with the performance bathwater (Seidenberg, 1997). With the availability of tools for collecting and analyzing large language corpora and the creation of archives such as the Child Language Data Exchange System (CHILDES, MacWhinney, 2000) and the Penn Treebank (Marcus, Marcinkiewicz, & Santorini, 1993), it became apparent that language as it is used exhibits myriad statistical regularities: patterns in the use of elements within and between levels of linguistic structure, including phonology, morphology, vocabulary, and grammar, as well as in the relations between language and the contexts in which utterances occur. These statistical regularities turn out to be crucial: Languages can be learned and used because they exhibit these patterns and humans are capable of learning and using them.

Children engage in statistical learning from birth, acquiring knowledge about the patterns of common events and their co-occurrences. For example, when toddlers knock a spoon off their high chair tray or drop a piece of banana over the side, they are learning about gravity (that unsupported objects will fall), and they also may be learning that dropping objects gets the attention of adults in the room. As we will show, much of language is learned in a similar manner.

Statistical learning is implicit, meaning that it occurs without conscious intention to learn and without awareness that learning is taking place. It occurs in the background as we engage in activities such as reading, speaking, and acting. It is sensitive to the amount and variety of experience, and thus to individual differences in language background.

Learning to read differs from learning a first language (or two) in many respects, but it is similar in an important one: it too depends on statistical learning. The beginning reader’s task is to learn how orthography (the written code) relates to the spoken language they already know. The spellings of words exhibit complex statistical patterning, as do the mappings from spelling to sound. Because the writing system is alphabetic, spelling and sound are more highly correlated than spelling and meaning (see Seidenberg, 2017, for detailed discussion). The same implicit learning mechanisms are employed in acquiring these types of knowledge as in acquiring a first language.
Statistical learning is not the only way that people acquire knowledge; we also learn via explicit experience (instruction and feedback, exploration and discovery), with intention and awareness. These types of learning are closely intertwined, each affecting the other. Reading depends more on explicit instruction than does spoken language, but the difference is again a matter of degree. Learning the inflectional systems in languages such as Finnish and Hungarian, which are much more complex than in English, requires extended explicit instruction in school, including the use of textbooks and other written materials. In contrast, learning to read those languages requires less instruction because their writing systems have few inconsistencies of the sort that are so prominent in English.

Statistical properties of language

Spoken and written language exhibit innumerable statistical patterns: inhomogeneities in the frequencies and co-occurrence of elements. Table 1 lists a number of statistical patterns with references to relevant research. The table is not an exhaustive list, and it does not do justice to the patterns that exist between levels. It also ignores how statistical patterns are altered by differences in language experience. However, the examples convey that statistical patterns exist at all levels of linguistic structure, ranging from the spoken and written forms of words, to combinations of words, to entire sentences. The table provides a snapshot of what is now a vast body of research conveying that from infancy onward, people are implicitly learning and using statistical knowledge to understand spoken and written language and the world around them (e.g., Conway & Christiansen, 2005; Saffran, Aslin, & Newport, 1996; Saffran, Johnson, Aslin, & Newport, 1999; Wells, Christiansen, Race, Acheson, & MacDonald, 2009). The table stops at the level of comprehending sentences, but it also could have included statistical properties of entire texts.
Such properties have been used to identify the authors of texts where the authorship is disputed (Jockers & Witten, 2010), and they are the basis for automatic procedures for grading documents such as freshman English essays (Shermis & Burstein, 2013).

Table 1. Several types of language statistics in English that are crucial for oral and written language use

Language level(s): Phoneme positions
Example: /h/ does not occur at ends of words; /ŋ/ does not occur at word beginnings.
Relevance: The statistics of phoneme locations in words support word recognition and identifying word boundaries in the speech stream (Vitevitch, Luce, Pisoni, & Auer, 1999).

Language level(s): Phoneme and letter transition probabilities
Example: The phoneme sequence /nv/ is rare compared with /nt/; the same holds for the corresponding letter sequences NV and NT.
Relevance: Sequences with low transition probability are likely candidates for word and syllable boundaries in speech and syllable boundaries in reading (e.g., CANVAS, CAN VOICE; Seidenberg, 1987).

Language level(s): Syllable stress assignment in reading
Example: Pronunciation of RECORD as REcord vs. reCORD varies with part of speech.
Relevance: Readers use sentence context to identify stress and part of speech even in silent reading. Patterns are only probabilistic: ANCHOR, e.g., has the same stress pattern in both noun and verb forms (Seidenberg, 2017).

Language level(s): Word meaning
Example: GROUND (n) floor vs. GROUND (n) background vs. GROUND (v) past tense of grind vs. GROUND (v) conduct electricity vs. GROUND (adj.) pulverized.
Relevance: Most common content words in English are ambiguous; meanings often belong to different parts of speech. Comprehenders must use semantic and syntactic context to identify the intended interpretation. Context is statistical: the floor sense of GROUND co-occurs with fell on the, and the adjective sense co-occurs with foods such as meat and coffee (MacDonald, 1993; Seidenberg, Tanenhaus, Leiman, & Bienkowski, 1982).

Language level(s): Conceptual combination
Example: BIRD STATUE = statue depicting a bird; MARBLE STATUE = statue made of marble (not depicting it); MARBLE CATALOG = catalog describing marble, not made of it.
Relevance: Compound nouns, such as BIRD STATUE, MOUNTAIN MAGAZINE, ANCHOR LOCK, have many possible meanings but also probabilistic regularities that help comprehenders settle on the most likely interpretation (Murphy, 1990).

Language level(s): Collocation
Example: Collocations in American English include: BACK IN THE DAY, NO WORRIES, GOOD TO GO, BRUSH YOUR TEETH, ON THE OTHER HAND.
Relevance: Collocations are high-frequency word sequences. These include but are not limited to idioms. Whereas frequency effects of individual words are well known to language researchers, there is growing recognition that comprehenders also track collocation frequencies and use them in speech and reading (Arnon & Snider, 2010).

Language level(s): Pronoun ambiguity
Example: Maria told Sue that she . . . SHE can refer to either Maria or Sue.
Relevance: Pronouns are extremely common in speech and texts but create ambiguities. Statistical regularities, such as pronouns more often referring to grammatical subjects or other prominent nouns, guide children’s interpretation (Arnold, Brown-Schmidt, & Trueswell, 2007).

Although the cases listed in Table 1 involve different types of information, they are similar in this respect: at each level, a finite number of elements (say, letters) are combined to form a much larger number of units (e.g., words). The combination of elements is highly constrained: only a subset of the possible combinations is allowed. Millions more words can be formed by combining the letters of the alphabet than the 30,000 or so in a person’s vocabulary, for example. The same holds for combining words to form multiword sequences.
For example, the words WE, NEED, and TO form one very common trigram WE NEED TO, whereas other combinations (WE TO NEED, etc.) are rare (though they may well occur, e.g., WHO ARE WE TO NEED HIM?). In all of these cases, the frequency distributions are highly skewed: a small proportion of the patterns (letters, phonemes, words, sequences of words) are used with high frequency, and then there is a long tail of combinations that are used much less frequently. Statistical patterns reflect constraints on combining elements, and these constraints arise from a variety of sources, both endogenous (e.g., human information processing and learning capacities) and exogenous (e.g., systematic characteristics of the world in which language is used).

Table 1 lists many statistical regularities separately, but a critical feature of language statistics is that they are correlated rather than independent, so that regularities at one level are informative at many others. For example, knowing that a word contains the morpheme BELL provides information about other words in which it occurs: BELLS, BELLED, COWBELL, BELLBOTTOMS. Similarly, BLUE occurs in BLUEBIRD, BLUEGILL, and BLUEBERRY. These patterns are helpful in learning and using new words (BLUEBELL, say). The word BELL is correlated with other linguistic and extra-linguistic information as well: bells are objects that are usually rung; bell ringing usually will involve an overt or implicit agent to make it happen (THE CUSTOMER RANG THE BELL; THE BELL RANG); bells are used for some purposes (as alarms; to make music) but not for many others (cracking nuts, recording a lecture). All these patterns are probabilistic, meaning that they afford a range of possibilities that differ in their probability of occurring, and violations of the dominant patterns abound (dumbbells are neither dumb nor bells; a bell might be used to crack a nut in a crunch; a bellboy is a person who carries your luggage, not your bells, and probably is not a boy, either).
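The skew in multiword sequences (WE NEED TO versus WE TO NEED) can be illustrated by counting word trigrams. This is a minimal sketch: the miniature corpus is a hypothetical toy constructed for the example, and `ngram_counts` is a helper name assumed here rather than anything from the article.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count every run of n adjacent tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Hypothetical toy corpus (an assumption of this sketch). Periods are kept
# as tokens, so some trigrams span a sentence boundary.
corpus = (
    "we need to go . we need to eat . we need to rest . "
    "who are we to need him ."
)
tokens = corpus.split()
trigrams = ngram_counts(tokens, 3)

# A few sequences recur; most occur only once (the long tail).
for gram, count in trigrams.most_common(3):
    print(" ".join(gram), count)
print("we to need:", trigrams[("we", "to", "need")])
```

Even in this tiny sample, WE NEED TO recurs while WE TO NEED appears once, inside WHO ARE WE TO NEED HIM. A real analysis would use a large corpus such as those named in the text.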
Language, like the world, is quasiregular (Seidenberg & McClelland, 1989): mostly predictable, though not entirely, and thus probabilistic patterns are helpful.

This partial analysis of a single word, BELL, illustrates an important point about what it means to know a word. Researchers tend to think of words as countable entries in a mental dictionary. The contents of that mental dictionary are assessed using tests such as hearing a word like “bell” and picking out the corresponding picture, as on the Peabody Picture Vocabulary Test (PPVT, Dunn & Dunn, 2007). The entry for a word specifies its form (the pronunciation of BELL, and later, its spelling), grammatical category (noun), and meaning (instrument that produces sounds by being struck). However, our knowledge of a word is not like a dictionary entry. A word is more like a hook on which to hang many types of information: multiple meanings, senses, and grammatical functions of the word (e.g., verb senses of BELL); encyclopedic knowledge (facts about where bells are made or found); the kinds of words that are likely or necessary to occur with a word in different contexts; and other things we know as well. Children who know the word BELL can differ greatly in how much information they associate with it, a property that Perfetti (2007) termed “lexical quality.” Moreover, much of our grammatical knowledge is associated with individual words (MacDonald, Pearlmutter, & Seidenberg, 1994), specifying the kinds of sentence structures they occur in and their grammatical and thematic functions. Words are not just discrete building blocks out of which sentence meanings are formed, nor are they processed as such. In reading or hearing each word, we are generating, confirming, and revising expectations about other words.
Sequences of words are comprehended by converging on the interpretation that best satisfies the expectations words carry about each other. The problem is greatly simplified by knowing the statistics: that is, which patterns are likely to occur and, just as important, which of the many possible patterns can be ignored because they occur very rarely or not at all.

LEARNING AND USING LANGUAGE STATISTICS

People are good at using language statistics and yet largely unaware of the mechanisms involved, because learning and using language statistics are unconscious and automatic. To get a sense of what is going on under the hood, it is helpful to walk through a simple example. Consider the following sentence, which is from Lost in the Tunnel of Time by Sharon Draper, a chapter book for 8–12-year-old readers: “The early bell seemed to hear him, for the signal to go into the building sounded just as he spoke” (Draper, 2011a, p. 4).

Even the first word, THE, carries several probabilistic cues to what is coming: first, that a noun will occur within the next few words; second, that the very next word is likely to be a noun or part of a noun phrase, such as an adjective; third, that the noun is likely to be one describing something alive such as GIRL or DOG, because that is the most common pattern for the subject of a sentence. The second word, EARLY, confirms the existence of a noun phrase but also reduces the expectation that the upcoming noun will describe a living thing. EARLY is an adjective that modifies inanimate things (such as MORNINGS, BELLS, and RESULTS) more often than it modifies living things (such as BIRDS). EARLY also creates a new ambiguity because it can refer to a recent time (this morning’s early bell) or the distant past (early Stone Age tools). The next word, BELL, reduces the time ambiguity because the text is more likely to be about a recent event, such as the ringing of a school bell, rather than a bell of historical interest.
The probabilities would change, of course, if the book were “Bronze Age Chinese Bells” rather than a novel. At this point, the reader has found a likely, interpretable noun phrase, THE EARLY BELL, although it takes an additional word, the verb SEEMED, to confirm that the noun phrase ends there (it could have continued with another noun, as in THE EARLY BELL SCHEDULE).

This process of generating expectations and picking the most probable option continues: the verb SEEMED suggests an upcoming adjective (as in THE EARLY BELL SEEMED LOUD), but the sequence SEEMED TO rapidly shifts expectations to an upcoming verb. The verb expectation is then confirmed but with an unexpected word, HEAR. The author confounds her readers because their knowledge of the world tells them that a bell, an inanimate object, can be heard but cannot hear. Readers or listeners may experience momentary uncertainty about how to fit the words together, an effect the author likely would have anticipated. The final word of the first clause, HIM, indicates the object of the hearing (English uses HE for a male subject and HIM for an object). This unambiguous object marking supports the surprising interpretation that the bell is doing the hearing. To make sense of this, the reader/listener must then infer that the event has been described figuratively rather than literally. The phrase has now been assigned a plausible interpretation. Yet with all that progress, HIM creates a new ambiguity because the sentence occurs in a scene in which two male characters are conversing. Additional work is required to determine the likely referent of HIM, and so it goes, as comprehension of the sentence and text proceeds.

This analysis illustrates some of the probabilistic constraints that contribute to comprehending just a seven-word phrase. Comprehenders cannot predict exactly what will occur following THE EARLY BELL but they can assign probabilities to a limited range of likely possibilities based on prior experience. The example also illustrates the important fact that statistical information constrains interpretation in the forward direction, so information from one word yields predictions about aspects of upcoming words, but also in the backward direction, where that same word may also be refining the interpretation of the words that were previously read.

Comprehension proceeds in this way because language is deeply and pervasively ambiguous. Ambiguity is not limited to words with two obviously different meanings, such as PITCHER meaning a container or an athlete. Instead, every word contains some amount of ambiguity, and the perceiver must continuously use prior knowledge and the preceding context to predict and evaluate what is coming and also revise the interpretation of preceding material as necessary. In this way, an understanding of a sentence emerges from the unconscious process of using the statistics of language and the world to settle into the most likely sentence meaning.

The key to the use of such statistics is how they are combined. The process can be illustrated by a simple game. What word am I thinking of?

Something yellow: Not very informative. Many yellow things. Low probability of guessing correctly.
Something kind of round: Same thing. Lots of round things in the world.
A kind of fruit: Again, not a very informative clue taken in isolation. Lots of fruits; low probability of guessing correctly.

All of these clues are weak taken independently, but the combination of yellow + round + fruit yields a very likely answer: lemon. The conjunction of cues is highly constraining because the facts are not independent: fruits are objects that have colors and sizes. Language is similar but on a much larger scale.
Each level of linguistic structure exhibits statistical regularities, but the levels are not independent: each constrains what can occur at other levels. The ability to derive a high-probability event (e.g., the meaning of a phrase) from a combination of much lower probability ones occurs in language and in many other domains: it is a fundamental characteristic of human behavior. We make intelligible cognitive and linguistic mountains out of miserable statistical molehills, all the time, and effortlessly. The evidence that language users combine probabilistic constraints in this way derives from several sources. There is abundant research on children’s and adults’ use of statistical information in comprehending speech. This research has documented the use of simple statistics in very young children and the developmental progression that occurs as they learn more about their language and their world, which is marked by changes in the statistics that are used and increases in proficiency in using them (Lew-Williams & Fernald, 2007; Saffran et al., 1996). A few studies have tied children’s abilities to use spoken language statistics to language experience: children who hear more language, measured as number of words of child-directed speech, complexity of that speech, and amount of conversational turn taking, perceive speech more rapidly and accurately and combine statistical cues faster as well (Fernald, Marchman, & Weisleder, 2013). The bulk of the evidence for complex cue integration during sentence comprehension comes from studies of reading. We know that skilled adult readers use all the types of statistics illustrated in the EARLY BELL example and in Table 1. Psycholinguistic research has yielded a huge body of work on how adult readers integrate probabilities like these in comprehending sentences, and how variations in predictability affect reading patterns, as measured using eye-tracking and related methods (for review, see MacDonald & Hsiao, in press).
There are few comparable studies of children’s reading, however, leaving a gap in what is known about how these skills develop and change over time. Evidence of adults’ exquisite use of statistics leads to the question of how we got here: how does a language learner or user know which of the myriad statistical properties of a language are relevant? This could be called the “richness of the stimulus” problem: languages afford many generalizations that children never make. In an earlier era, the explanation was that innate grammatical knowledge restricts the range of hypotheses about language, obviating the problem (Cook, 1991). The alternative is that the problem is solved by the statistical learning procedure itself. The use of statistical information in learning, perception, and cognition can be formalized in several related ways; see McClelland et al. (2010) and Griffiths, Chater, Kemp, Perfors, and Tenenbaum (2010) for comparisons between the Bayesian and connectionist approaches that are widely used in studies of language and cognition. The choice between approaches often depends on the type of question being asked and how much is known about the problem domain. In general, Bayesian models start with strong theories about which types of information (e.g., language statistics) are relevant to a behavior such as categorizing objects or predicting grammatical categories of words in sentences. They then specify the way to combine existing knowledge with new data (e.g., about a sentence being read) to reach optimal inferences. In language, however, we do not know in advance which statistics are relevant or their numerical values. Connectionist (aka “deep learning”) networks are particularly useful tools for examining how this knowledge develops incrementally with experience.
To examine this, a computer simulation model is given a task such as recognizing a letter string as a specific word. Given the task, the architecture of the model, the corpus of training examples, and the learning algorithm, the model acts as a discovery procedure, determining which properties allow the task to be performed quickly and accurately (Seidenberg & MacDonald, 1999). The statistics that are relevant do not have to be specified in advance; the model solves the problem itself, such that over time, the model’s behavior is determined by statistical properties that allow the task to be performed. The same principles apply in networks that process sequences of words that form meaningful sentences (Elman, 1990). Something similar happens in people. People are statistical learners. We cannot prevent ourselves from responding to patterns in the environment, registering repetitions and novelties, similarities and differences, the way things vary and covary, and then how the things that covary actually covary. Learning a language, followed by learning to read, is a Big Data problem for humans. For language learners, the data are the millions of utterances to which they are exposed and their relations to situations and events in the world. Statistical learning is the process of distilling regularities from this mass of data: patterns, largely shared by speakers of a language, that enable communication to occur (Seidenberg, 2017).

STATISTICAL LEARNING AND LANGUAGE VARIATION

Statistical patterns become effective guides for comprehension only when humans are exposed to large numbers of examples, initially spoken and later written. The evidence that these patterns are critical to speaking and reading suggests that differences in the amount and variety of language experience should greatly affect performance.
Of course, we know this to be true: characteristics of children’s early spoken language experience vary in ways that affect what they learn, with downstream effects on educational progress, especially in reading. In the following sections, we examine several important sources of individual differences in language and reading from a statistical learning perspective.

Vocabulary learning is statistical learning

The Hart and Risley (1995) study is the best known of what are now many studies to have examined variability in the quantity and variety of language in the learner’s environment (earlier such studies include Snow et al., 1976). There is an extensive literature linking language experience to socioeconomic status (Hoff, 2003) but also important studies demonstrating variability within lower socioeconomic status families (Pan, Rowe, Singer, & Snow, 2005; Weisleder & Fernald, 2013). Although these studies involved characterizing multiple properties of the language environment, vocabulary gets the most attention, perhaps because it is easy to quantify: as the number of different words a child produces, or as performance on a simple word–picture matching test such as the PPVT. Vocabulary measured in these ways is an indicator of some characteristics of children’s linguistic knowledge, and the substantial individual differences that are apparent at the start of kindergarten are related to progress in learning to read (Cunningham & Stanovich, 1991; Walker, Greenwood, Hart, & Carta, 1994). It is also important to note that these assessments index children’s knowledge of the standard dialect of the language used in school but not their knowledge of another dialect or language and thus may underestimate many children’s linguistic knowledge.
The emphasis on vocabulary might seem narrow, given that using language communicatively involves many other types of information. As we have observed, however, vocabulary assumes even greater importance when it is recognized that words are statistically linked to other words and to other levels of linguistic representation and thus carry information about the sentences in which they occur. Children with smaller vocabularies— assuming their vocabularies have been adequately assessed—do not simply know fewer words; they also know less about language and the world. Children who experience higher rates of child-directed speech not only have greater vocabulary knowledge but also comprehend speech more rapidly (Weisleder & Fernald, 2013). Vocabulary knowledge that is below age or grade-level expectations is troubling because it is hard to ameliorate. There simply is not sufficient time to explicitly teach hundreds of words. However, statistical learning provides a potential mechanism for building vocabulary more efficiently. Few of the words we know were learned through explicit instruction. We infer the meanings of words from the linguistic and extra-linguistic contexts in which they occur. Words with similar meanings tend to occur in similar contexts (Firth, 1957). Much can be inferred about the meaning of the word LYNX, for example, because it appears in the same statistical contexts as the related words LION and TIGER (even better if there is also a picture). Lila Gleitman and colleagues famously demonstrated that the meanings of verbs can be “bootstrapped” to a great extent from the syntactic contexts in which they occur (Fisher, Hall, Rakowitz, & Gleitman, 1994). Landauer and Dumais (1997) described how knowledge of language statistics prepares people to learn new words before they have been experienced. 
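The observation that words with similar meanings occur in similar contexts can be illustrated with a minimal sketch. It assumes a handful of invented sentences and represents each word by counts of the other words in its sentences; the corpus, the helper names `context_vector` and `cosine`, and the whole-sentence context window are illustrative assumptions rather than the procedure of any study cited here.

```python
import math
from collections import Counter

# Invented toy sentences (an assumption for illustration, not real data).
sentences = [
    "the lion hunts prey at night",
    "the tiger hunts prey at night",
    "the lynx hunts prey at dusk",
    "the teacher reads books at school",
    "the student reads books at school",
]

def context_vector(target, sents):
    """Count the words that share a sentence with the target word."""
    vec = Counter()
    for s in sents:
        words = s.split()
        if target in words:
            vec.update(w for w in words if w != target)
    return vec

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

lynx = context_vector("lynx", sentences)
print(cosine(lynx, context_vector("lion", sentences)))     # high overlap
print(cosine(lynx, context_vector("teacher", sentences)))  # lower overlap
```

In this toy corpus the contexts of LYNX overlap more with those of LION than with those of TEACHER, so a learner who knows LION and TIGER already has partial information about the unfamiliar word.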
Even a limited amount of vocabulary instruction is helpful because learning a new word also facilitates the learning of other, related words (Beck, Perfetti, & McKeown, 1982). Impoverished language experience is a source of deficits in vocabulary and other areas, but enriching that experience can potentially accelerate learning without heavy dependence on instruction. We do not yet know how to do this effectively, though researchers are exploring methods that include promoting reading to children, providing training to parents, and monitoring rates of child-directed speech, as well as prompts for child-directed conversation in public spaces such as grocery stores (Hirsh-Pasek et al., 2015; Ridge, Weisberg, Ilgaz, Hirsh-Pasek, & Golinkoff, 2015; Suskind et al., 2016). New approaches that have different or broader impacts are likely to emerge from these efforts.

Statistics of speech and reading

Language statistics undergo a substantial change as the child transitions from prereader to independent reader because the statistics of written language differ markedly from those of spoken language. Words such as LIKE and YEAH occur more often in speech; THUS and WHICH occur more often in print. ACTUALLY is more common in speech; OBVIOUSLY in print. Lower frequency words such as DIVERGE and CONCLUDE are unlikely to be learned except through reading. Differences between the modalities extend beyond the distributions of words. Speech, especially to children, is often about the here and now, where objects are visible and recent events are part of the shared conversational context. A sentence such as DON’T DO THAT AGAIN! is short and simple, and it gains its full meaning via its relationship to the environment, namely, whatever it was that the child was doing.
Texts are decontextualized, and the “world” is established with words and sentences, creating substantial differences compared with spoken language. One example is relative clauses, which give extra context about the people and objects mentioned in a text. They are commonly used in setting the scene in stories for children, such as the classic openings of fairy tales: ONCE UPON A TIME, THERE WAS AN OLD WOMAN [WHO LIVED IN A HUT [THAT WAS COVERED WITH FLOWERS]]. Here the relative clauses, shown in brackets, give extra information about the main character, the old woman, and also about her hut. Relative clauses exist in child-directed speech but they are quite rare (Montag & MacDonald, 2015), largely because the environmental context often carries some of the meaning: we do not have to say the relative clause in “Pick up the toys [that you got out]” if we can simply point while saying “Pick up the toys.” Montag and MacDonald (2015) found that compared with child-directed speech, texts for young children contain far more relative clauses: a four-fold increase over speech for some types of relative clauses, and a 200-fold increase for other types. Some examples of the sorts of relative clauses that preschool children and elementary school readers encounter are shown in Table 2. The examples in Table 2 do not have particularly difficult vocabulary, but they are nonetheless likely to be challenging for young readers. The reason lies in language statistics: whereas relative clauses are rare in child-directed speech, they are much more common in texts for children. Sentence structures, including relative clauses, that occur mainly in texts are a type of “academic language.” Children must master these unfamiliar forms as they are learning to read.

Table 2. Relative clauses in picture books and texts for young independent readers

Source and Reading Level: Asim (2006). Whose Knees Are These? (Picture book for reading to 1–3-year-olds).
Examples, With Relative Clauses [in Brackets]: Knees [like these] don’t grow on trees.

Source and Reading Level: McQuinn (2006). Lola at the Library (picture book for reading to 2–5-year-olds).
Examples, With Relative Clauses [in Brackets]: She put all the books [she borrowed last week] in her backpack.

Source and Reading Level: Thomson (2010). Keena Ford and the Field Trip Mix-up, an early reader chapter book for 6–8-year-olds.
Examples, With Relative Clauses [in Brackets]: But it turns out a thesaurus is just a book of words [that mean the same thing as other words] . . . . I know about student council because this fifth grader [named Lamont] [who walks me home] was on student council last year.

Source and Reading Level: Draper (2011b). Shadows of Caesar’s Creek (a chapter book for 7–12-year-old readers).
Examples, With Relative Clauses [in Brackets]: Using the remains of an old fence [that the boys had found in Ziggy’s backyard], they had built the clubhouse themselves the previous summer. They had cut holes [that looked a lot like windows] in two side walls . . . .

Note. Reading level retrieved from Amazon.com Web site.

Although “academic language” is typically treated as something to be learned in school, a surprising finding from recent research is that children are often introduced to it much earlier via shared reading with caregivers. The fact that shared reading supports reading development is widely recognized (Payne, Whitehurst, & Angell, 1994) and promoted (see, e.g., “Reading is Fundamental” and other such public service announcements). The activity introduces children to literacy, familiarizing them with the format and content of books and demonstrating that print represents language. The activity is highly valued as a way to generate interest in reading and motivation to learn how to read. It is less widely recognized that books for reading to and with young children provide exposure to language that rarely occurs in child-directed speech (Montag, Jones, & Smith, 2015).
Such texts can enrich children’s vocabulary because they contain a wider range of words than in speech, and the words occur in more variable contexts. Moreover, as Table 2 illustrates, picture books for adults to read to children can have an abundance of complex relative clauses and other constructions that are thought of as “academic language” precisely because they rarely occur in everyday child-directed speech. The same is true for many other sentence types that are not themselves syntactically complex but are nonetheless much more common in text than speech. In Peter’s Chair by Ezra Jack Keats (1998), the protagonist, a boy named Peter, says, “We’ll take my blue chair, my toy crocodile, and the picture of me when I was a baby,” a construction a young child would rarely produce or hear. While definitive data are not yet available, these observations suggest that, in addition to its other functions, shared reading of picture books is important because it is a mechanism for familiarizing children with complex sentence structures they would not otherwise hear. As with child-directed speech, adults’ shared book reading with children is highly variable, both in lower SES families (Payne et al., 1994) and in higher SES families with high levels of parent education (Montag & Smith, 2017). An obvious path for future research is to investigate the shifts in vocabulary and sentence structure statistics that come from typical picture books intended for shared book reading and to examine the effects of children’s exposure to these books on their early reading.

Variability in type of language experience

Most studies of language and reading in the United States have focused on mainly White, middle-class individuals who are monolingual speakers of the mainstream (“standard”) dialect (i.e., people very much like the researchers who usually conduct the studies). Language experiences frequently differ greatly from this case, however.
There is a large literature on bilingualism, referring to children exposed to more than one language, dating back many decades, and a much smaller though growing literature on children who speak a minority dialect. In the United States, this dialect is most often African American English (AAE). Minority and “standard” dialects and their educational implications also have been studied in many other countries and languages (Latomaa & Nuolijärvi, 2002; Levin, Saiegh-Haddad, Hende, & Ziv, 2008). Dialects spoken by mainly low-income minority groups have been studied in Australia and Canada (Siegel, 2010). Here we consider how the statistical learning framework, which was also developed with the monolingual/mainstream dialect situation in mind, applies to more complex language learning environments. It is difficult to generalize about “bilingualism” because it covers such an enormous range of circumstances. Consider the following examples:

a. Typological distinctiveness of the languages. English overlaps more with Spanish than with Mandarin Chinese. Learning English and either of these languages is more difficult than learning two Romance languages such as Spanish, Italian, and French, which overlap much more.

b. Experiential variables such as timing of exposure to the two languages; amount of exposure; conditions under which the languages are used including who speaks the language(s) in the home; the school language environment and curriculum; incentives and disincentives to learn a second language; the socioeconomic status of the speakers (e.g., becoming a Spanish–English bilingual is different for middle-income English speakers learning Spanish vs. lower income Spanish speakers learning English); and many others.

The combinations of these (and other) factors create a huge, varied space of bilingual learning conditions.
Disagreements about best practices in bilingual education (Hakuta, 1999; Kim, Hutchison, & Winsler, 2015) may reflect the fact that no procedure works equally well under all these conditions. What can be said is that the learning environment is more complex for children who are exposed to two languages. The child (or adult) is learning two systems for expressing the same things; a monolingual learns one. Acquiring a language depends on the amount and variety of linguistic experience. For bilinguals, that experience is split, in varying proportions, between the languages. Bilingual children’s knowledge of each language typically lags behind monolingual children’s knowledge of their single language; this finding follows naturally from the experience-dependent character of learning. Over the longer term, these initial costs may be superseded by the benefits of being bilingual (Bialystok, Craik, & Luk, 2012). Among other things (such as learning about the conditions under which different languages are used), the bilingual individual is learning two sets of statistics. Very little is known about this process. Consider, again, vocabulary. There is a long history of research on whether bilinguals represent the words in the two languages in separate “coordinate” lexicons or in an integrated “compound” lexicon. For example, CAMPANA is a Spanish word for BELL. A Spanish–English bilingual might develop an integrated lexicon in which BELL and CAMPANA are linked to their shared, musical instrument meaning, or they might develop separate lexicons for the two languages. Words such as BELL and CAMPANA are sometimes called translational equivalents: words in different languages with the same meaning. Although the meanings of the words are similar, they differ in most respects. For example, CAMPANA also refers to a hood for covering food (e.g., to keep flies away) but BELL does not.
BELL can refer to both church bell and school bell; Spanish uses different words (church: CAMPANA; school: TIMBRE). All of the statistics governing the structures associated with each word and its co-occurrences with other words are radically different. However, the words are not completely unrelated. The properties of bells (the musical instruments), their purposes, the conditions under which they are used, and the fact that they are inanimate and, therefore, heard rather than hear are much the same. Acquiring these two sets of statistics is far more complex than memorizing translational equivalents or developing a compound versus coordinate lexicon. Encoding the similarities and maintaining the differences are hard learning problems and not well understood. Although there is little research on statistical language learning in bilinguals (for review, see Weiss, Poepsel, & Gerfen, 2015), one gains a sense of the complexity of the task by comparing monolingual speakers of the languages. For English-speaking children, encountering an article such as A or THE provides evidence that a noun is upcoming in speech or text, and children as young as 2 years of age can use their knowledge of the statistics of article usage to predict upcoming nouns, even though they do not yet produce articles in their own speech (Zangl & Fernald, 2007). Article usage statistics are different for Spanish-speaking children because the language uses grammatical gender to distinguish masculine and feminine nouns. Both adjectives and articles are also marked for gender in Spanish, so that the equivalent of English THE in Spanish is EL for masculine nouns and LA for feminine nouns. On the one hand, this system is more complex than using THE for every noun, and there is the additional burden of learning the gender for
every noun; on the other hand, encountering a gender-specific article can narrow the range of possible nouns in a sentence, especially in combination with other contextual cues. Spanish-monolingual toddlers can use the statistics of gendered article–noun pairings to anticipate a masculine noun when hearing EL but a feminine noun when hearing LA (Lew-Williams & Fernald, 2007). By contrast, adults who speak Spanish as a second language, typically having first been exposed to it in middle school, do not show evidence of taking advantage of the gender marking in this way (Lew-Williams & Fernald, 2010). A bilingual child is learning both sets of conditions, of course. Basic, though fascinating, questions remain to be addressed, such as whether bilingual children exhibit similar effects in a given language as are seen in studies of monolinguals. It seems clear that people have difficulty learning properties of a second language that are unattested or strikingly different than in their first language (e.g., Mandarin speakers’ difficulties with articles such as THE; English speakers’ difficulties with tone in learning Mandarin). Less is known about whether degree of exposure to properties of one language, such as the predictive utility of gendered articles in Spanish, affects speech or reading in a language that lacks that property. Similarly, Spanish-English and Mandarin-English bilingual children who are learning to read relative clauses in English may face additional burdens compared with their monolingual English peers, because the properties of relative clauses in Spanish and Mandarin differ from those in English (and from each other).
Spanish has relative clause options that do not exist in English, including alternative “impersonal” forms with different word orders, as well as the common use of relative clause markers that are similar to WHO versus WHOM, which can help disambiguate subject and object roles in relative clauses (Gennari, Mirković, & MacDonald, 2012), whereas WHOM is largely absent from English-speaking children’s spoken language input. Mandarin is radically different: Whereas English and Spanish place the relative clause after the noun it modifies, as in CATS [THAT SLEEP ON THE BENCH], Mandarin places the relative clause before the noun it modifies, something like [SLEEP ON THE BENCH THAT] CATS. Again, although there is some evidence about how the language-specific conditions are learned by native speakers, little is known about learning them in two such languages. Dialect variation presents many of the same issues. It is a sociolinguistic truism that languages and dialects exist on a continuum without a discrete boundary between them. We should then expect issues related to bilingual experience to be relevant to experience with two dialects. Just as languages vary in degree of overlap, so do the dialects of a language. Globally, English has numerous major dialects (Szmrecsanyi & Kortmann, 2009). Different dialects function as the “standard” in different regions, with other dialects diverging from the standard in varying degrees. As in the bilingual case, the bidialectal experience is highly variable, involving many of the same factors such as age and amount of exposure, the contexts in which the dialects are used, socioeconomic status of the speakers, and others. Here too the range of circumstances discourages broad generalizations. Although bilingual and bidialectal environments differ in many respects, they share the fact that the child has more to learn than a monolingual/monodialectal child, and that the child is learning alternatives to things he or she can already say.
Learning two dialects clearly represents another complex task. In the United States, the “nonstandard” dialect that has been studied most is AAE. The linguistic properties of the dialect (and its regional variants) have been described with numerous examples of their use, and there are many studies of children’s AAE usage (e.g., in relation to their progress in reading; see Washington, Terry, & Seidenberg, 2013, for a review). Characterizations of the statistical properties of a language such as English usually look at general patterns seen in large corpora, ignoring dialectal variation. There is, as yet, almost no research applying statistical learning concepts to either AAE itself or to the bidialectal learning environment, such as when AAE is mixed with the more “standard” dialect, often called Mainstream American English (MAE). Nonetheless, descriptions of linguistic features of AAE make it clear that they entail language statistics that differ from the mainstream dialect. Consider, for example, a sentence that begins “Why did the boy . . . .” An MAE speaker learns a probability distribution for possible continuations, which includes likely ones such as SMILE, GO, ASK, and so on. Although this distribution is likely to be similar for AAE speakers, it will also differ because it includes AAE-specific constructions such as “Why did the boy didn’t stop?” (Washington, 2001). How much these probability distributions differ between dialects and how much they vary across speakers of the dialects are not known because the research has not been conducted. From a statistical learning perspective, using both AAE and MAE requires two sets of overlapping but nonidentical statistics. Predictions about the word that follows “Why did the boy . . .
.” depend on dialect knowledge but also sociolinguistic factors such as who is speaking and in what context (e.g., home vs. school). The learning task may be more difficult in some respects than in the case of two languages because the dialects, which are variants of the same language, differ but nonetheless overlap a great deal. Roughly speaking, the minimal overlap between the grammars of, say, English and Spanish means that there is relatively little facilitation or interference in learning one compared with the other (even less for English and Mandarin). The partial overlap between dialects of English facilitates learning of transparently similar forms, but it is also a source of interference when something that applies in one dialect is disallowed or disfavored in the other. In short, the dual-dialect language environment introduces additional ambiguities at many levels of linguistic structure, the resolution of which depends on knowledge of the relevant dialect-specific statistics. Many children successfully manage the demands of learning and using two languages or dialects but outcomes vary greatly. The statistical learning framework could potentially provide important evidence about the conditions that promote or interfere with managing dialect (and language) differences. We have conducted one demonstration (Brown et al., 2015). The study focused on one of the simple differences between AAE and MAE, the statistics regarding alternative pronunciations of many words. Pronunciations can differ because, as one example, AAE permits optional reduction of consonant clusters. Thus, TEST can be pronounced “tes,” omitting the final /t/ in the MAE pronunciation; COLD can be pronounced “cole” which is homophonic with COAL. The conditions governing the deletion of phonemes are complex and speakers differ in the extent to which they employ this option (Washington, 2001). The existence of alternative pronunciations complicates learning in two respects.
First, the AAE speaker has to learn alternative pronunciations of the same word, including an increased number of homophones such as “cole.” Whether this presents a significant challenge compared with accommodating pronunciations that differ in pitch, rate, or other speech qualities is not known. Differences in the number of phonemes in two pronunciations definitely become prominent when the child learns to read. Beginning readers learn how spellings represent spoken words. Acquiring these spelling–sound correspondences is a classic statistical learning problem (Seidenberg & McClelland, 1989), already complex because of the properties of written English and even more complex for AAE speakers because spellings such as COLD map onto two pronunciations. Brown et al. (2015) developed a simple computational model to compare the single versus dual dialect conditions. The main finding was that models that had learned the AAE pronunciations of words for speech had more difficulty learning the MAE pronunciations for reading. Learning proceeded more slowly than in the single-dialect case because there was more to learn. The model yielded two other findings: first, that performance of the dual-dialect model eventually caught up to the single-dialect model with sufficient experience; second, that learning occurred more rapidly when the model included reliable contextual cues indicating whether to use MAE or AAE. Such effects are not enormously surprising but they are not widely recognized either. One of the major differences between bilingual and bidialectal situations is in how behavior is interpreted. African American English and MAE overlap more than two languages, but the learning demands are similar in many respects. These similarities are not reflected in educational research, practices, or policies.
The concept of bilingualism is clear to most people from examples such as individuals who can fluently speak both English and another language (such as Spanish). There is less familiarity with the bidialectal situation. People’s lay concepts of dialect frequently fail to distinguish the linguistic properties of dialects (e.g., that they are comparable in complexity and expressive power) from the sociolinguistic status of dialects (e.g., that one may be institutionalized as the high prestige “standard,” compared with lower status variants). The educational needs of bilingual children have been the focus of extensive research, resulting in programs that utilize a variety of strategies to promote acquisition of the school language, but the corresponding dialect issues remain politicized, understudied, and underrecognized in the United States. When children who speak a different language in the home lag behind monolingual peers in speech and reading, the behavior can be attributed to the fact that they are English learners. African American English speakers may lag for similar reasons but that is rarely taken into account. Many speakers of AAE, educators, and observers still view AAE as a deficient version of English, despite decades of research showing this to be false. In contrast, a child who speaks a language such as Spanish in the home is not said to have learned “bad English.” Whereas the linguistic distinction between language and dialect is a matter of degree, educational policies and practices treat them dichotomously. Both bilingual and bidialectal learning are areas where the statistical learning approach could help identify how to structure children’s experience to promote successful learning. It could be determined, for example, how to take advantage of the child’s existing knowledge of language to promote learning a second language or dialect. It could also be used to avoid practices that make learning the second code more difficult. 
For example, a high level of skill with one code may result in “entrenchment” that makes it difficult to accommodate a second code, which Seidenberg and Zevin (2006) termed “the paradox of success.” Their simulation models suggested that this negative by-product of language proficiency can be avoided by interleaving experience with a second language relatively early, before the first language becomes highly overlearned.

CONCLUSIONS

The statistical learning approach to language and reading was developed in the context of research on monolingual speakers of “standard” language but is ripe for extension to variable learning environments. A theoretical approach that views language learning and use as the result of a large number of interacting statistical constraints can take into account a broader range of language experiences, as well as the impact of extra-linguistic factors (such as differences in background knowledge). Several approaches to statistical learning and probabilistic decision making exist and are being applied to a wide range of cognitive and linguistic phenomena (Clark, 2013; McClelland et al., 2010; Perfors, Tenenbaum, Griffiths, & Xu, 2011). These approaches have not yet had much influence on the study of variability in language experience and its impact on learning to read, but the field is wide-open and the potential payoffs for both theory and practice are huge. Aside from the need for more researchers with the relevant skills to examine these issues, the greatest need is for additional facts about the variability of language experience and use. The statistical learning approach resulted from advances in understanding human learning and in the development of computational and quantitative methods for analyzing large language corpora.
Studies such as Hart and Risley (1995) look antiquated by modern standards: what was a heroic effort for the era now looks like a relatively small data set from an idiosyncratic set of families. Bringing statistical learning together with language variation requires gathering much larger samples of language behavior from a much wider range of individuals.

REFERENCES

Arnold, J. E., Brown-Schmidt, S., & Trueswell, J. (2007). Children’s use of gender and order-of-mention during pronoun comprehension. Language and Cognitive Processes, 22(4), 527–565. doi:10.1080/01690960600845950
Arnon, I., & Snider, N. (2010). More than words: Frequency effects for multi-word phrases. Journal of Memory and Language, 62(1), 67–82. doi:10.1016/j.jml.2009.09.005
Asim, J. (2006). Whose knees are these? New York, NY: LB Kids.
Bates, E., & MacWhinney, B. (1987). Competition, variation, and language learning. In B. MacWhinney (Ed.), Mechanisms of language acquisition (pp. 157–193). Hillsdale, NJ: Erlbaum.
Beck, I. L., Perfetti, C. A., & McKeown, M. G. (1982). Effects of long-term vocabulary instruction on lexical access and reading comprehension. Journal of Educational Psychology, 74(4), 506–521. doi:10.1037/0022-0663.74.4.506
Bialystok, E., Craik, F. I. M., & Luk, G. (2012). Bilingualism: Consequences for mind and brain. Trends in Cognitive Sciences, 16(4), 240–250. doi:10.1016/j.tics.2012.03.001
Brown, M. C., Sibley, D. E., Washington, J. A., Rogers, T. T., Edwards, J. R., MacDonald, M. C., et al. (2015). Impact of dialect use on a basic component of learning to read. Frontiers in Psychology, 6, 196. doi:10.3389/fpsyg.2015.00196
Bunch, G. C., Abram, P. L., Lotan, R. A., & Valdés, G. (2001). Beyond sheltered instruction: Rethinking conditions for academic language development. TESOL Journal, 10(2–3), 28–33. doi:10.1002/j.1949-3533.2001.tb00031.x
Castles, A., & Nation, K. (2006). How does orthographic learning happen. In S. Andrews (Ed.), From inkmarks to ideas: Current issues in lexical processing. Hove, Sussex, UK: Psychology Press.
Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. The Behavioral and Brain Sciences, 36(3), 181–204. doi:10.1017/S0140525X12000477
Conway, C. M., & Christiansen, M. H. (2005). Modality-constrained statistical learning of tactile, visual, and auditory sequences. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31(1), 24–39. doi:10.1037/0278-7393.31.1.24
Cook, V. J. (1991). The poverty-of-the-stimulus argument and multicompetence. Interlanguage Studies Bulletin (Utrecht), 7(2), 103–117. doi:10.1177/026765839100700203
Cunningham, A. E., & Stanovich, K. E. (1991). Tracking the unique effects of print exposure in children: Associations with vocabulary, general knowledge, and spelling. Journal of Educational Psychology, 83(2), 264–274. doi:10.1037/0022-0663.83.2.264
Draper, S. M. (2011a). Lost in the tunnel of time (Reissue edition). New York, NY: Aladdin.
Draper, S. M. (2011b). Shadows of Caesar’s creek (Reissue edition). New York, NY: Aladdin.
Dunn, L. M., & Dunn, D. M. (2007). PPVT-4: Peabody picture vocabulary test. Minneapolis, MN: Pearson Assessments.
Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14(2), 179–211.
Fernald, A., Marchman, V. A., & Weisleder, A. (2013). SES differences in language processing skill and vocabulary are evident at 18 months. Developmental Science, 16(2), 234–248. doi:10.1111/desc.12019
Firth, J. R. (1957). A synopsis of linguistic theory, 1930–1955. In Philological Society (Ed.), Studies in linguistic analysis. Oxford: Blackwell.
Fisher, C., Hall, D. G., Rakowitz, S., & Gleitman, L. (1994). When it is better to receive than to give: Syntactic and conceptual constraints on vocabulary growth. Lingua, 92(Suppl. C), 333–375. doi:10.1016/0024-3841(94)90346-8
Gennari, S. P., Mirković, J., & MacDonald, M. C. (2012). Animacy and competition in relative clause production: A cross-linguistic investigation. Cognitive Psychology, 65(2), 141–176. doi:10.1016/j.cogpsych.2012.03.002
Grainger, J. (2008). Cracking the orthographic code: An introduction. Language and Cognitive Processes, 23(1), 1–35. doi:10.1080/01690960701578013
Griffiths, T. L., Chater, N., Kemp, C., Perfors, A., & Tenenbaum, J. B. (2010). Probabilistic models of cognition: Exploring representations and inductive biases. Trends in Cognitive Sciences, 14(8), 357–364.
Hakuta, K. (1999). The debate on bilingual education [Editorial]. Journal of Developmental, 20(1), 36–37.
Hart, B., & Risley, T. R. (1995). Meaningful differences in the everyday experience of young American children. Baltimore, MD: Paul H. Brookes Publishing.
Hirsh-Pasek, K., Adamson, L. B., Bakeman, R., Owen, M. T., Golinkoff, R. M., Pace, A., et al. (2015). The contribution of early communication quality to low-income children’s language success. Psychological Science, 26(7), 1071–1083. doi:10.1177/0956797615581493
Hoff, E. (2003). The specificity of environmental influence: Socioeconomic status affects early vocabulary development via maternal speech. Child Development, 74(5), 1368–1378.
Jockers, M. L., & Witten, D. M. (2010). A comparative study of machine learning methods for authorship attribution. Literary and Linguistic Computing, 25(2), 215–223. doi:10.1093/llc/fqq001
Keats, E. J. (1998). Peter’s chair (Reprint edition). New York, NY: Puffin Books.
Kim, Y. K., Hutchison, L. A., & Winsler, A. (2015). Bilingual education in the United States: An historical overview and examination of two-way immersion. Educational Review, 67(2), 236–252. doi:10.1080/00131911.2013.865593
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato’s problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2), 211–240. doi:10.1037/0033-295X.104.2.211
Lany, J., & Saffran, J. R. (2013). Statistical learning mechanisms in infancy. In J. Rubenstein & P. Rakic (Eds.), Comprehensive developmental neuroscience: Neural circuit development and function in the brain (Vol. 3, pp. 231–248). Cambridge, MA: Academic Press.
Latomaa, S., & Nuolijärvi, P. (2002). The language situation in Finland. Current Issues in Language Planning, 3(2), 95–202. doi:10.1080/14664200208668040
Levin, I., Saiegh-Haddad, E., Hende, N., & Ziv, M. (2008). Early literacy in Arabic: An intervention study among Israeli Palestinian kindergartners. Applied Psycholinguistics, 29(3), 413–436. doi:10.1017/S0142716408080193
Lew-Williams, C., & Fernald, A. (2007). Young children learning Spanish make rapid use of grammatical gender in spoken word recognition. Psychological Science, 18(3), 193–198. doi:10.1111/j.1467-9280.2007.01871.x
Lew-Williams, C., & Fernald, A. (2010). Real-time processing of gender-marked articles by native and non-native Spanish speakers. Journal of Memory and Language, 63(4), 447–464. doi:10.1016/j.jml.2010.07.003
MacDonald, M. C. (1993). The interaction of lexical and syntactic ambiguity. Journal of Memory and Language, 32(5), 692–715. doi:10.1006/jmla.1993.1035
MacDonald, M. C., & Hsiao, Y. (in press). Sentence comprehension. In Oxford Handbook of Psycholinguistics. Oxford, UK: Oxford University Press.
MacDonald, M. C., Pearlmutter, N. J., & Seidenberg, M. S. (1994). The lexical nature of syntactic ambiguity resolution. Psychological Review, 101(4), 676–703. doi:10.1037/0033-295X.101.4.676
MacWhinney, B. (2000). The CHILDES project (3rd ed.). Mahwah, NJ: Psychology Press.
Marcus, M. P., Marcinkiewicz, M. A., & Santorini, B. (1993). Building a large annotated corpus of English: The Penn Treebank. Computational Linguistics, 19(2), 313–330.
McCardle, P., Scarborough, H. S., & Catts, H. W. (2001). Predicting, explaining, and preventing children’s reading difficulties. Learning Disabilities Research & Practice, 16(4), 230–239.
McClelland, J. L., Botvinick, M. M., Noelle, D. C., Plaut, D. C., Rogers, T. T., Seidenberg, M. S., et al. (2010). Letting structure emerge: Connectionist and dynamical systems approaches to cognition. Trends in Cognitive Sciences, 14(8), 348–356. doi:10.1016/j.tics.2010.06.002
McQuinn, A. (2006). Lola at the library (1st ed.). Watertown, MA: Charlesbridge.
Montag, J. L., Jones, M. N., & Smith, L. B. (2015). The words children hear: Picture books and the statistics for language learning. Psychological Science, 26(9), 1489–1496. doi:10.1177/0956797615594361
Montag, J. L., & MacDonald, M. C. (2015). Text exposure predicts spoken production of complex sentences in 8- and 12-year-old children and adults. Journal of Experimental Psychology: General, 144(2), 447–468. doi:10.1037/xge0000054
Montag, J. L., & Smith, L. B. (2017). Picture book reading in the lives of 18-30 month old children: A diary study. Paper presented at the 39th Annual Cognitive Science Society Meeting, London, England.
Murphy, G. L. (1990). Noun phrase interpretation and conceptual combination. Journal of Memory and Language, 29(3), 259–288. doi:10.1016/0749-596X(90)90001-G
Pan, B. A., Rowe, M. L., Singer, J. D., & Snow, C. E. (2005). Maternal correlates of growth in toddler vocabulary production in low-income families. Child Development, 76(4), 763–782.
Payne, A. C., Whitehurst, G. J., & Angell, A. L. (1994). The role of home literacy environment in the development of language ability in preschool children from low-income families. Early Childhood Research Quarterly, 9(3), 427–440. doi:10.1016/0885-2006(94)90018-3
Perfetti, C. (2007). Reading ability: Lexical quality to comprehension. Scientific Studies of Reading, 11(4), 357–383. doi:10.1080/10888430701530730
Perfors, A., Tenenbaum, J. B., Griffiths, T. L., & Xu, F. (2011). A tutorial introduction to Bayesian models of cognitive development. Cognition, 120(3), 302–321. doi:10.1016/j.cognition.2010.11.015
Ridge, K. E., Weisberg, D. S., Ilgaz, H., Hirsh-Pasek, K. A., & Golinkoff, R. M. (2015). Supermarket speak: Increasing talk among low-socioeconomic status families. Mind, Brain, and Education, 9(3), 127–135. doi:10.1111/mbe.12081
Saffran, J. R., Aslin, R. N., & Newport, E. L. (1996). Statistical learning by 8-month-old infants. Retrieved from https://books-google-com.ezproxy.library.wisc.edu/ books?hl=en&lr=&id=n1cd70T4WxIC&oi=fnd&pg= PA90&dq=elissa+newport&ots=1XlC2tMNSu&sig= BF7rz 5i82 rUrTYBDcPYqQaOGk
Saffran, J. R., Johnson, E. K., Aslin, R. N., & Newport, E. L. (1999). Statistical learning of tone sequences by human infants and adults. Cognition, 70(1), 27–52.
Seidenberg, M. S. (2017). Language at the speed of sight: How we read, why so many can’t, and what can be done about it. New York, NY: Basic Books.
Seidenberg, M. S. (1987). Sublexical structures in visual word recognition: Access units or orthographic redundancy? In M. Coltheart (Ed.), Attention & performance XII: Reading. London, England: Erlbaum.
Seidenberg, M. S. (1997). Language acquisition and use: Learning and applying probabilistic constraints. (Cover story). Science, 275(5306), 1599.
Seidenberg, M. S. (2011). Reading in different writing systems: One architecture, multiple solutions. In P. McCardle, B. Miller, J. R. Lee, & O. Tzeng (Eds.), Dyslexia across languages: Orthography and the brain–gene–behavior link (pp. 146–168). Baltimore, MD: Paul H. Brookes Publishing.
Seidenberg, M. S., & MacDonald, M. C. (1999). A probabilistic constraints approach to language acquisition and processing. Cognitive Science, 23(4), 569–588. doi:10.1016/S0364-0213(99)00016-6
Seidenberg, M. S., & McClelland, J. L. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96(4), 523–568.
Seidenberg, M. S., Tanenhaus, M. K., Leiman, J. M., & Bienkowski, M. (1982). Automatic access of the meanings of ambiguous words in context: Some limitations of knowledge-based processing. Cognitive Psychology, 14(4), 489–537.
Seidenberg, M. S., & Zevin, J. D. (2006). Connectionist models in developmental cognitive neuroscience: Critical periods and the paradox of success. In Y. Munakata & M. Johnson (Eds.), Attention & performance XXI: Processes of change in brain and cognitive development (pp. 585–612). Oxford, England: Oxford University Press.
Shermis, M. D., & Burstein, J. (Eds.). (2013). Handbook of automated essay evaluation: Current applications and new directions. New York, NY: Routledge/Taylor & Francis Group.
Siegel, J. (2010). Second dialect acquisition. Cambridge, New York: Cambridge University Press.
Snow, C. E., Arlman-Rupp, A., Hassing, Y., Jobse, J., Joosten, J., & Vorster, J. (1976). Mothers’ speech in three social classes. Journal of Psycholinguistic Research, 5, 1–20.
Suskind, D. L., Leffel, K. R., Graf, E., Hernandez, M. W., Gunderson, E. A., Sapolich, S. G., et al. (2016). A parent-directed language intervention for children of low socioeconomic status: A randomized controlled pilot study. Journal of Child Language, 43(2), 366–406. doi:10.1017/S0305000915000033
Szmrecsanyi, B., & Kortmann, B. (2009). The morphosyntax of varieties of English worldwide: A quantitative perspective. Lingua, 119(11), 1643–1663.
Thomson, M. (2010). Keena Ford and the field trip mixup. New York, NY: Puffin Books.
Vitevitch, M. S., Luce, P. A., Pisoni, D. B., & Auer, E. T. (1999). Phonotactics, neighborhood activation, and lexical access for spoken words. Brain and Language, 68(1–2), 306–311. doi:10.1006/brln.1999.2116
Walker, D., Greenwood, C., Hart, B., & Carta, J. (1994). Prediction of school outcomes based on early language production and socioeconomic factors. Child Development, 65(2), 606–621. doi:10.2307/1131404
Washington, J. A. (2001). Early literacy skills in African-American children: Research considerations. Learning Disabilities Research & Practice, 16(4), 213–221. doi:10.1111/0938-8982.00021
Washington, J. A., Terry, N. P., & Seidenberg, M. S. (2013). Language variation and literacy learning: The case of African American English. In C. A. Stone, R. Silliman, B. J. Ehren, & K. Apel (Eds.), Handbook of language and literacy: Development and disorders (2nd ed., pp. 204–221). New York, NY: Guilford Press.
Weisleder, A., & Fernald, A. (2013). Talking to children matters: Early language experience strengthens processing and builds vocabulary. Psychological Science, 24(11), 2143–2152. doi:10.1177/0956797613488145
Weiss, D. J., Poepsel, T. J., & Gerfen, C. (2015). Tracking multiple inputs: The challenge of bilingual statistical learning. In P. Rebuschat (Ed.), Implicit and explicit learning of languages (pp. 167–190). Amsterdam, Netherlands: John Benjamins.
Wells, J. B., Christiansen, M. H., Race, D. S., Acheson, D. J., & MacDonald, M. C. (2009). Experience and sentence processing: Statistical learning and relative clause comprehension. Cognitive Psychology, 58(2), 250–271. doi:10.1016/j.cogpsych.2008.08.002
Zangl, R., & Fernald, A. (2007). Increasing flexibility in children’s online processing of grammatical and nonce determiners in fluent speech. Language Learning and Development: The Official Journal of the Society for Language Development, 3(3), 199–231. doi:10.1080/15475440701360564