Natural Language Semantics: Formation and Valuation
About this ebook
This textbook offers a comprehensive introduction to the fundamentals of those approaches to natural language semantics that use the insights of logic. Many other texts on the subject focus on presenting a particular theory of natural language semantics. This text instead offers an overview of the empirical domain (drawn largely from standard descriptive grammars of English) as well as the mathematical tools that are applied to it. Readers are shown where the concepts of logic apply, where they fail to apply, and where they might apply, if suitably adjusted.
The presentation of logic is completely self-contained, with concepts of logic used in the book presented in all the necessary detail. This includes propositional logic, first-order predicate logic, generalized quantifier theory, and the Lambek and lambda calculi. The chapters on logic are paired with chapters on English grammar. For example, the chapter on propositional logic is paired with a chapter on the grammar of coordination and subordination of English clauses; the chapter on predicate logic is paired with a chapter on the grammar of simple, independent English clauses; and so on.
The book includes more than five hundred exercises, not only for the mathematical concepts introduced, but also for their application to the analysis of natural language. The latter exercises include some aimed at helping the reader to understand how to formulate and test hypotheses.
Book preview
Natural Language Semantics - Brendan S. Gillon
1
Language, Linguistics, Semantics: An Introduction
1 The Study of Language before the Twentieth Century
The field of linguistics has its origins in the study of grammar. The study of grammar arises when one attempts to describe the patterns of a language. Many civilizations undertook to describe their classical languages. The Hellenistic civilization undertook to describe Classical Greek. The civilization of classical India undertook to describe its language, Classical Sanskrit. The tradition initiated in Europe by the Greeks was carried on by the Romans, and later by the Medieval Europeans. When, at the end of the Middle Ages, Latin lost its grip as the intellectual lingua franca of Europe and the various European languages acquired the status of languages of culture, these languages came to be studied grammatically—always through the lens of Classical Latin. At the same time, the voyages of discovery brought Europeans into contact with the various languages of the world, and the need to teach those languages to other Europeans led in many cases to the formulation of grammars, based, not surprisingly, on the model of the grammars of Classical Latin. Initially, many authors writing such grammars sought to impose some uniformity on language usage through the prescription of rules they themselves invented. Later, many authors sought simply to describe the language usage they observed.
Indeed, nearly down to the present day, the grammar taught in school has had only tangential connections with the studies pursued by professional linguists; for most people, prescriptive grammar has become synonymous with grammar, and most educated people continue to regard grammar as an item of folk knowledge open to speculation by all, and in no way a discipline requiring special preparation such as is assumed for chemistry or law.
Another important development leading to the creation of the field of linguistics was the rise of philology.¹ At the end of the eighteenth century, Europeans came into extensive contact with India. Early on, it was observed that the classical European languages bore a remarkable resemblance to Classical Sanskrit. This observation, reinforced by the well-recognized fact that the Romance languages, as they have come to be known, share a common origin in Latin, led to the hypothesis that these languages had a common origin. Thus, historical linguistics was born: the undertaking to reconstruct the language underlying a family of languages.
Even at the beginning of the nineteenth century, many fallacies about language still plagued its study. First, the older the state of a language, the better it is. Second, the principal goal in the study of language is to establish, or reestablish, its pristine form. Third, the categories of language are rational categories and are, to a greater or lesser extent, those enshrined in Latin or Greek.
2 The Birth of Linguistics
By the end of the nineteenth century, these assumptions had been fully abandoned by those who pursued the scholarly study of language. The discovery and study of non-Indo-European languages, especially those of the indigenous peoples of North America, had the immediate effect of showing the inapplicability of many of the grammatical concepts derived from Greek and Latin grammars that were used to treat the languages of Western Europe. Those interested in studying North American indigenous languages sought to devise new grammatical concepts whereby to describe these languages. This, in turn, had the effect of breaking the commonly made, but fallacious, link between grammar and reason, for one came to recognize that different languages achieve the same expressive ends in different ways and that these differences are arbitrary and hence not to be appraised as more or less reasonable.
This also had the effect of breaking the grip of prescriptivism over the study of language. Prescriptivism espouses the view that the purpose of grammar is to establish and to defend so-called correct usage. Prescriptivists usually maintain that what they prescribe and proscribe accord with the canons of reason. However, differences in grammatical patterns have nothing to do with what is reasonable. For the most part, the favoring of one choice over another is nothing more than a prejudice reflecting social class or a resistance to regularization. Thus, linguistics, securing for itself its own conceptual apparatus to be applied to the resolution of factual questions, became a fully autonomous, empirical discipline.
In the nineteenth century, fascination with the origin and evolution of language was preeminent. However, at the beginning of the twentieth century, Ferdinand de Saussure (1857–1913) observed in his Cours de linguistique générale (Bally and Sechehaye 1972) that the study of language could be undertaken from two distinct points of view: from the point of view of the structure of a language at any given moment in time (synchronic) and from the point of view of how a language’s structure changes over time (diachronic). During the twentieth century, two other important developments took place: the link between the study of language and the study of psychology, on the one hand, and the application of logic to the study of the structure of language and meaning, on the other. We turn to these developments now.
2.1 Linguistics and Psychology
The link between the study of language and psychology goes back to the beginning of modern psychology, when Wilhelm Wundt (1832–1920) established psychology as an empirical science, independent of philosophy (Wundt 1904). It was the American linguist Leonard Bloomfield (1887–1949) who first viewed the study of language, or linguistics as it had come to be known, as a special branch of psychology.
At that time, psychology had come under the influence of behaviorism, a movement initiated by the American psychologist John Broadus Watson (1878–1958), which had a deep and lasting impact on a host of areas that investigate organisms and their behavior, including all of the social sciences (Watson 1925). Up to that time, animal behavior was explained in terms of instinct, while human behavior was explained in terms of the mind, its states and its activities. The primary method of investigation in human psychology was introspection. Behaviorism rejected all appeal to any unobservable entities such as either instincts, in the case of nonhuman animals, or the mind, its states and its activities, in the case of humans. Behaviorism admitted only observable and measurable data as objects worthy of scientific study. Behaviorists, accordingly, confined themselves to the study of an organism’s observable and measurable physical stimulus, its observable and measurable response elicited by the stimulus and any biological processes relating stimulus to response.
Bloomfield (1933, chap. 2) sought to make linguistics scientific by recasting it in terms of behaviorist psychology. While most linguists today still regard linguistics as a special domain within psychology, their view about what kind of theory a theory of psychology should be is very different from that of Bloomfield.
By the end of the Second World War, behaviorism was on the decline and the notion of instinct was being rehabilitated as a scientifically respectable one. Unlike earlier thinkers, who would ascribe instincts to animals with little or no experimental confirmation, the new advocates of instinct, or innate endowment, sought empirical confirmation of their ascriptions. Thus, early ethologists,² whose work began to appear right after the war, sought to determine whether certain forms of behavior exhibited by a species of animal were innate (instinctive) or acquired (learned). Ethologists are especially interested in the kinds of behavior characteristic of an animal where a determinate sequence of actions, once initiated, goes to completion. They call such a sequence of actions a fixed action pattern. The question arises: is such behavior innate or is it acquired? And if it is acquired, how is it acquired? In some cases, the behavior is innate, that is, the animal can perform the behavior in question once it has the required organs, for example, the collection of food. In other cases, the behavior is acquired. In cases of acquired behavior, the further question arises: does the young animal acquire the fixed action pattern from repeated exposure to the same behavior in the adults of its species or is the young animal innately predisposed to exhibit the behavior once it is triggered by an appropriate stimulus from its environment? A particularly good way to address such questions is the deprivation, or isolation, experiment.
In such experiments, one seeks to determine whether a young animal raised isolated from its conspecifics, or animals of the same species as it, thereby being deprived of exposure to their behavior, either fails altogether to be able to perform the behavior or manages only to perform it poorly. If it performs the behavior correctly, then one can reasonably conclude that the behavior is innate and not acquired. However, should it fail to produce the behavior, one can then try to figure out what experience the animal requires for the behavior to emerge. It may be that brief exposure to the behavior in its conspecifics is sufficient for the behavior to emerge, or it may be that both exposure and practice are required.
Though some behavior in animals is innate, most behavior emerges as a result of the animal’s innate endowment and its experience. The latter is especially evident in the case of behavior that is characteristic of a mature organism but of which the immature organism is incapable. On the one hand, experience alone is not sufficient for the behavior to emerge, since an organism that is not disposed to develop the behavior never will. Cats, for example, cannot build nests, and birds cannot swim upstream to spawn eggs. On the other hand, innate endowment alone is also not sufficient for the behavior to emerge, since ex hypothesi the organism cannot at inception perform the behavior. The question then becomes: what balance between innate endowment and experience is required to bring about the organism’s ability to exercise the capacity that it eventually acquires? To ascertain what the innate endowment is, one must fix what the capacity is that the organism comes to exercise and which experiences are necessary for it to acquire the capacity. This leads to the following suggestive picture:
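The suggestive picture itself is not reproduced in this preview; reconstructed from the prose, it is roughly the following schema, in which experience operates on the innate endowment to yield the acquired capacity:

experience → [ innate endowment ] → acquired capacity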
In addition, it has been demonstrated repeatedly, for the capacity that underpins a certain form of behavior to develop in an organism, the organism must experience a certain form of stimulus, and it must do so during a limited period of time known as the critical period. This is well illustrated by experimental work done by C. Blakemore and G. Cooper (1970), reported in Frisby (1980, 95). Mature cats have the capacity to detect horizontal and vertical lines. Blakemore and Cooper undertook the following: Newly born kittens were segregated into two groups. One group was placed in an environment set up in such a way that the kittens saw only vertical lines, while the other group was placed in an environment set up in such a way they saw only horizontal lines. After some time, neurophysiological recordings from their brains showed that the kittens raised in an environment with only vertical stripes possessed only vertically tuned striate neurons, while those raised in an environment with only horizontal stripes possessed only horizontally tuned striate neurons.³ Kittens, then, pass through a period as a result of which exposure to vertical and horizontal stripes permits them to come to be able to detect such stripes. Clearly, then, kittens have an innate endowment that, when properly stimulated, enables them to acquire a capacity to detect vertical and horizontal stripes.
2.1.1 Language acquisition
Linguistic behavior is a form of intraspecial communicative behavior, that is, behavior whereby one member of a species signals information to another member of the same species. However, not all intraspecial communicative behavior is alike. For some species, the repertoire of communicative behavior comprises a finite, indeed very small, set of discrete, or digital, signals. For example, vervet monkeys have three vocal signals, one used on the sighting of a leopard, another on the sighting of a python, and a third on the sighting of an eagle. Honey bees, in contrast, have an infinite repertoire of continuous, or analog, signals to communicate the location of nectar. A honey bee returning to the hive conducts a so-called waggle dance, whose axis indicates the direction of the nectar and whose rate indicates the distance (von Frisch 1950, 1974). The repertoire of human linguistic expressions, however, is unlike either. It is unique among the repertoires of all animals: it is both discrete and infinite.
Linguistic behavior is both unique to humans and common among humans. Moreover, it is a form of behavior that humans are incapable of engaging in at birth but which they are capable of engaging in later. It is natural to conceive of this development along the lines indicated earlier:
However, the linguistic capacity is not the only capacity thought to be unique to humans. As Aristotle pointed out centuries ago, the capacity to reason is unique to humans. It might be thought, then, that our linguistic capacity results somehow from our general ability to reason. This is certainly a widely held layman’s view of language. However, it has been argued that this is not the case; rather, the human linguistic capacity arises from a peculiar part of the innate endowment.
Like his predecessor, Leonard Bloomfield, Noam Chomsky (born 1928) has also emphasized the link between linguistics and psychology. However, unlike Bloomfield, Chomsky has been a vociferous critic of behaviorism, particularly in the study of language. Indeed, his celebrated review of Verbal Behavior by the well-known behaviorist psychologist Burrhus Frederic Skinner (1904–1990) was extremely influential both inside of and outside of linguistics (Chomsky 1959b). In many, many of his publications, Chomsky developed a very different view of the nature of language. In particular, he has argued that linguistic behavior is underpinned by a linguistic capacity, which, he maintains, emerges from several capacities. These include the capacity to reason, the capacity to remember, the capacity to focus one’s attention on various things, as well as, among other capacities, the capacity to form and recognize grammatical sentences. This capacity, which Chomsky once dubbed grammatical competence and now calls the I-language, together with other capacities pertinent to humans using and understanding language, results in the human linguistic capacity. He has also argued that humans have an innate endowment unique to them and specific to the development of the linguistic capacity. Chomsky has variously called this innate endowment language acquisition device, universal grammar, and language faculty. In early work, Chomsky depicted his view with the following adaptation of the earlier diagrams:
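The diagram is not reproduced in this preview; in Chomsky’s early presentations (cited below), the adaptation runs roughly as follows, with primary linguistic data as input to the innate endowment and grammatical competence as output:

primary linguistic data → [ language acquisition device ] → grammatical competence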
The argument adduced by Chomsky again and again to support this hypothesis is the so-called poverty of the stimulus argument (see Chomsky [1960] 1962, 528–530; 1965, 57–58; 1967, 4–6). The argument is based on a number of observations, which, when taken together, furnish presumptive evidence in favor of a working hypothesis to the effect that humans have an innate predisposition to acquire grammatical competence.
The first observation is that the structure of a language, over which a child gains mastery, is both complex and abstract from its acoustic signal. In particular, Chomsky has pointed out that the basic expressions making up a complex expression of a natural language have a structure that is over and above the linear order of the successive basic expressions making it up and that this additional structure is not contained in the acoustic signal conveying the complex expression.
The second observation is that, while the structure of linguistic expressions is both abstract from its acoustic signal and complex, the grammatical competence whereby a human produces and understands sentences is acquired by a child in a short span of time. Third, it is observed that this competence is acquired even though the child has little exposure to signals carrying examples of the relevant structure and even though many of the utterances exhibit these structures in a defective way—being interrupted or otherwise unfinished sentences.
Fourth, it is observed that, in spite of important differences in the sample of utterances to which children of the same linguistic community are exposed, nonetheless they do converge on the same grammatical competence, as reflected in the convergence of the expressions they find acceptable.
The fifth observation is that the rules that are characteristic of grammatical competence are not taught. Consider, for example, an anglophone child’s mastery of the rule for plural noun formation in English. All that anglophone children are taught in school is that the letter -s is added to a noun (with the exception of such words as man, foot, and so on). But this is not the rule that anglophone children learn when they are learning to speak English, for the simple reason that it is not the rule of English plural formation. The actual rule is more complex. And no adult, native English speaker can state the rule, unless he or she has been linguistically trained. To be sure, there is a suffix. But, it is pronounced differently, depending on what the sound immediately preceding it is. Thus, the plural suffix, when attached to the word cat, yields one sound, namely [s]; when attached to the word dog, it yields another, namely [z]; and when attached to the word bush, it yields still another, namely, [iz]. Children discern and master the difference without instruction.
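To make the shape of the rule vivid, here is a minimal sketch in Python. It is a simplification under stated assumptions, not a full statement of English phonology: words are represented as lists of toy phonemic symbols, and the symbol classes are illustrative and incomplete.

```python
# A minimal sketch of the English plural allomorphy described above,
# assuming a toy phonemic transcription and illustrative symbol classes.
SIBILANTS = set("s z ʃ ʒ".split())     # simplification: affricates omitted
VOICELESS = set("p t k f θ".split())

def plural(stem):
    """Return the stem (a list of symbols) with the predicted plural allomorph."""
    final = stem[-1]
    if final in SIBILANTS:
        return stem + ["ɪ", "z"]       # bush -> bush[iz]
    if final in VOICELESS:
        return stem + ["s"]            # cat -> cat[s]
    return stem + ["z"]                # dog -> dog[z]

print(plural(list("kæt")))   # ['k', 'æ', 't', 's']
print(plural(list("dɒg")))   # ['d', 'ɒ', 'g', 'z']
print(plural(list("bʊʃ")))   # ['b', 'ʊ', 'ʃ', 'ɪ', 'z']
```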
Another example of anglophone speakers mastering an aspect of English without any instruction is this. Consider this pair of sentences:
Every native speaker of English knows that the agent of the leaving expressed in sentence (1.1) is the agent of the promising, that is, the person denoted by the subject of the main verb, whereas the agent of the leaving expressed in sentence (1.2) is the patient of the persuading, that is, the person denoted by the direct object. It cannot be due to any explicit instruction, for dictionary entries for the relevant verbs do not provide that kind of information about the verbs. And it cannot be due to the left-to-right order of the words, for the two sentences differ from one another only in the choice of verb.
Similarly, every native speaker of English knows when the third-person personal pronoun it can be used and when it cannot be used, as illustrated next, yet no speaker can state the rule governing its distribution.
(Underlining is used to indicate which expression is the antecedent of which pronoun. An asterisk prefixed to an expression indicates that native speakers judge it as unacceptable.)
Sixth, it is generally acknowledged that a child’s acquisition of his grammatical competence is independent of his intelligence, motivation, and emotional makeup. And finally, it is believed that no child is predisposed to learn one language rather than another. A child born to unilingual Korean speakers, if raised from birth by unilingual French speakers, will learn French as easily as any other child learning French born to unilingual French speakers, just as a child born to unilingual French speakers, if raised from birth by unilingual Korean speakers, will learn Korean as easily as any other child learning Korean born to unilingual Korean speakers.
These seven observations give rise to the following limits on any hypothesis about the acquisition of language, namely, that it cannot be so rich as to predispose a child to acquire competence in the grammar of one language over that of another, for, as noted, no child is more disposed to learn one language over another. At the same time, the innate endowment cannot be so poor as to fail to account for a child’s rapid acquisition of grammatical competence, in light of the abstract yet uniform nature of the competence; the quality of his exposure; the poverty of his exposure; and the independence of his acquisition from his intelligence, motivation, and emotional makeup (Chomsky 1967, 3). In short, this innate endowment cannot be so rich as to preclude the acquisition of some attested language but it must be rich enough to ensure that one can acquire any attested language within the limits of time and access to data (Chomsky 1967, 2).
This argument, though widely accepted by linguists, was initially greeted with skepticism by empirically minded philosophers such as Hilary Putnam (1967), who, disputing some of the observations herein, argued for the conclusion that human linguistic competence is the result of the general human ability to learn. More recently, connectionists have presented computational models suggesting that it is indeed possible to abstract constituency structure from the acoustic signal (Elman et al. 1996). Notice, however, as stressed by Fodor (1981), the debate is not about whether there is an innate endowment to account for language learning—this no one truly disputes—rather, the debate is about the nature of the innate endowment, in particular, about whether or not the necessary innate endowment is specific to language learning.
Another kind of argument to support the hypothesis that humans have a special aptitude to learn language is based on the claim that humans pass through a period critical to the acquisition of the linguistic capacity (Lenneberg 1967, 142–153). Basic morality excludes performing a deprivation experiment in which one would deprive an infant of exposure to language during its childhood to see whether the adult could later learn a language. Nevertheless, it has been claimed that such deprived children have been found. One case was that of a so-called wild child of Aveyron, found in the wilds in France in the nineteenth century, unable to speak, and reported never to have acquired a command of French, despite rigorous training (Lane 1976). Yet, too many facts about this boy’s situation remain unclear for experts to see in him confirmation or disconfirmation of the hypothesis that humans have an innate capacity specific to acquiring language. More recently, in the latter half of the twentieth century, in the United States, more specifically in Los Angeles, a young girl named Genie was discovered who had been locked up in a room by herself from infancy. She too never acquired normal English fluency, again despite extensive training (Curtiss 1988). However, in spite of careful documentation of her history, confounding factors preclude experts from arriving at any consensus as to whether or not her case provides evidence for or against the hypothesis.
Still another argument to support the hypothesis is based on the claim that children are much more successful than adults in learning a second language. While researchers have conducted studies to support this claim, the interpretation of the studies has been disputed by other authors and other studies have failed to replicate them.
2.1.2 Grammatical competence
The human linguistic capacity is not something that can be directly observed. Rather, it must be figured out from behavior. The relevant behavior is the set of expressions composing the language used. Thus, the first step in characterizing the capacity is to characterize the expressions of the language.
Important steps in this direction were taken, amazingly enough, over two thousand five hundred years ago by unknown thinkers of those Indo-European tribes that moved into what is today Pakistan and northwest India. These tribes, known as Indo-Aryans, had a keen interest in their language, Sanskrit. So advanced was their knowledge of Sanskrit that by the fifth century BCE, they had formulated what today we call a generative grammar of Sanskrit. The monument that testifies to this astonishing achievement is the Aṣṭādhyāyī (fourth century BCE), the world’s earliest extant grammar. This grammar, either written or compiled by Pāṇini, a speaker of the language, of whom we know almost nothing, is neither a list of observations about the language, nor is it a descriptive grammar of the kind compiled by modern field linguists; rather, it comprises a finite set of rules and a finite set of minimal expressions from which each and every proper expression of Sanskrit can be derived in a finite number of steps. Such grammars, unknown elsewhere in the world until the middle of the last century, are today known as generative grammars. It is this conception of grammar that is the basis of all mathematically rigorous treatments of natural language.
Pāṇini’s grammar embodies a number of insights. One of them is made explicit by the great Sanskrit grammarian, Patañjali (second century BCE), in his Great Commentary, or Mahābhāṣya, a commentary on Pāṇini’s Aṣṭādhyāyī. In it, Patañjali observes that there is no finite upper bound on the set of possible correct Sanskrit expressions, so that the learning of the language requires the learning of its vocabulary and its rules. He writes:
the recitation of each particular word is not a means for understanding of grammatical expressions. Bṛhaspati addressed Indra⁴ during a thousand divine years going over the grammatical expressions by speaking each particular word, and still he did not attain the end. With Bṛhaspati as the instructor, Indra as the student and a thousand divine years as the period of study, the end could not be attained, so what of the present day when he who lives a life in full lives at most a hundred years? … Therefore the recitation of each particular word is not a means for the understanding of grammatical expressions. But then how are grammatical expressions understood? Some work containing general and particular rules has to be composed. (Kielhorn 1880, vol. 1, 5–6; translated by Staal 1969, 501–502)
The insight is that the number of grammatical expressions in Sanskrit is so large that it would be impossible to learn them one by one; instead, one must learn a finite set of rules that can be applied to a finite set of basic expressions.⁵ Indeed, the Aṣṭādhyāyī and its appendices comprise just that: a finite list of basic expressions and a finite set of rules that together generate all and only the grammatical expressions of Sanskrit.
Another insight embodied in the grammar is that a complex expression is understood on the basis of an understanding of the basic expressions making it up. This insight is also found in the works of Medieval European logicians such as Peter Abelard (1079–1142) and John Buridan (fourteenth century) and apparently independently in the works of modern European logicians such as Gottlob Frege (1848–1925) and Rudolf Carnap (1891–1970). The Aṣṭādhyāyī embodies this insight by pairing each sentence of Sanskrit that it generates with a situation whose parts are associated with the minimal expressions making up the sentence.⁶ The empirical basis for this insight arose from the following kind of observation. Consider these two sentences uttered in precisely the same circumstances:
These sentences form what linguists call a minimal pair: a pair of linguistic items that are alike in all relevant respects except two. The two sentences in (3) are alike in all relevant respects except that, where the first sentence has the word cow, the second sentence has the word rock, and where native speakers of English judge the first sentence true, they judge the second false. The obvious explanation of why native speakers of English assign different truth values to the sentences in (3) is that they understand the words cow and rock differently.⁷
But what precisely is the relation that determines how complex expressions are made out of simpler expressions? Here we turn to the idea of immediate constituency analysis, a key idea of the American structuralist linguists, who were working in North America in the first half of the twentieth century. Though its three essential ingredients are embodied in the rules of Pāṇini’s Aṣṭādhyāyī, it was Leonard Bloomfield, the founder of the movement, himself a student of the Indian grammatical tradition, who gave explicit formulation to the ideas. They were further elaborated and applied by his successors: in particular, by Bernard Bloch (1946), Zellig Harris (1946), Eugene Nida (1948), Rulon Wells (1947), and Charles Hockett (1954). Immediate constituency analysis has three essential ingredients. The first is that each complex expression can be analyzed into immediate subexpressions—typically two, which themselves can be analyzed, in turn, into their immediate constituents and that this analysis can be continued until minimal constituents are reached. The second is that each expression can be put into a set of expressions that can be substituted one for the other in a more complex expression without compromising its acceptability. And the third is that each of these sets of expressions can be assigned a syntactic category.
Using immediate constituency analysis, one can show that sentence (4.0) can be analyzed in two different ways, each having a different meaning.
To see this, consider the following circumstances. Galileo is looking out the window of his apartment in Venice through his telescope at a patrician walking empty-handed through Saint Mark’s Square. Sentence (4.0) is judged false, should it be understood according to the sentence’s annotation in (4.1), where the prepositional phrase (PP) with a telescope is taken as a modifier of the noun phrase (NP) a patrician, and it is judged true, should it be understood according to the sentence’s annotation in (4.2), where the prepositional phrase with a telescope is taken as a modifier of the verb saw.
These kinds of facts suggest that immediate constituency is crucial in the determination of the meaning of a complex expression by the meanings of its subexpressions. After all, what else but the grouping due to the immediate constituent analysis could explain how it is that the very same sentence can be judged both true and false with respect to one and the same circumstances?
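Although the displayed examples (4.0)–(4.2) are not reproduced in this preview, the surrounding discussion suggests that the sentence is Galileo saw a patrician with a telescope. Here is a minimal sketch, representing each immediate-constituent analysis as a nested tuple of labeled constituents; the bracketings are a reconstruction from the prose description.

```python
# Two immediate-constituent analyses of the same string, assuming
# sentence (4.0) is "Galileo saw a patrician with a telescope".
PP = ("PP", "with a telescope")

# Reading (4.1): the PP modifies the NP "a patrician";
# on this analysis the patrician has the telescope.
parse_np = ("S", ("NP", "Galileo"),
                 ("VP", ("V", "saw"),
                        ("NP", ("NP", "a patrician"), PP)))

# Reading (4.2): the PP modifies the verb;
# on this analysis the seeing is done with the telescope.
parse_vp = ("S", ("NP", "Galileo"),
                 ("VP", ("VP", ("V", "saw"), ("NP", "a patrician")), PP))

print(parse_np != parse_vp)  # True: same words, different constituency
```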
Immediate constituency analysis also throws light on the absence of any finite bound on the expressions of a natural language. As we shall see in greater detail in later chapters, a constituent of a certain type can have as a constituent another constituent of the same type. Thus, for example, a prepositional phrase may contain another prepositional phrase that in turn contains still another prepositional phrase.
Indeed, it is easy to show that many kinds of constituents in English, besides prepositional phrases, have this property. Coordinated independent clauses do, for example. Consider the English connector and. It can be put between two independent clauses (say (6.1) and (6.2)) to form an independent, compound clause—(6.3), which itself can be joined to either of the initial clauses to form still another independent compound clause (6.4).
The same thing can be done with the connector or. Relative clauses furnish still another example of a constituent having this property.
Indeed, this property, whereby a constituent of a certain type can contain as a constituent another of the same type, seems to be a property of every known human language. And it is by dint of this property, known to mathematicians as recursion, that each human language comprises an infinite number of expressions. Put another way, the infinite set of expressions that composes a language can be obtained, or generated, by applying a finite set of recursive rules to a finite set of basic expressions, or minimal elements. These and other rules are said to compose the language’s grammar and are thereby said to characterize the corresponding grammatical competence.
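A minimal sketch of this recursive property follows; since examples (6.1)–(6.4) are not reproduced in this preview, the clauses used are invented placeholders. The point is only that a rule of the form S → S and S can reapply to its own output without bound.

```python
# Clausal recursion: an independent clause may be coordinated with another
# to form a new independent clause, which may itself be coordinated again.
# The clauses below are hypothetical placeholders, not the book's examples.
def coordinate(clause1, clause2, connector="and"):
    """S -> S connector S: the output is again a clause."""
    return f"{clause1} {connector} {clause2}"

c1 = "it rained"
c2 = "the match was cancelled"
c3 = coordinate(c1, c2)   # a compound clause...
c4 = coordinate(c3, c1)   # ...which can serve as input to the rule again
print(c4)                 # it rained and the match was cancelled and it rained
```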
The hypothesis of grammatical competence involves idealization. The idealization is simple enough. For example, it is clear that human beings have the capacity to do arithmetic. The mastery of this capacity does not require that the person exercising the capacity exercise it flawlessly. No doubt every person, no matter how arithmetically competent, makes arithmetic errors. The simple fact of making arithmetic errors in no way impugns the ascription of the arithmetic competence. What would warrant the withdrawal of the ascription would be either persistent error or an inability to recognize errors once pointed out. Indeed, without an ascription of arithmetic competence, it would make no sense even to speak of errors and to try to characterize them in any way that might yield insight into human psychology. In other words, one can study errors only against a standard.
As a result, it is useful to distinguish human grammatical competence from performance, discounting all slips of the tongue, mispronunciations, hesitation pauses, stammering, stuttering—in short, discounting anything attributable to irrelevant factors such as memory limitations, distractions, shifts of attention and interest, and the malfunctioning of the physiological and neurological mechanisms involved in language behavior (Lyons 1977, 586). Such an idealization, or regularization, of the data is essential to a proper understanding not only of grammatical competence but also of linguistic performance. Derogations from something can be determined once what the derogations are derogations from has been correctly characterized. It should be emphasized that these errors of performance are not set aside to be ignored, but to be eventually the object of more careful study.
2.1.3 Autonomy
Just as the exercise of the human linguistic capacity is thought to result from the exercise of several capacities together, including the grammatical competence, so the exercise of grammatical competence is thought to result from the exercise of component competences, whose joint exercise is required for the exercise of the grammatical competence. Moreover, just as it is held that the various capacities making up the linguistic capacity are distinguishable and not reducible one to the other, so it is held that the various components of grammatical competence are distinguishable and not reducible one to the other. Another way to put this is to say that the capacities making up the linguistic capacity and the competences making up grammatical competence are autonomous from one another.
To understand better what is meant by autonomy⁸ applied to the various component competences of grammatical competence, let us consider how various facts are explained. To begin with, consider the following expressions. The unacceptability of these expressions is to be explained, not by the rules of English phonology, but by the rules of English syntax.
The first fails to have any syntactic structure, while the second violates the syntactic rule whereby a particle forming a constituent with a verb must succeed a pronoun, which is the constituent’s direct object.
Similarly, the unacceptability of the next pair of expressions is to be explained, in the first case, by the rule of English that the noun of the subject noun phrase and the verb of the verb phrase agree in grammatical number and, in the second case, by the fact that the noun mouse is an exception to the rule of plural formation.
Notice that the sentences in (9) are not uninterpretable. If uttered either by an adult for whom English is not a native language or by a child, they would be interpreted, in the first case, as saying the same thing either as the sentence The boys are here or as the sentence The boy is here and, in the second case, as saying the same thing as The mice have escaped. Consequently, no linguist would suggest that such sentences are unacceptable for reasons of semantics.
Consider finally another set of unacceptable expressions. They do not violate any recognized rules of English phonology or morphology. Moreover, each corresponds in syntactic structure to a perfectly acceptable English sentence.
Such expressions, if interpretable at all, are interpretable only by reconstruing the meaning of one or another of their words. Such sentences are explained as unacceptable, not for reasons of phonology, morphology, or syntax, but for reasons of semantics, in particular, because of a failure of the meanings of various expressions to cohere, as it were.
Thus, we see how expressions, all judged to be unacceptable, have their unacceptability explained by different explanatory resources: some are explained by phonological rules, some by morphological rules, others by syntactic rules, and still others by semantic rules. Accordingly, phonology, morphology, syntax, and semantics are taken to be autonomous, though related, components of a theory of grammar.
It is important to note that the mutual autonomy of these various components is not drawn into question by the fact that they may have overlapping domains of application. Here is one elementary example of how morphology and syntax on the one hand and phonology on the other can have overlapping domains of application. In French, for example, the choice of the form of a word may be determined in some cases by purely morphosyntactic considerations and in other cases by phonological considerations.
Thus, in (11), the choice between the possessive adjectives sa and son is determined by the gender of the following noun: logement is masculine, so the form of the possessive adjective is the masculine form son, while demeure is feminine, so the form of the possessive adjective is the feminine form sa. In (12.2), however, even though épouse is feminine, the appropriate form of the possessive adjective is the masculine form son, which is required by the fact that the immediately following word begins with a vowel.⁹
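A minimal sketch of this interaction, under simplifying assumptions (a crude orthographic test for vowel-initial words, and no treatment of mute h):

```python
# Choice of the French possessive adjective as described above: gender
# (morphosyntax) selects son/sa, except that a feminine noun beginning
# with a vowel takes "son" (phonology). Simplified for illustration.
def possessive(noun, gender):
    if gender == "m":
        return f"son {noun}"
    if noun[0] in "aeiouéèê":    # crude vowel-initial test
        return f"son {noun}"     # phonology overrides the gender rule
    return f"sa {noun}"

print(possessive("logement", "m"))  # son logement
print(possessive("demeure", "f"))   # sa demeure
print(possessive("épouse", "f"))    # son épouse
```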
Not only have linguists held that the various components of grammar are autonomous with respect to one another, they have also held that grammatical competence and world knowledge are autonomous from one another. Thus, the unacceptability of the next pair of sentences is to be ascribed to different sources: the first to a violation of a grammatical rule of syntax and the second to a conflict with our beliefs about the world.
Expressions such as those in (13.2), which seem to be in every respect like declarative sentences and yet seem to make no sense, are sometimes said to be semantically anomalous expressions. They are said to be anomalous, instead of false, since they do not seem to be expressions that are easily liable to being judged as either true or false. The English philosopher Gilbert Ryle (1900–1976) said that such sentences contain category mistakes (Ryle 1949, 16). They contrast with the sentences in (14), which are judged inexorably false.
Corresponding to sentences that are judged inexorably false are those that are judged inexorably true.
The inexorability of such judgments is thought to arise, not from a speaker’s knowledge of the world, but from his or her knowledge of English. This inexorability contrasts with the lack of inexorable falsity of either of these sentences.
After all, the truth of these sentences is conceivable, as is the falsity of the next two sentences.
However, it is not always clear whether or not a given sentence is inexorably true or inexorably false. Consider the next sentence.
Is this sentence necessarily true? And if so, is it necessarily true because of the meanings of the words in it? As Lyons (1995, 122) notes, it is not inconceivable that biotechnology will some day permit a fetus-bearing womb to be implanted in a man who can later deliver the child by cesarean section.
There are sentences that are true, no matter what, by dint of the grammar of English, and there are others that are false, no matter what, by dint of the grammar of English. There are still others that are true or false, depending on what the world is like. However, as we shall see later, drawing lines between these various classes of sentences is not always easy.
2.2 Linguistics and Logic
If the link between linguistics and psychology is strong, the link with logic is no less so. Logic, initially, sought to distinguish good arguments from bad ones. More particularly, it sought to identify which argument forms preserve truth and which do not. Since arguments are communicated and, to that extent, expressed in a language, it is natural to use the forms of language to identify the forms of arguments. It is not surprising, then, that those interested in logic have been interested in language and have made, in their pursuit of logic, interesting observations about language and have developed important insights into language.
The intertwining of logical and linguistic concerns is evident from the beginning of the study of logic in Europe. In the course of developing his syllogistic, Aristotle introduces the distinction between subject and predicate, a distinction that has been with us ever since as both a grammatical and a logical one. Nor were these the only linguistic distinctions drawn by Aristotle: he seems to have been the first European to have identified conjunctions as a lexical class as well as to have identified tense as a feature of verbs, among many other things. The Stoics, a philosophical movement of the Hellenistic period, also had an interest in logical and linguistic matters. They identified the truth-functionality of and, or, and if and distinguished verbal aspect from verbal tense, again among many other things. This mixture of logical and linguistic concerns appears again in the logica moderna of the Middle Ages, especially with its treatment of syncategoremata, expressions like the English all (omnis), both (uterque), no (nullus), unless (nisi), only (tantum), alone (solus), infinitely many (infinita in pluralia), numerals (dictiones numerales), and so on.
The next major development to stimulate still further the intertwining of logical and linguistic concerns is the development of the formalization of mathematics. In particular, classical quantificational logic was developed as a means to represent mathematical reasoning. It does this by providing a notation in terms of which, it is generally agreed, all mathematical arguments can be framed. By focusing on mathematical arguments, or proofs, logic turned its attention to how all parts of a mathematical proof could be put into notation and how that notation could be rigorously specified. The study of how the notation could be rigorously specified led to recursion theory. The study of how the notation was to be interpreted led to model theory. Both of these developments have had a fundamental impact on linguistics: the first provided the basis for the formalization of grammatical rules, and the second brought to the attention of philosophers, and later of linguists, the central question of semantics: how do the meanings of constituent expressions contribute to the meaning of the expression of which they are constituents?
In the remainder of this section, using a few extremely simple examples for illustration, we will learn about these two ideas, which are fundamental to the study of both logic and natural language. While the examples may seem utterly contrived, the basic idea is not. Indeed, the basic idea is familiar to anyone who has studied secondary school mathematics, though the choice of basic symbols, made with a view to avoiding distracting, extraneous detail, will not be. We can use these simple examples to maintain our bearings as we wade more deeply into the complexity that inevitably accompanies exposure to a wider and wider range of natural language phenomena.
2.2.1 Formation rules
What is recursion? We shall not try to give a rigorous mathematical definition here. However, we shall give a perfectly rigorous illustration. Consider the set SL, whose members are the sequences of one or more occurrences of instances of the letters A, B, C, or D. The set SL, then, includes not only the letters A, B, C, and D, but also sequences of these letters such as AB, BD, DC, AAA, DCBAADD, and many, many others. It includes only such sequences. Thus, it does not include AEC, FBE, and so on. As readers can easily see, SL has an infinite number of members; after all, given any sequence of these letters, one can obtain a different sequence of these letters by adding any one of the four letters to the sequence given.
The foregoing characterization of SL is not a recursive specification. The following is, however. SL comprises the elements A, B, C, and D as well as any expression that can be obtained from an expression already in SL, as it were, by suffixing to it either A, B, C, or D. With L as the set of elements A, B, C, and D, we can give a formal recursive definition of SL.
We shall call definitions of this kind formation rules. We shall refer to this particular formation rule as the suffixation formation rule (for SL), which we shall abbreviate as FRs.
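The displayed definition (19) is not reproduced in this preview, but its content is fixed by the surrounding prose: (19.1) every element of L is an expression of SL; (19.2) if y is an expression of SL and z is an element of L, then yz is an expression of SL; and (19.3) nothing else is an expression of SL. A minimal sketch of a recognizer based on these clauses:

```python
# A recognizer for SL based on the suffixation rule FRs, with clauses
# (19.1)-(19.3) reconstructed from the prose statement above.
L = {"A", "B", "C", "D"}

def in_SL(expr):
    """Decide membership in SL by undoing suffixation, one letter at a time."""
    if expr in L:                      # clause (19.1)
        return True
    if len(expr) > 1 and expr[-1] in L:
        return in_SL(expr[:-1])        # clause (19.2), read in reverse
    return False                       # clause (19.3)

print(in_SL("BACD"))  # True
print(in_SL("AEC"))   # False
```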
Such a definition is said to generate the members of SL on the basis of L. Let us see how FRs generates a specific expression of SL, say the expression BACD. A, B, C, and D, being expressions of L, are, by (19.1), also expressions of SL. Since B is an expression of L, then by (19.1), B is an expression of SL. This is represented in the following diagram by the line that has B ∈ L above and B ∈ SL below. For the time being, we will regard ∈ as shorthand for is an expression of.
Since B is an expression of SL, or equivalently B ∈ SL, and A is an expression of L, or equivalently A ∈ L, it follows by (19.2) that BA is an expression of SL. This is represented in the diagram by two lines, one with B ∈ SL above, another with A ∈ L above, and the two converging to BA ∈ SL below them. Another application of the rule in (19.2) yields the expression BAC. A fourth application yields the expression BACD. In short, each transition from one level of the diagram to the level immediately beneath it corresponds to the application of one of the rules in (19), where what is on the upper level corresponds to the if-clause of the rule and what is on the lower level corresponds to the then-clause of the same rule.
This diagram illustrates two facts. First, it illustrates that simpler expressions are combined, two at a time, into a more complex expression. We shall refer to two expressions that combine to form a more complex expression as the (immediate) subexpressions of the more complex expression. For example here, BA and C are the immediate subexpressions of BAC. Second, it depicts that, in a finite number of steps, any expression of SL can be obtained from the simplest expressions of SL, namely those in L, or equivalently, that, in a finite number of steps, any expression of SL can be decomposed into the simplest expressions of SL, namely those of L.
A little reflection on (19) shows that all and only the expressions of SL can be obtained on the basis of (19). In particular, it should be obvious that (19.1) and (19.2) guarantee that all the sequences of one or more occurrences of members of L are included, and that (19.3) guarantees that no other sequences are included.
Suffixation is not the only way in which the expressions in SL can be generated. They can also be generated by prefixation. Thus, the members of SL can be recursively specified as comprising the elements of L as well as any expression that can be obtained by prefixing an element of L to an expression already in SL.
As we shall now show, BACD is also an expression of SL according to the recursive definition in (20). Since D is an expression of L, it is also, by (20.1), an expression of SL. Now C is also an expression of L. So by (20.2), CD is an expression of SL; as is ACD, again by (20.2), since A is an expression of L. Finally, B, being an expression of L, is prefixed by (20.2) to ACD, thereby yielding BACD as an expression of SL. We depict these steps next.
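A minimal sketch of the prefixation variant, again with the displayed definition (20) reconstructed from the prose: (20.1) every element of L is an expression of SL; (20.2) if z is an element of L and y is an expression of SL, then zy is an expression of SL.

```python
# A recognizer for SL based on the prefixation rule (20), reconstructed
# from the prose; compare it with the suffixation recognizer above.
L = {"A", "B", "C", "D"}

def in_SL_prefix(expr):
    if expr in L:                       # clause (20.1)
        return True
    if len(expr) > 1 and expr[0] in L:
        return in_SL_prefix(expr[1:])   # clause (20.2), read in reverse
    return False

print(in_SL_prefix("BACD"))  # True, via the steps D, CD, ACD, BACD
```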
Thus, as we just showed, the expression BACD is a member of SL, both by the recursive specification in (19) and by the recursive specification in (20). However, as the two diagrams make clear, each recursive specification generates the same expression in different ways. Thus, while BAC is a subexpression of BACD, according to the recursive specification in (19), it is not a subexpression of BACD according to the recursive specification in (20).
The form of recursive specification illustrated by (19) and (20) is far simpler than the form of recursive specification used by linguists to generate the expressions of a natural language. Two types of recursive specification are usually employed by linguists: that of a constituency grammar and that of a categorial grammar. The first arose out of the study of language itself, the other out of the study of logic. The first has its origins in Pāṇini’s Aṣṭādhyāyī, for a great number of its phonological and morphological rules are, to a very close approximation, instances of what are also called today context-sensitive rules (Staal 1965). Such rules were not used again in grammars for another two thousand five hundred years, when the American structuralist linguist Leonard Bloomfield (1933), who had closely studied Pāṇini’s grammar (Rogers 1987), rediscovered their utility, applying them not only in morphology and phonology, as Pāṇini had done, but in syntax as well. Their use was subsequently greatly developed by Zellig Harris (1946; 1951), Bernard Bloch (1946), Rulon Wells (1947), Eugene Nida (1948), and Charles Hockett (1954), among others. Shortly thereafter, Noam Chomsky, who had studied mathematical systems known as rewriting, or semi-Thue, systems (Chomsky 1956; 1959a; 1963), suggested that such systems be used to formalize immediate constituency analysis (Chomsky 1957).
Another way to specify recursively the expressions of natural language is to use what is known as the Lambek calculus (Lambek 1958), named after its discoverer, Joachim Lambek (1922–2014). The Lambek calculus is a generalization of categorial grammar, which Kazimierz Ajdukiewicz (1890–1963), drawing on ideas found in the Fourth Logical Investigation of Edmund Husserl (1859–1938), had devised to give a mathematical characterization of the notation of classical quantificational logic. Ajdukiewicz (1935) explains some aspects of the mathematics of the notation by pointing out, in an anecdotal way, some of its possible applications in the study of natural language structure. But Ajdukiewicz’s mention of natural language is completely opportunistic, for his concerns were entirely with logic. Yehoshua Bar-Hillel (1915–1975) was the first person to explore seriously how categorial grammar might be applied to the study of natural language (Bar-Hillel 1953).
We shall learn in later chapters how each of these two types of recursive specification can be applied in the analysis of the syntax of the expressions of natural language and how they relate to one another.
2.2.2 Valuation rules
Having given an idea of what recursion is and how it pertains to the analysis of the syntax of natural language, let us turn to what model theory is and how it pertains to the way in which the meanings of constituent expressions contribute to the meaning of the expression of which they are constituents. Linguists and philosophers refer to this property as compositionality.
Model theory is concerned, among other things, with the relation of symbols of logical notation to mathematical objects. The basic idea can be illustrated with the expressions of SL. Let us consider the question: are the expressions of SL sufficient to permit one to name each and every natural number, that is, to name 0, 1, 2, …? The answer is yes. It can be done by riding piggyback on the recursive specification of SL to assign values recursively to each expression of SL. Here is one way to do it. First, we assign 0 to A, 1 to B, 2 to C, and 3 to D. Next, we assign a value to a complex expression following its recursive specification. We shall call such rules valuation rules. As we shall see, each valuation rule is defined in light of an antecedently given formation rule.
To get a better idea of what valuation rules involve, let us consider an example. Recall the formation rule FRs defined in (19). Each complex expression comprises two immediate constituent expressions, a left-hand one and a right-hand one. The value assigned to the complex expression is the one obtained by multiplying the value assigned to the left-hand immediate constituent expression by 4 and adding to it the value of the right-hand immediate constituent expression. Let us call this assignment i. It is defined in two clauses. The first stipulates which values are to be assigned to the expressions in L: the expressions A, B, C, and D are assigned the values 0, 1, 2, and 3, respectively. The second states how, on the basis of values assigned to the parts of an expression, a value is to be assigned to the expression itself: if an expression consists of an expression y of SL followed by an expression z of L, then the value assigned to the complex expression yz is the sum of the value assigned to z and four times the value assigned to y. Using the notation i(x) = y to mean i assigns y to x, we can state the definition of i as follows:
To see how VRi works, let us notice some of its features. First, just as (19.1) specifies the elements whereby the expressions of SL are generated, so (21.1) specifies their values. Next, just as (19.2) specifies how a complex expression is constituted by two immediate constituent expressions, so (21.2) specifies how the value of a complex expression is determined by the values of its immediate constituent expressions. In brief, VRi (21) works in tandem with FRs (19).
Let us see how. Recall how BACD is generated by FRs (19). First, (19.1) states that the elements of L are expressions of SL. Thus, as we saw, B is an expression of SL. By (19.2), BA is an expression of SL. At the same time, (21.1) assigns 0 to A (that is, i(A) = 0) and 1 to B (that is, i(B) = 1). According to (21.2), the value assigned to BA is four times the value assigned to B plus the value of A, that is, 4 · i(B) + i(A), or 4 · 1 + 0, or 4. Since BA is an expression of SL and C is an element of L, according to (19.2), BAC is an expression of SL. According to (21.2), since BA is assigned the value 4 and C is assigned the value 2, BAC is assigned the value 4 · 4 + 2, or 18. Finally, since BAC is an expression of SL and D is an element of L, BACD is an expression of SL and its value is 4 · 18 + 3, or 75. All of this is nicely displayed in the following diagram.
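The diagram itself is not reproduced in this preview, but the computation can be replayed exactly. A minimal sketch, with the displayed rule (21) reconstructed from the prose: (21.1) assigns i(A) = 0, i(B) = 1, i(C) = 2, and i(D) = 3, and (21.2) assigns i(yz) = 4 · i(y) + i(z).

```python
# Valuation rule VRi, reconstructed from the prose statement of (21):
# (21.1) i(A) = 0, i(B) = 1, i(C) = 2, i(D) = 3;
# (21.2) i(yz) = 4 * i(y) + i(z), where y is in SL and z is in L.
BASE = {"A": 0, "B": 1, "C": 2, "D": 3}

def i(expr):
    if expr in BASE:                          # clause (21.1)
        return BASE[expr]
    return 4 * i(expr[:-1]) + BASE[expr[-1]]  # clause (21.2)

for prefix in ("B", "BA", "BAC", "BACD"):
    print(prefix, i(prefix))   # B 1, BA 4, BAC 18, BACD 75
```

In effect, VRi reads an expression of SL as a base-4 numeral, most significant letter first, which is why every natural number is denoted by some expression.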
Note that this diagram is just like the diagram used earlier to depict how FRs (19) generates the expression BACD, except the node labels indicating the categories of the various subexpressions have been replaced with labels indicating the values assigned to the various subexpressions.
The reader should be able to see that each expression of SL formed by FRs and assigned a value by VRi denotes a natural number and that each natural number is denoted by some expression of SL formed by the rule FRs and assigned a value by the rule VRi.
VRi is not the only way the expressions of SL formed by FRs can be assigned natural numbers as values in such a way that each expression denotes a natural number and each natural number is denoted by some expression. To see this, let us retain the recursive specification of the expressions of SL given by FRs (19), but we will replace the recursive specification of values given by VRi (21) with the one in (22). The specification in (22) assigns the very same values to the elements of L, but instead of multiplying the value of the left-hand immediate constituent expression by 4, it multiplies the value of the right-hand immediate constituent by 4 and then adds that value to the value of the left-hand immediate constituent expression.
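A minimal sketch of VRj, following the prose description just given (the displayed rule (22) is not reproduced in this preview): the base clause is the same as in (21.1), while (22.2) assigns j(yz) = j(y) + 4 · j(z).

```python
# Valuation rule VRj, reconstructed from the prose description of (22):
# (22.1) j(A) = 0, j(B) = 1, j(C) = 2, j(D) = 3;
# (22.2) j(yz) = j(y) + 4 * j(z), where y is in SL and z is in L.
BASE = {"A": 0, "B": 1, "C": 2, "D": 3}

def j(expr):
    if expr in BASE:                          # clause (22.1)
        return BASE[expr]
    return j(expr[:-1]) + 4 * BASE[expr[-1]]  # clause (22.2)

print(j("BA"))    # j(B) + 4 * j(A) = 1 + 0 = 1
print(j("BACD"))  # 21: the same string as before, but a different value
```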
Let us see what value VRj (22) assigns to BACD when it is generated by FRs (19). As before, (19.1) states that the elements of L are expressions of SL. So, B is an expression of SL. By (19.2), BA is an expression of SL too. As with (21.1), (22.1) assigns 0 to A (that is, j(A) = 0) and 1 to B (that is, j(B) = 1). According to (22.2), the value assigned to BA