Current theories of auditory comprehension assume that the segmentation of speech into word forms is an essential prerequisite to understanding. We present a computational model that does not seek to learn word forms, but instead decodes the experiences discriminated by the speech input. At the heart of this model is a discrimination learning network trained on full utterances. This network constitutes an atemporal long-term memory system. A fixed-width short-term memory buffer projects a constantly updated moving window over the incoming speech onto the network's input layer. In response, the memory generates temporal activation functions for each of the output units. We show that this new discriminative perspective on auditory comprehension is consistent with young infants' sensitivity to the statistical structure of the input. Simulation studies, both with artificial language and with English child-directed speech, provide a first computational proof of concept and demonstrate the importance of utterance-wide co-learning.
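The learning scheme described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it assumes character bigrams as a stand-in for acoustic cues, a toy utterance set, and Rescorla-Wagner-style error-driven updates in which all outcomes co-learn on every utterance. The window width, learning rate, and cue coding are illustrative choices.

```python
# Hedged sketch of discriminative learning over a moving window of input.
# Cues: character bigrams inside a fixed-width window (an assumption, in
# place of real acoustic features). Outcomes: utterance-level experiences.
from collections import defaultdict

def window_cues(text, width):
    """Bigram cues inside a fixed-width window at the end of the input so far."""
    window = text[-width:]
    return [window[i:i + 2] for i in range(len(window) - 1)]

def train(utterances, width=6, rate=0.01, epochs=50):
    """Rescorla-Wagner updates: window cues predict the utterance's outcome;
    every outcome is updated on every utterance (utterance-wide co-learning)."""
    weights = defaultdict(float)              # (cue, outcome) -> weight
    outcomes = [o for _, o in utterances]
    for _ in range(epochs):
        for text, outcome in utterances:
            for t in range(2, len(text) + 1):     # input unfolds in time
                cues = window_cues(text[:t], width)
                for o in outcomes:
                    target = 1.0 if o == outcome else 0.0
                    act = sum(weights[(c, o)] for c in cues)
                    for c in cues:
                        weights[(c, o)] += rate * (target - act)
    return weights

def activation(weights, text, outcome, width=6):
    """Support for an outcome given the current window contents."""
    return sum(weights[(c, outcome)] for c in window_cues(text, width))
```

Note that no word boundaries are supplied anywhere: the network maps windows of unsegmented input directly onto outcomes, which is the point of the discriminative perspective.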
When asked to think about the subjective frequency of an n-gram (a group of n words), what properties of the n-gram influence the respondent? It has been recently shown that n-grams that occurred more frequently in a large corpus of English were read faster than n-grams that occurred less frequently (Arnon & Snider, 2010), an effect that is analogous to the frequency effects in word reading and lexical decision. The subjective frequency of words has also been extensively studied and linked to performance on linguistic tasks. We investigated the capacity of people to gauge the absolute and relative frequencies of n-grams. Subjective frequency ratings collected for 352 n-grams showed a strong correlation with corpus frequency, in particular for n-grams with the highest subjective frequency. These n-grams were then paired up and used in a relative frequency decision task (e.g. Is green hills more frequent than weekend trips?). Accuracy on this task was reliably above chance, and the tr...
Performance of HAL-like word space models on semantic clustering. Abstract: A recent implementation of a HAL-like word space model called HiDEx was used to create vector representations of nouns and verbs. As proposed by the organizers of the Lexical Semantics Workshop (part of ...
HAL (Hyperspace Analog to Language) is a high-dimensional model of semantic space that uses the global co-occurrence frequency of words in a large corpus of text as the basis for a representation of semantic memory. In the original version of the HAL model, many of its parameters ...
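The co-occurrence mechanism at the core of HAL can be sketched as follows. This is a simplified illustration only: the toy corpus and window size are invented, the linear distance weighting is HAL's classic ramp, and the many parameters that HiDEx exposes (window shape, normalization, and so on) are omitted.

```python
# Minimal sketch of HAL-style co-occurrence vectors: for each word,
# accumulate distance-weighted counts of the words preceding it within
# a sliding window; closer neighbours receive larger weights.
from collections import defaultdict

def hal_vectors(tokens, window=5):
    """Return word -> {context word -> weighted co-occurrence count}."""
    vec = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for d in range(1, window + 1):    # distance back from the word
            j = i - d
            if j < 0:
                break
            vec[word][tokens[j]] += window + 1 - d   # ramp: window..1
    return vec
```

Each word's row of this matrix serves as its semantic vector; similarity between words is then typically measured by a distance metric between rows.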
ABSTRACT: What knowledge influences our choice of words when we write or speak? Predicting which word a person will produce next is not easy, even when the linguistic context is known. One task that has been used to assess context-dependent word choice is the fill-in-the-blank task, also called the cloze task. The cloze probability of a specific context is an empirical measure found by asking many people to fill in the blank. In this paper we harness the power of large corpora to look at the influence of corpus-derived probabilistic information from a word's micro-context on word choice. We asked young adults to complete short phrases called n-grams, with up to 20 responses per phrase. The probability of the response word and the conditional probability of the response given the context were predictive of the frequency with which each response was produced. Furthermore, the order in which participants generated multiple completions of the same context was also predicted by the conditional probability. These results suggest that word choice in cloze tasks taps into implicit knowledge of a person's past experience with that word in various contexts. Moreover, the importance of n-gram conditional probabilities in our analysis is further evidence of implicit knowledge about multi-word sequences and supports theories of language processing that involve anticipating or predicting based on context.
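The corpus-derived quantity at the centre of this analysis, the conditional probability of a completion given its context, can be estimated directly from n-gram counts. The sketch below uses invented toy counts purely for illustration; the function names and data layout are assumptions, not the study's actual pipeline.

```python
# Illustrative estimate of P(completion | context) from n-gram counts:
# count(context + completion) / count(context).
def cloze_conditional(ngram_counts, context_counts, context, completion):
    """Conditional probability of a completion given its context,
    as a ratio of corpus counts; 0.0 when the context is unseen."""
    joint = ngram_counts.get(context + (completion,), 0)
    total = context_counts.get(context, 0)
    return joint / total if total else 0.0
```

Under the account above, completions with higher conditional probability should be produced more often, and earlier, when participants generate multiple completions for the same context.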
Papers by Cyrus Shaoul