Papers by Carina Kauf

Representations from artificial neural network (ANN) language models have been shown to predict human brain activity in the language network. To understand what aspects of linguistic stimuli contribute to ANN-to-brain similarity, we used an fMRI dataset of responses to n=627 naturalistic English sentences (Pereira et al., 2018) and systematically manipulated the stimuli for which ANN representations were extracted. In particular, we i) perturbed sentences’ word order, ii) removed different subsets of words, or iii) replaced sentences with other sentences of varying semantic similarity. We found that the lexical semantic content of the sentence (largely carried by content words), rather than the sentence’s syntactic form (conveyed via word order or function words), is primarily responsible for the ANN-to-brain similarity. In follow-up analyses, we found that perturbation manipulations that adversely affect brain predictivity also lead to more divergent representations in the ANN’s embedding space…
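To make the perturbation logic concrete, here is a minimal Python sketch of manipulations i) and ii); the helper names and the function-word list are illustrative stand-ins, not the paper's code, which additionally re-extracted ANN representations for each perturbed stimulus and refit the encoding models.

```python
import random

# Illustrative stand-ins only; the paper's actual stimulus pipeline differs.
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "or",
                  "is", "are", "was", "were", "it", "that", "with", "by"}

def scramble_word_order(sentence: str, seed: int = 0) -> str:
    """Manipulation i): randomly permute the sentence's words."""
    words = sentence.split()
    random.Random(seed).shuffle(words)
    return " ".join(words)

def content_words_only(sentence: str) -> str:
    """Manipulation ii): drop function words, keeping the lexical semantic content."""
    return " ".join(w for w in sentence.split()
                    if w.lower().strip(".,!?") not in FUNCTION_WORDS)

def function_words_only(sentence: str) -> str:
    """Manipulation ii), inverted: drop content words, keeping only the syntactic frame."""
    return " ".join(w for w in sentence.split()
                    if w.lower().strip(".,!?") in FUNCTION_WORDS)

sentence = "The dog chased a ball across the yard."
print(scramble_word_order(sentence))   # random permutation of the words
print(content_words_only(sentence))    # "dog chased ball across yard."
print(function_words_only(sentence))   # "The a the"
```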
The neuroscience of perception has recently been revolutionized by an integrative reverse-engineering approach in which computation, brain function, and behavior are linked across many different datasets and many computational models. We here present a first systematic study taking this approach into higher-level cognition: human language processing, our species’ signature cognitive skill. We find that the most powerful ‘transformer’ networks predict neural responses at nearly 100% and generalize across different datasets and data types (fMRI, ECoG). Across models, significant correlations are observed among all three metrics of performance: neural fit, fit to behavioral responses, and accuracy on the next-word prediction task (but not other language tasks), consistent with the long-standing hypothesis that the brain’s language system is optimized for predictive processing. Model architectures with initial weights further perform surprisingly similarly to final trained models…
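The cross-model correlation analysis can be schematized as below; the scores are synthetic stand-ins generated for illustration, not the study's data, and the variable names are paraphrases of the three metrics listed above.

```python
import numpy as np
from scipy.stats import pearsonr

# Synthetic per-model scores for illustration only; in the study, each model
# contributes one neural-fit score, one behavioral-fit score, and one
# next-word-prediction accuracy.
rng = np.random.default_rng(0)
n_models = 40
next_word_acc = rng.uniform(0.2, 0.6, n_models)
neural_fit = 0.8 * next_word_acc + rng.normal(0.0, 0.05, n_models)
behavioral_fit = 0.7 * next_word_acc + rng.normal(0.0, 0.05, n_models)

for name, scores in [("neural fit", neural_fit), ("behavioral fit", behavioral_fit)]:
    r, p = pearsonr(next_word_acc, scores)
    print(f"next-word accuracy vs. {name}: r = {r:.2f}, p = {p:.2g}")
```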
Counteridenticals are counterfactual conditional sentences whose antecedent clauses contain an identity statement, e.g. If I were you, I’d buy the blue dress. Here, we argue that counteridenticals are best analyzed along the lines of dream reports. After showing that counteridenticals and dream reports exhibit striking grammatical and perceptual parallels, we suggest an analysis of counteridenticals modeled on Percus and Sauerland’s (2003) analysis of dream reports. Following their proposal, we make use of concept generators, realized as centered worlds. To this end, we argue that the presence of if licenses the presence of an imagine-operator, which constitutes the attitude under which the antecedent clause ‘x be-PAST y’ is taken; the speaker predicates, in the imagine mode, the consequent property of his/her imagined self. To capture the different degrees of identification between the subject and the predicate of the identity statement of counteridenticals’ antecedents observed in the literature…
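As a schematic rendering of the proposal's shape (the notation is mine, assuming Percus & Sauerland-style concept generators; it is not the paper's actual formalism), If I were you, I'd buy the blue dress comes out roughly as:

```latex
% Schematic only: G is a concept generator; the imagine-operator quantifies
% over centered worlds <w', x'>, where x' is the speaker's imagined self,
% identified with the addressee's counterpart via G.
\[
\textsc{imagine}_{\,\text{sp}}\Big(\lambda \langle w', x'\rangle \,.\;
  x' = G(\text{you})(w') \;\wedge\;
  \text{buy}\big(x',\, \iota z.\,\text{blue-dress}(z,w'),\, w'\big)\Big)
\]
```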
In this paper, we propose a semantics for (the highest instance of) past tense in a syntactic domain that is essentially modal and not strictly temporal. Given this asymmetry, we are able to account for the fact that, once embedded under another modal, past tense morphology can receive a modal interpretation and is not an inherent time shifter. This naturally derives the syntax of counterfactual if and wish clauses. Overgeneration of modal readings in other modal contexts is ruled out by means of pragmatic competition with present tense morphology.
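The phenomenon at issue is the well-known 'fake' past of counterfactuals; a standard minimal pair (my illustration, not an example from the paper) makes the asymmetry concrete:

```latex
% (a) past morphology shifts the time; (b) the same morphology, embedded in a
% counterfactual, co-occurs with `tomorrow' and so cannot be shifting the
% time -- on the proposal above, it shifts the modal (world) parameter instead.
\begin{tabular}{ll}
(a) & If it rained yesterday, the game was cancelled. \\
(b) & If it rained tomorrow, the game would be cancelled.
\end{tabular}
```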
Past-under-past embeddings have two readings, a simultaneous and a backward-shifted one. While existing accounts derive these readings via distinct mechanisms, be it by means of an ambiguity at the level of LF or via blocking of a cessation implicature, we propose an alternative account which avoids such ambiguity. For us, the meaning of a past tense morpheme, like -ed, comprises two components. Syntactically, every past tense morpheme carries an uninterpretable past feature [uPAST], to be checked by a (single) covert past tense operator Op-PAST carrying an interpretable feature [iPAST]. Semantically, the past tense marker encodes a relative non-future with respect to its closest c-commanding tense node (informally: ‘not later than’), immediately yielding the two distinct readings.
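The 'not later than' gloss can be given a minimal formal rendering (the notation is mine; the paper's actual denotation may differ):

```latex
% Relative non-future: the embedded tense is located no later than the time
% introduced by its closest c-commanding tense node.
\[
[\![\,\text{-ed}\,]\!] \;\approx\; \lambda t\,.\,\lambda t'.\;\; t' \leq t
\]
% For `John said(t1) that Mary was(t2) sick': t2 <= t1, which resolves either
% as t2 = t1 (simultaneous reading) or t2 < t1 (backward-shifted reading).
```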
In this paper, I argue that counteridenticals are best analyzed along the lines of dream reports. The analysis opposes existing proposals of counteridentical meaning (Lakoff 1996; Kocurek 2016), both of which constitute variations of Lewis’s (1973) counterpart theory. First, I show that counteridenticals and dream reports exhibit striking grammatical as well as perceptual parallels. Then, I suggest an analysis of counteridenticals on a par with Percus and Sauerland’s (2003; henceforward P&S) analysis of dream reports. In contrast to the existing theories, this proposal is able to account for the correlations between the two linguistic structures. Counteridenticals and dream reports exhibit at least four parallels with regard to their grammatical and perceptual make-up. Some of these correlations have already been noted in the literature by Arregui (2007), and this paper provides two novel arguments in favor of an analysis which treats the two constructions on a par. 1. Both allo…