Showing 1–13 of 13 results for author: Kervadec, C

  1. arXiv:2405.15471  [pdf, other]

    cs.CL

    Emergence of a High-Dimensional Abstraction Phase in Language Transformers

    Authors: Emily Cheng, Diego Doimo, Corentin Kervadec, Iuri Macocco, Jade Yu, Alessandro Laio, Marco Baroni

    Abstract: A language model (LM) is a mapping from a linguistic context to an output token. However, much remains to be known about this mapping, including how its geometric properties relate to its function. We take a high-level geometric approach to its analysis, observing, across five pre-trained transformer-based LMs and three input datasets, a distinct phase characterized by high intrinsic dimensionalit…

    Submitted 24 May, 2024; originally announced May 2024.

  2. arXiv:2310.15829  [pdf, other]

    cs.CL

    Unnatural language processing: How do language models handle machine-generated prompts?

    Authors: Corentin Kervadec, Francesca Franzon, Marco Baroni

    Abstract: Language model prompt optimization research has shown that semantically and grammatically well-formed manually crafted prompts are routinely outperformed by automatically generated token sequences with no apparent meaning or syntactic structure, including sequences of vectors from a model's embedding space. We use machine-generated prompts to probe how models respond to input that is not composed…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: Findings of EMNLP 2023 Camera-Ready

  3. arXiv:2310.13620  [pdf, other]

    cs.CL

    Bridging Information-Theoretic and Geometric Compression in Language Models

    Authors: Emily Cheng, Corentin Kervadec, Marco Baroni

    Abstract: For a language model (LM) to faithfully model human language, it must compress vast, potentially infinite information into relatively few dimensions. We propose analyzing compression in (pre-trained) LMs from two points of view: geometric and information-theoretic. We demonstrate that the two views are highly correlated, such that the intrinsic geometric dimension of linguistic data predicts their…

    Submitted 9 November, 2023; v1 submitted 20 October, 2023; originally announced October 2023.

    Comments: EMNLP 2023 Camera-Ready

  4. arXiv:2202.06858  [pdf, other]

    cs.CV

    An experimental study of the vision-bottleneck in VQA

    Authors: Pierre Marza, Corentin Kervadec, Grigory Antipov, Moez Baccouche, Christian Wolf

    Abstract: As in many tasks combining vision and language, both modalities play a crucial role in Visual Question Answering (VQA). To properly solve the task, a given model should both understand the content of the proposed image and the nature of the question. While the fusion between modalities, which is another obviously important part of the problem, has been highly studied, the vision part has received…

    Submitted 14 February, 2022; originally announced February 2022.

  5. arXiv:2106.05597  [pdf, other]

    cs.CV cs.LG

    Supervising the Transfer of Reasoning Patterns in VQA

    Authors: Corentin Kervadec, Christian Wolf, Grigory Antipov, Moez Baccouche, Madiha Nadri

    Abstract: Methods for Visual Question Answering (VQA) are notorious for leveraging dataset biases rather than performing reasoning, hindering generalization. It has been recently shown that better reasoning patterns emerge in attention layers of a state-of-the-art VQA model when they are trained on perfect (oracle) visual inputs. This provides evidence that deep neural networks can learn to reason when train…

    Submitted 10 June, 2021; originally announced June 2021.

  6. arXiv:2104.03656  [pdf, other]

    cs.CV

    How Transferable are Reasoning Patterns in VQA?

    Authors: Corentin Kervadec, Theo Jaunet, Grigory Antipov, Moez Baccouche, Romain Vuillemot, Christian Wolf

    Abstract: Since its inception, Visual Question Answering (VQA) has been notorious as a task in which models are prone to exploit biases in datasets to find shortcuts instead of performing high-level reasoning. Classical methods address this by removing biases from training data, or adding branches to models to detect and remove biases. In this paper, we argue that uncertainty in vision is a dominating facto…

    Submitted 8 April, 2021; originally announced April 2021.

  7. arXiv:2104.00926  [pdf, other]

    cs.CV cs.HC

    VisQA: X-raying Vision and Language Reasoning in Transformers

    Authors: Theo Jaunet, Corentin Kervadec, Romain Vuillemot, Grigory Antipov, Moez Baccouche, Christian Wolf

    Abstract: Visual Question Answering systems target answering open-ended textual questions given input images. They are a testbed for learning high-level reasoning with a primary use in HCI, for instance assistance for the visually impaired. Recent research has shown that state-of-the-art models tend to produce answers exploiting biases and shortcuts in the training data, and sometimes do not even look at th…

    Submitted 20 July, 2021; v1 submitted 2 April, 2021; originally announced April 2021.

  8. arXiv:2006.05726  [pdf, other]

    cs.CV cs.CL

    Estimating semantic structure for the VQA answer space

    Authors: Corentin Kervadec, Grigory Antipov, Moez Baccouche, Christian Wolf

    Abstract: Since its appearance, Visual Question Answering (VQA, i.e. answering a question posed over an image) has always been treated as a classification problem over a set of predefined answers. Despite its convenience, this classification approach poorly reflects the semantics of the problem, limiting the answering to a choice between independent proposals, without taking into account the similarity betw…

    Submitted 8 April, 2021; v1 submitted 10 June, 2020; originally announced June 2020.

    Comments: [WARNING] We want to notify the reader that additional experiments (not in the paper) have shown that using a 'random' semantic space performs as well as the proposed semantic loss. This additional result calls into question the effectiveness of our method.

  9. arXiv:2006.05121  [pdf, other]

    cs.CV

    Roses Are Red, Violets Are Blue... but Should VQA Expect Them To?

    Authors: Corentin Kervadec, Grigory Antipov, Moez Baccouche, Christian Wolf

    Abstract: Models for Visual Question Answering (VQA) are notorious for their tendency to rely on dataset biases, as the large and unbalanced diversity of questions and concepts involved tends to prevent models from learning to reason, leading them to perform educated guesses instead. In this paper, we claim that the standard evaluation metric, which consists in measuring the overall in-domain accuracy,…

    Submitted 7 April, 2021; v1 submitted 9 June, 2020; originally announced June 2020.

  10. arXiv:1912.03063  [pdf, other]

    cs.CV cs.CL cs.LG cs.NE

    Weak Supervision helps Emergence of Word-Object Alignment and improves Vision-Language Tasks

    Authors: Corentin Kervadec, Grigory Antipov, Moez Baccouche, Christian Wolf

    Abstract: The wide adoption of self-attention (i.e. the transformer model) and BERT-like training principles has recently resulted in a number of high-performing models on a large panoply of vision-and-language problems (such as Visual Question Answering (VQA), image retrieval, etc.). In this paper we claim that these State-Of-The-Art (SOTA) approaches perform reasonably well in structuring information ins…

    Submitted 6 December, 2019; originally announced December 2019.

  11. arXiv:1810.13197  [pdf, other]

    cs.NE cs.AI cs.CV

    The Many Moods of Emotion

    Authors: Valentin Vielzeuf, Corentin Kervadec, Stéphane Pateux, Frédéric Jurie

    Abstract: This paper presents a novel approach to the facial expression generation problem. Building upon the assumption of the psychological community that emotion is intrinsically continuous, we first design our own continuous emotion representation with a 3-dimensional latent space issued from a neural network trained on discrete emotion classification. The so-obtained representation can be used to annot…

    Submitted 31 October, 2018; originally announced October 2018.

  12. arXiv:1808.02668  [pdf, other]

    cs.AI cs.CV cs.NE stat.ML

    An Occam's Razor View on Learning Audiovisual Emotion Recognition with Small Training Sets

    Authors: Valentin Vielzeuf, Corentin Kervadec, Stéphane Pateux, Alexis Lechervy, Frédéric Jurie

    Abstract: This paper presents a light-weight and accurate deep neural model for audiovisual emotion recognition. To design this model, the authors followed a philosophy of simplicity, drastically limiting the number of parameters to learn from the target datasets, always choosing the simplest learning methods: i) transfer learning and low-dimensional space embedding allow reducing the dimensionality of t…

    Submitted 8 August, 2018; originally announced August 2018.

    Journal ref: ICMI (EmotiW) 2018, Oct 2018, Boulder, Colorado, United States

  13. arXiv:1807.11215  [pdf, other]

    cs.AI cs.CV cs.NE

    CAKE: Compact and Accurate K-dimensional representation of Emotion

    Authors: Corentin Kervadec, Valentin Vielzeuf, Stéphane Pateux, Alexis Lechervy, Frédéric Jurie

    Abstract: Numerous models describing human emotional states have been built by the psychology community. Alongside, Deep Neural Networks (DNN) are reaching excellent performance and are becoming interesting feature extraction tools in many computer vision tasks. Inspired by works from the psychology community, we first study the link between the compact two-dimensional representation of the emotion kno…

    Submitted 3 August, 2018; v1 submitted 30 July, 2018; originally announced July 2018.

    Journal ref: Image Analysis for Human Facial and Activity Recognition (BMVC Workshop), Sep 2018, Newcastle, United Kingdom. http://juz-dev.myweb.port.ac.uk/BMVCWorkshop/index.html