Showing 1–1 of 1 results for author: Darrel, T

Searching in archive cs.
  1. arXiv:2406.08164  [pdf, other]

    cs.CV

    ConMe: Rethinking Evaluation of Compositional Reasoning for Modern VLMs

    Authors: Irene Huang, Wei Lin, M. Jehanzeb Mirza, Jacob A. Hansen, Sivan Doveh, Victor Ion Butoi, Roei Herzig, Assaf Arbelle, Hilde Kuehne, Trevor Darrel, Chuang Gan, Aude Oliva, Rogerio Feris, Leonid Karlinsky

    Abstract: Compositional Reasoning (CR) entails grasping the significance of attributes, relations, and word order. Recent Vision-Language Models (VLMs), comprising a visual encoder and a Large Language Model (LLM) decoder, have demonstrated remarkable proficiency in such reasoning tasks. This prompts a crucial question: have VLMs effectively tackled the CR challenge? We conjecture that existing CR benchmark…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: The first three authors contributed equally