Vision-language models (VLMs) excel in zero-shot recognition, but their performance varies greatly across different visual concepts. We investigate this critical yet long-neglected long-tailed issue of VLMs. Analyzing LAION-400M and LAION-2B, we identify visual concepts that are under-represented in VLM pretraining data, using large language models (LLMs) to estimate each concept's frequency by counting the pretraining texts that contain its synonyms.
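As a rough illustration of this counting step (not the paper's exact pipeline, which matches concepts more carefully than plain substring search), here is a sketch in which a hand-written synonym list stands in for the LLM query and a toy caption list stands in for LAION:

```python
def concept_frequency(concept, synonyms, captions):
    """Count captions mentioning the concept or any of its synonyms.

    Naive substring matching is a simplification; a real analysis must
    also handle plurals, word boundaries, and ambiguous names.
    """
    terms = [concept.lower()] + [s.lower() for s in synonyms]
    return sum(any(t in caption.lower() for t in terms) for caption in captions)

# Toy stand-in data; the paper runs this over LAION-400M/LAION-2B captions
# with LLM-generated synonyms for each concept.
captions = [
    "a photo of a tiger shark in the ocean",
    "sand tiger shark swimming near a reef",
    "a cute puppy on the grass",
]
print(concept_frequency("tiger shark", ["sand tiger shark"], captions))  # -> 2
```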
Code for the CVPR 2024 paper, "The Neglected Tails of Vision-Language Models", is available in the accompanying repository.
Our analysis confirms that popular datasets, such as LAION, exhibit a long-tailed concept distribution, yielding biased performance in VLMs. We also find that downstream applications of VLMs, including visual chatbots (e.g., GPT-4V) and text-to-image models (e.g., Stable Diffusion), often fail to recognize or generate images of the rare concepts identified by this analysis.
To address this imbalanced performance, prior work generally falls into two categories: Prompt Engineering (PE) and Test-Time Adaptation (TTA).
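For context, here is a minimal zero-shot classifier built the PE way, assuming the open_clip package; the model tag and the two-template ensemble are illustrative choices, not the paper's exact setup:

```python
import torch
import open_clip

# Illustrative checkpoint; any CLIP model exposes the same interface.
model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

def zero_shot_weights(classnames, templates):
    """Average normalized text embeddings over templates, one row per class."""
    rows = []
    with torch.no_grad():
        for name in classnames:
            prompts = [t.format(name) for t in templates]
            emb = model.encode_text(tokenizer(prompts))
            emb = emb / emb.norm(dim=-1, keepdim=True)
            rows.append(emb.mean(dim=0))
    return torch.stack(rows)

templates = ["a photo of a {}.", "a close-up photo of a {}."]
W = zero_shot_weights(["tiger shark", "golden retriever"], templates)
# Image features (from model.encode_image) @ W.T give per-class logits.
```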
To mitigate the imbalanced performance of zero-shot VLMs, the paper instead proposes REtrieval-Augmented Learning (REAL): it prompts VLMs with the most frequent synonyms of each concept found in the pretraining texts (REAL-Prompt), and trains a linear classifier on a small but balanced set of retrieved pretraining data (REAL-Linear).
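Under REAL-Prompt, the change to the PE sketch above is only in the class names: each concept is replaced by whichever of its synonyms is most frequent in the pretraining captions. A sketch reusing the hypothetical concept_frequency helper from earlier:

```python
def most_frequent_synonym(concept, synonyms, captions):
    """Return the name (original or synonym) most common in pretraining text."""
    candidates = [concept] + synonyms
    return max(candidates, key=lambda c: concept_frequency(c, [], captions))

# Build the zero-shot classifier from frequent names instead of raw labels:
# names = [most_frequent_synonym(c, syns[c], captions) for c in classnames]
# W = zero_shot_weights(names, templates)
```

This keeps inference cost identical to ordinary zero-shot prompting; only the prompt strings change.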