Jun 24, 2024 · We propose a semi-automated pipeline for constructing cultural VLM benchmarks to enhance diversity and efficiency. This pipeline leverages human ...
Jun 27, 2024 · This pipeline leverages human-VLM collaboration, where VLMs generate questions based on guidelines, human-annotated examples, and image-wise ...
Jun 25, 2024 · The K-Viscuit benchmark is designed to test VLMs' ability to understand and interpret visual scenes in a culturally aware manner, going beyond ...
Evaluating Visual and Cultural Interpretation: The K-Viscuit Benchmark with Human-VLM Collaboration. Y Baek, CH Park, J Kim, YJ Heo, DS Chang, J Choo. arXiv ...
Evaluating Visual and Cultural Interpretation: The K-Viscuit Benchmark with Human-VLM Collaboration ... benchmark, CDEval, aimed at evaluating the cultural ...
Yujin Baek's 3 research works with 2 citations and 18 reads, including: Evaluating Visual and Cultural Interpretation: The K-Viscuit Benchmark with ...
Evaluating Visual and Cultural Interpretation: The K-Viscuit Benchmark with Human-VLM Collaboration. ... Criteria for Human-Compatible AI in Two-Player ...
Jul 8, 2024 · For example, the K-Viscuit benchmark evaluates how accurately the models can understand the cultural significance of images, while the See It ...
We propose a semi-automated pipeline for constructing cultural VLM benchmarks to enhance diversity and efficiency. Diversity · Visual Reasoning.