- abstract, March 2024
Scaling Up LLM Reviews for Google Ads Content Moderation
- Wei Qiao,
- Tushar Dogra,
- Otilia Stretcu,
- Yu-Han Lyu,
- Tiantian Fang,
- Dongjin Kwon,
- Chun-Ta Lu,
- Enming Luo,
- Yuan Wang,
- Chih-Chun Chia,
- Ariel Fuxman,
- Fangzhou Wang,
- Ranjay Krishna,
- Mehmet Tek
WSDM '24: Proceedings of the 17th ACM International Conference on Web Search and Data Mining, March 2024, Pages 1174–1175. https://doi.org/10.1145/3616855.3635736
Large language models (LLMs) are powerful tools for content moderation, but their inference costs and latency make them prohibitive for casual use on large datasets, such as the Google Ads repository. This study proposes a method for scaling up LLM ...
- research-article, May 2024
Large language model as attributed training data generator: a tale of diversity and bias
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, December 2023, Article No.: 2433, Pages 55734–55784
Large language models (LLMs) have been recently leveraged as training data generators for various natural language processing (NLP) tasks. While previous research has explored different approaches to training models using generated data, they generally ...
- research-article, May 2024
Cola: a benchmark for compositional text-to-image retrieval
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, December 2023, Article No.: 2014, Pages 46433–46445
Compositional reasoning is a hallmark of human visual intelligence. Yet despite the size of large vision-language models, they struggle to represent simple compositions by combining objects with their attributes. To measure this lack of compositional ...
- research-article, May 2024
Quilt-1M: one million image-text pairs for histopathology
- Wisdom O. Ikezogwo,
- Mehmet S. Seyfioglu,
- Fatemeh Ghezloo,
- Dylan Geva,
- Fatwir S. Mohammed,
- Pavan K. Anand,
- Ranjay Krishna,
- Linda G. Shapiro
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, December 2023, Article No.: 1654, Pages 37995–38017
Recent accelerations in multi-modal applications have been made possible with the plethora of image and text data available online. However, the scarcity of analogous data in the medical field, specifically in histopathology, has slowed comparable ...
- research-article, May 2024
SUGARCREPE: fixing hackable benchmarks for vision-language compositionality
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, December 2023, Article No.: 1355, Pages 31096–31116
In the last year alone, a surge of new benchmarks to measure compositional understanding of vision-language models has permeated the machine learning ecosystem. Given an image, these benchmarks probe a model's ability to identify its associated caption ...
- research-article, May 2024
DataComp: in search of the next generation of multimodal datasets
- Samir Yitzhak Gadre,
- Gabriel Ilharco,
- Alex Fang,
- Jonathan Hayase,
- Georgios Smyrnis,
- Thao Nguyen,
- Ryan Marten,
- Mitchell Wortsman,
- Dhruba Ghosh,
- Jieyu Zhang,
- Eyal Orgad,
- Rahim Entezari,
- Giannis Daras,
- Sarah Pratt,
- Vivek Ramanujan,
- Yonatan Bitton,
- Kalyani Marathe,
- Stephen Mussmann,
- Richard Vencu,
- Mehdi Cherti,
- Ranjay Krishna,
- Pang Wei Koh,
- Olga Saukh,
- Alexander Ratner,
- Shuran Song,
- Hannaneh Hajishirzi,
- Ali Farhadi,
- Romain Beaumont,
- Sewoong Oh,
- Alex Dimakis,
- Jenia Jitsev,
- Yair Carmon,
- Vaishaal Shankar,
- Ludwig Schmidt
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, December 2023, Article No.: 1179, Pages 27092–27112
Multimodal datasets are a critical component in recent breakthroughs such as CLIP, Stable Diffusion and GPT-4, yet their design does not receive the same research attention as model architectures or training algorithms. To address this shortcoming in the ...
- research-article, May 2024
OBJECT 3DIT: language-guided 3D-aware image editing
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, December 2023, Article No.: 155, Pages 3497–3516
Existing image editing tools, while powerful, typically disregard the underlying 3D geometry from which the image is projected. As a result, edits made using these tools may become detached from the geometry and lighting conditions that are at the ...
VOCALExplore: Pay-as-You-Go Video Data Exploration and Model Building
- Maureen Daum,
- Enhao Zhang,
- Dong He,
- Stephen Mussmann,
- Brandon Haynes,
- Ranjay Krishna,
- Magdalena Balazinska
Proceedings of the VLDB Endowment (PVLDB), Volume 16, Issue 13, Pages 4188–4201. https://doi.org/10.14778/3625054.3625057
We introduce VOCALExplore, a system designed to support users in building domain-specific models over video datasets. VOCALExplore supports interactive labeling sessions and trains models using user-supplied labels. VOCALExplore maximizes model quality ...
EQUI-VOCAL Demonstration: Synthesizing Video Queries from User Interactions
Proceedings of the VLDB Endowment (PVLDB), Volume 16, Issue 12, Pages 3978–3981. https://doi.org/10.14778/3611540.3611600
We demonstrate EQUI-VOCAL, a system that synthesizes compositional queries over videos from user feedback. EQUI-VOCAL enables users to query a video database for complex events by providing a few positive and negative examples of what they are looking ...
EQUI-VOCAL: Synthesizing Queries for Compositional Video Events from Limited User Interactions
Proceedings of the VLDB Endowment (PVLDB), Volume 16, Issue 11, Pages 2714–2727. https://doi.org/10.14778/3611479.3611482
We introduce EQUI-VOCAL: a new system that automatically synthesizes queries over videos from limited user interactions. The user only provides a handful of positive and negative examples of what they are looking for. EQUI-VOCAL utilizes these initial ...
- research-article, April 2024
ELIGN: expectation alignment as a multi-agent intrinsic reward
NIPS '22: Proceedings of the 36th International Conference on Neural Information Processing Systems, November 2022, Article No.: 604, Pages 8304–8317
Modern multi-agent reinforcement learning frameworks rely on centralized training and reward shaping to perform well. However, centralized training and dense rewards are not readily available in the real world. Current multi-agent algorithms struggle to ...