- abstract, March 2024
Scaling Up LLM Reviews for Google Ads Content Moderation
- Wei Qiao,
- Tushar Dogra,
- Otilia Stretcu,
- Yu-Han Lyu,
- Tiantian Fang,
- Dongjin Kwon,
- Chun-Ta Lu,
- Enming Luo,
- Yuan Wang,
- Chih-Chun Chia,
- Ariel Fuxman,
- Fangzhou Wang,
- Ranjay Krishna,
- Mehmet Tek
WSDM '24: Proceedings of the 17th ACM International Conference on Web Search and Data Mining, March 2024, Pages 1174–1175. https://doi.org/10.1145/3616855.3635736
Large language models (LLMs) are powerful tools for content moderation, but their inference costs and latency make them prohibitive for casual use on large datasets, such as the Google Ads repository. This study proposes a method for scaling up LLM ...
- research-article, May 2024
Large language model as attributed training data generator: a tale of diversity and bias
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, December 2023, Article No.: 2433, Pages 55734–55784
Large language models (LLMs) have been recently leveraged as training data generators for various natural language processing (NLP) tasks. While previous research has explored different approaches to training models using generated data, they generally ...
- research-article, May 2024
Cola: a benchmark for compositional text-to-image retrieval
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, December 2023, Article No.: 2014, Pages 46433–46445
Compositional reasoning is a hallmark of human visual intelligence. Yet despite the size of large vision-language models, they struggle to represent simple compositions by combining objects with their attributes. To measure this lack of compositional ...
- research-article, May 2024
Quilt-1M: one million image-text pairs for histopathology
- Wisdom O. Ikezogwo,
- Mehmet S. Seyfioglu,
- Fatemeh Ghezloo,
- Dylan Geva,
- Fatwir S. Mohammed,
- Pavan K. Anand,
- Ranjay Krishna,
- Linda G. Shapiro
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, December 2023, Article No.: 1654, Pages 37995–38017
Recent accelerations in multi-modal applications have been made possible with the plethora of image and text data available online. However, the scarcity of analogous data in the medical field, specifically in histopathology, has slowed comparable ...
- research-article, May 2024
SUGARCREPE: fixing hackable benchmarks for vision-language compositionality
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, December 2023, Article No.: 1355, Pages 31096–31116
In the last year alone, a surge of new benchmarks to measure compositional understanding of vision-language models has permeated the machine learning ecosystem. Given an image, these benchmarks probe a model's ability to identify its associated caption ...
- research-article, May 2024
DataComp: in search of the next generation of multimodal datasets
- Samir Yitzhak Gadre,
- Gabriel Ilharco,
- Alex Fang,
- Jonathan Hayase,
- Georgios Smyrnis,
- Thao Nguyen,
- Ryan Marten,
- Mitchell Wortsman,
- Dhruba Ghosh,
- Jieyu Zhang,
- Eyal Orgad,
- Rahim Entezari,
- Giannis Daras,
- Sarah Pratt,
- Vivek Ramanujan,
- Yonatan Bitton,
- Kalyani Marathe,
- Stephen Mussmann,
- Richard Vencu,
- Mehdi Cherti,
- Ranjay Krishna,
- Pang Wei Koh,
- Olga Saukh,
- Alexander Ratner,
- Shuran Song,
- Hannaneh Hajishirzi,
- Ali Farhadi,
- Romain Beaumont,
- Sewoong Oh,
- Alex Dimakis,
- Jenia Jitsev,
- Yair Carmon,
- Vaishaal Shankar,
- Ludwig Schmidt
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, December 2023, Article No.: 1179, Pages 27092–27112
Multimodal datasets are a critical component in recent breakthroughs such as CLIP, Stable Diffusion and GPT-4, yet their design does not receive the same research attention as model architectures or training algorithms. To address this shortcoming in the ...
- research-article, May 2024
OBJECT 3DIT: language-guided 3D-aware image editing
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems, December 2023, Article No.: 155, Pages 3497–3516
Existing image editing tools, while powerful, typically disregard the underlying 3D geometry from which the image is projected. As a result, edits made using these tools may become detached from the geometry and lighting conditions that are at the ...
VOCALExplore: Pay-as-You-Go Video Data Exploration and Model Building
- Maureen Daum,
- Enhao Zhang,
- Dong He,
- Stephen Mussmann,
- Brandon Haynes,
- Ranjay Krishna,
- Magdalena Balazinska
Proceedings of the VLDB Endowment (PVLDB), Volume 16, Issue 13, Pages 4188–4201. https://doi.org/10.14778/3625054.3625057
We introduce VOCALExplore, a system designed to support users in building domain-specific models over video datasets. VOCALExplore supports interactive labeling sessions and trains models using user-supplied labels. VOCALExplore maximizes model quality ...
EQUI-VOCAL Demonstration: Synthesizing Video Queries from User Interactions
Proceedings of the VLDB Endowment (PVLDB), Volume 16, Issue 12, Pages 3978–3981. https://doi.org/10.14778/3611540.3611600
We demonstrate EQUI-VOCAL, a system that synthesizes compositional queries over videos from user feedback. EQUI-VOCAL enables users to query a video database for complex events by providing a few positive and negative examples of what they are looking ...
EQUI-VOCAL: Synthesizing Queries for Compositional Video Events from Limited User Interactions
Proceedings of the VLDB Endowment (PVLDB), Volume 16, Issue 11, Pages 2714–2727. https://doi.org/10.14778/3611479.3611482
We introduce EQUI-VOCAL: a new system that automatically synthesizes queries over videos from limited user interactions. The user only provides a handful of positive and negative examples of what they are looking for. EQUI-VOCAL utilizes these initial ...
- research-article, April 2024
ELIGN: expectation alignment as a multi-agent intrinsic reward
NIPS '22: Proceedings of the 36th International Conference on Neural Information Processing Systems, November 2022, Article No.: 604, Pages 8304–8317
Modern multi-agent reinforcement learning frameworks rely on centralized training and reward shaping to perform well. However, centralized training and dense rewards are not readily available in the real world. Current multi-agent algorithms struggle to ...