I am a PhD student at the Robotics Institute, Carnegie Mellon University (CMU), advised by Jun-Yan Zhu. I collaborate closely with
Richard Zhang and Eli Shechtman. My research interests lie in computer vision, specifically generative models, model customization, and few-shot learning.
Prior to CMU, I worked at Media and Data Science Research, Adobe India, where I had the pleasure of
collaborating with Vineeth N Balasubramanian. I completed my undergraduate degree at the Indian Institute of Technology Delhi with a major in Mathematics and Computing.
We propose Custom Diffusion-360, a method to add object viewpoint control when personalizing text-to-image diffusion models, e.g., Stable Diffusion-XL, given multi-view images of the new object.
We propose Custom Diffusion, a method to fine-tune large-scale text-to-image diffusion models, e.g., Stable Diffusion, given a few
(~4-20) user-provided images of a new concept.
Our method is computationally efficient (~6 minutes on 2 A100 GPUs) and requires only 75MB of storage for
each additional concept model on top of the pretrained model.
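For the curious, a minimal sketch of what such fine-tuning can look like with Hugging Face diffusers is below. The checkpoint name, the concept_images / concept_prompt inputs, and the step count are illustrative placeholders; updating only the cross-attention key/value projections is what keeps per-concept storage small.

    import torch
    import torch.nn.functional as F
    from diffusers import StableDiffusionPipeline, DDPMScheduler

    # Assumes concept_images is a (B, 3, H, W) tensor in [-1, 1] and
    # concept_prompt is a text description of the new concept.
    pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
    unet, vae, text_encoder = pipe.unet, pipe.vae, pipe.text_encoder
    scheduler = DDPMScheduler.from_config(pipe.scheduler.config)

    # Train only the cross-attention key/value projections; freeze the rest.
    params = []
    for name, p in unet.named_parameters():
        trainable = "attn2.to_k" in name or "attn2.to_v" in name
        p.requires_grad_(trainable)
        if trainable:
            params.append(p)
    opt = torch.optim.AdamW(params, lr=1e-5)

    with torch.no_grad():
        tokens = pipe.tokenizer(concept_prompt, padding="max_length",
                                max_length=pipe.tokenizer.model_max_length,
                                return_tensors="pt")
        text_emb = text_encoder(tokens.input_ids)[0]

    for step in range(250):  # illustrative step count
        with torch.no_grad():
            latents = vae.encode(concept_images).latent_dist.sample()
            latents = latents * vae.config.scaling_factor
        noise = torch.randn_like(latents)
        t = torch.randint(0, scheduler.config.num_train_timesteps, (latents.shape[0],))
        noisy = scheduler.add_noise(latents, noise, t)
        cond = text_emb.expand(latents.shape[0], -1, -1)
        pred = unet(noisy, t, encoder_hidden_states=cond).sample
        loss = F.mse_loss(pred, noise)  # standard denoising objective
        loss.backward()
        opt.step()
        opt.zero_grad()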
We propose a method to ablate (remove) copyrighted materials and memorized images from pretrained
text-to-image generative models. Our algorithm changes the target concept distribution to an anchor
concept, e.g., Van Gogh paintings to generic paintings, or Grumpy Cat to cats.
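A sketch of the ablation objective, reusing the unet, vae, and scheduler from the fine-tuning sketch above: the trainable model's noise prediction on the target prompt is pushed toward a frozen copy's prediction on the anchor prompt. Here embed and anchor_images are hypothetical placeholders for the prompt-encoding call and a batch of anchor-concept images.

    import copy
    import torch
    import torch.nn.functional as F

    frozen_unet = copy.deepcopy(unet).requires_grad_(False)  # reference model

    target_emb = embed("painting in the style of Van Gogh")  # concept to remove
    anchor_emb = embed("painting")                           # concept to map it to

    with torch.no_grad():
        latents = vae.encode(anchor_images).latent_dist.sample()
        latents = latents * vae.config.scaling_factor
    noise = torch.randn_like(latents)
    t = torch.randint(0, scheduler.config.num_train_timesteps, (latents.shape[0],))
    noisy = scheduler.add_noise(latents, noise, t)

    # Match the target-prompt prediction to the frozen anchor-prompt prediction,
    # so prompts for the ablated concept now generate the anchor concept instead.
    with torch.no_grad():
        anchor_pred = frozen_unet(noisy, t, encoder_hidden_states=anchor_emb).sample
    target_pred = unet(noisy, t, encoder_hidden_states=target_emb).sample
    loss = F.mse_loss(target_pred, anchor_pred)  # minimized as in the loop above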
We show that pretrained computer vision models can significantly improve GAN performance when used in an
ensemble of discriminators.
Our method improves FID by 1.5x to 2x on the cat, church, and horse categories of LSUN.
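A minimal sketch of one such vision-aided discriminator in PyTorch/torchvision: a frozen ImageNet-pretrained backbone supplies features and only a small real/fake head is trained. The full method ensembles several such heads with the original GAN discriminator; all names here are illustrative, not the paper's code.

    import torch
    import torch.nn as nn
    import torchvision.models as tvm

    class VisionAidedD(nn.Module):
        def __init__(self):
            super().__init__()
            backbone = tvm.resnet50(weights=tvm.ResNet50_Weights.IMAGENET1K_V2)
            self.features = nn.Sequential(*list(backbone.children())[:-1])  # drop fc
            for p in self.features.parameters():
                p.requires_grad_(False)  # keep pretrained features frozen
            self.head = nn.Linear(2048, 1)  # small trainable real/fake head

        def forward(self, x):
            with torch.no_grad():
                f = self.features(x).flatten(1)
            return self.head(f)

    def ensemble_logits(discriminators, x):
        # One logit per ensemble member; each gets its own GAN loss term.
        return [d(x) for d in discriminators]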
We propose ART, a robust attribution training methodology that maximizes the alignment between
the input and its attribution map.
ART achieves state-of-the-art performance in attributional robustness and in weakly supervised
object localization on the CUB dataset.
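As a rough illustration of the alignment idea (a simplified stand-in, not ART's exact objective), one can regularize standard training with the cosine similarity between the image and its input-gradient attribution; art_step and the weight lam are hypothetical names.

    import torch
    import torch.nn.functional as F

    def art_step(model, x, y, lam=1.0):
        x = x.clone().requires_grad_(True)
        logits = model(x)
        ce = F.cross_entropy(logits, y)
        # Attribution: gradient of the true-class score w.r.t. the input.
        score = logits.gather(1, y[:, None]).sum()
        attr = torch.autograd.grad(score, x, create_graph=True)[0]
        # Encourage the attribution map to align with the input itself.
        align = F.cosine_similarity(attr.flatten(1), x.flatten(1)).mean()
        return ce - lam * align  # minimize CE while maximizing alignment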
We use self-supervision techniques (rotation prediction and exemplar) followed by manifold mixup for few-shot
classification tasks.
The proposed approach beats the previous state-of-the-art accuracy on the mini-ImageNet, CUB, and CIFAR-FS
datasets by 3-8%.
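A minimal sketch of two of the ingredients (rotation self-supervision and manifold mixup) in PyTorch; model.features, model.classify, and model.rot_head are hypothetical module hooks, and mixing at one fixed layer simplifies manifold mixup, which in full samples the mixing layer at random.

    import numpy as np
    import torch
    import torch.nn.functional as F

    def rotation_loss(model, x):
        # Self-supervision: predict which of four rotations was applied.
        xs = torch.cat([torch.rot90(x, k, dims=(2, 3)) for k in range(4)])
        ys = torch.arange(4, device=x.device).repeat_interleave(x.shape[0])
        return F.cross_entropy(model.rot_head(model.features(xs)), ys)

    def manifold_mixup_loss(model, x, y, alpha=2.0):
        # Mix hidden features (not pixels) and supervise with mixed labels.
        lam = float(np.random.beta(alpha, alpha))
        idx = torch.randperm(x.shape[0])
        h = model.features(x)
        h_mix = lam * h + (1 - lam) * h[idx]
        logits = model.classify(h_mix)
        return lam * F.cross_entropy(logits, y) + (1 - lam) * F.cross_entropy(logits, y[idx])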