Showing 1–12 of 12 results for author: Singla, V

Searching in archive cs.
  1. arXiv:2406.10328  [pdf, other]

    cs.CV cs.CL cs.LG

    From Pixels to Prose: A Large Dataset of Dense Image Captions

    Authors: Vasu Singla, Kaiyu Yue, Sukriti Paul, Reza Shirkavand, Mayuka Jayawardhana, Alireza Ganjdanesh, Heng Huang, Abhinav Bhatele, Gowthami Somepalli, Tom Goldstein

    Abstract: Training large vision-language models requires extensive, high-quality image-text pairs. Existing web-scraped datasets, however, are noisy and lack detailed image descriptions. To bridge this gap, we introduce PixelProse, a comprehensive dataset of over 16M (million) synthetically generated captions, leveraging cutting-edge vision-language models for detailed and accurate descriptions. To ensure d…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: pixelprose 16M dataset

  2. arXiv:2406.10219  [pdf, other]

    cs.CV cs.GR

    PUP 3D-GS: Principled Uncertainty Pruning for 3D Gaussian Splatting

    Authors: Alex Hanson, Allen Tu, Vasu Singla, Mayuka Jayawardhana, Matthias Zwicker, Tom Goldstein

    Abstract: Recent advancements in novel view synthesis have enabled real-time rendering speeds and high reconstruction accuracy. 3D Gaussian Splatting (3D-GS), a foundational point-based parametric 3D scene representation, models scenes as large sets of 3D Gaussians. Complex scenes can comprise of millions of Gaussians, amounting to large storage and memory requirements that limit the viability of 3D-GS on d…

    Submitted 14 June, 2024; originally announced June 2024.

  3. arXiv:2311.03386  [pdf, other]

    cs.CV cs.LG

    A Simple and Efficient Baseline for Data Attribution on Images

    Authors: Vasu Singla, Pedro Sandoval-Segura, Micah Goldblum, Jonas Geiping, Tom Goldstein

    Abstract: Data attribution methods play a crucial role in understanding machine learning models, providing insight into which training data points are most responsible for model outputs during deployment. However, current state-of-the-art approaches require a large ensemble of as many as 300,000 models to accurately attribute model predictions. These approaches therefore come at a high computational cost, a…

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: Code available at https://github.com/vasusingla/simple-data-attribution

  4. arXiv:2305.20086  [pdf, other]

    cs.LG cs.CR cs.CV

    Understanding and Mitigating Copying in Diffusion Models

    Authors: Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, Tom Goldstein

    Abstract: Images generated by diffusion models like Stable Diffusion are increasingly widespread. Recent works and even lawsuits have shown that these models are prone to replicating their training data, unbeknownst to the user. In this paper, we first analyze this memorization problem in text-to-image diffusion models. While it is widely believed that duplicated images in the training set are responsible f…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: 17 pages, preprint. Code is available at https://github.com/somepago/DCR

  5. arXiv:2305.19254  [pdf, other]

    cs.LG cs.CR

    What Can We Learn from Unlearnable Datasets?

    Authors: Pedro Sandoval-Segura, Vasu Singla, Jonas Geiping, Micah Goldblum, Tom Goldstein

    Abstract: In an era of widespread web scraping, unlearnable dataset methods have the potential to protect data privacy by preventing deep neural networks from generalizing. But in addition to a number of practical limitations that make their use unlikely, we make a number of findings that call into question their ability to safeguard data. First, it is widely believed that neural networks trained on unlearn…

    Submitted 7 November, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted to NeurIPS 2023. Code available at https://github.com/psandovalsegura/learn-from-unlearnable

  6. arXiv:2212.03860  [pdf, other]

    cs.LG cs.CV cs.CY

    Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models

    Authors: Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Geiping, Tom Goldstein

    Abstract: Cutting-edge diffusion models produce images with high quality and customizability, enabling them to be used for commercial art and graphic design purposes. But do diffusion models create unique works of art, or are they replicating content directly from their training sets? In this work, we study image retrieval frameworks that enable us to compare generated images with training samples and detec…

    Submitted 12 December, 2022; v1 submitted 7 December, 2022; originally announced December 2022.

    Comments: Updated draft with the following changes (1) Clarified the LAION Aesthetics versions everywhere (2) Correction on which LAION Aesthetics version SD - 1.4 is finetuned on and updated figure 12 based on this (3) A section on possible causes of replication

  7. arXiv:2206.03693  [pdf, other]

    cs.LG cs.CR

    Autoregressive Perturbations for Data Poisoning

    Authors: Pedro Sandoval-Segura, Vasu Singla, Jonas Geiping, Micah Goldblum, Tom Goldstein, David W. Jacobs

    Abstract: The prevalence of data scraping from social media as a means to obtain datasets has led to growing concerns regarding unauthorized use of data. Data poisoning attacks have been proposed as a bulwark against scraping, as they make data "unlearnable" by adding small, imperceptible perturbations. Unfortunately, existing methods require knowledge of both the target architecture and the complete datase…

    Submitted 13 October, 2022; v1 submitted 8 June, 2022; originally announced June 2022.

    Comments: Accepted to NeurIPS 2022. Code available at https://github.com/psandovalsegura/autoregressive-poisoning

  8. arXiv:2204.08615  [pdf, other]

    cs.LG cs.CR

    Poisons that are learned faster are more effective

    Authors: Pedro Sandoval-Segura, Vasu Singla, Liam Fowl, Jonas Geiping, Micah Goldblum, David Jacobs, Tom Goldstein

    Abstract: Imperceptible poisoning attacks on entire datasets have recently been touted as methods for protecting data privacy. However, among a number of defenses preventing the practical use of these techniques, early-stopping stands out as a simple, yet effective defense. To gauge poisons' vulnerability to early-stopping, we benchmark error-minimizing, error-maximizing, and synthetic poisons in terms of p…

    Submitted 18 April, 2022; originally announced April 2022.

    Comments: 8 pages, 4 figures. Accepted to CVPR 2022 Art of Robustness Workshop

  9. arXiv:2103.02695  [pdf, other]

    cs.LG

    Shift Invariance Can Reduce Adversarial Robustness

    Authors: Songwei Ge, Vasu Singla, Ronen Basri, David Jacobs

    Abstract: Shift invariance is a critical property of CNNs that improves performance on classification. However, we show that invariance to circular shifts can also lead to greater sensitivity to adversarial attacks. We first characterize the margin between classes when a shift-invariant linear classifier is used. We show that the margin can only depend on the DC component of the signals. Then, using results…

    Submitted 22 November, 2021; v1 submitted 3 March, 2021; originally announced March 2021.

    Comments: Published as a conference paper at NeurIPS 2021

  10. arXiv:2102.07861  [pdf, other]

    cs.LG

    Low Curvature Activations Reduce Overfitting in Adversarial Training

    Authors: Vasu Singla, Sahil Singla, David Jacobs, Soheil Feizi

    Abstract: Adversarial training is one of the most effective defenses against adversarial attacks. Previous works suggest that overfitting is a dominant phenomenon in adversarial training leading to a large generalization gap between test and train accuracy in neural networks. In this work, we show that the observed generalization gap is closely related to the choice of the activation function. In particular…

    Submitted 18 August, 2021; v1 submitted 15 February, 2021; originally announced February 2021.

    Comments: Accepted at ICCV 2021

  11. arXiv:2007.09785  [pdf, other]

    cs.CV

    ASAP-NMS: Accelerating Non-Maximum Suppression Using Spatially Aware Priors

    Authors: Rohun Tripathi, Vasu Singla, Mahyar Najibi, Bharat Singh, Abhishek Sharma, Larry Davis

    Abstract: The widely adopted sequential variant of Non Maximum Suppression (or Greedy-NMS) is a crucial module for object-detection pipelines. Unfortunately, for the region proposal stage of two/multi-stage detectors, NMS is turning out to be a latency bottleneck due to its sequential nature. In this article, we carefully profile Greedy-NMS iterations to find that a major chunk of computation is wasted in c…

    Submitted 21 August, 2020; v1 submitted 19 July, 2020; originally announced July 2020.

    Comments: Under Review at CVIU

  12. arXiv:1307.2669  [pdf, ps, other]

    cs.IR

    Text Categorization via Similarity Search: An Efficient and Effective Novel Algorithm

    Authors: Hubert Haoyang Duan, Vladimir Pestov, Varun Singla

    Abstract: We present a supervised learning algorithm for text categorization which has brought the team of authors the 2nd place in the text categorization division of the 2012 Cybersecurity Data Mining Competition (CDMC'2012) and a 3rd prize overall. The algorithm is quite different from existing approaches in that it is based on similarity search in the metric space of measure distributions on the diction…

    Submitted 10 July, 2013; originally announced July 2013.

    Comments: 12 pages, 5 tables, accepted for the 6th International Conference on Similarity Search and Applications (SISAP 2013)