Showing 1–18 of 18 results for author: Daras, G

Searching in archive cs.
  1. arXiv:2406.11794  [pdf, other]

    cs.LG cs.CL

    DataComp-LM: In search of the next generation of training sets for language models

    Authors: Jeffrey Li, Alex Fang, Georgios Smyrnis, Maor Ivgi, Matt Jordan, Samir Gadre, Hritik Bansal, Etash Guha, Sedrick Keh, Kushal Arora, Saurabh Garg, Rui Xin, Niklas Muennighoff, Reinhard Heckel, Jean Mercat, Mayee Chen, Suchin Gururangan, Mitchell Wortsman, Alon Albalak, Yonatan Bitton, Marianna Nezhurina, Amro Abbas, Cheng-Yu Hsieh, Dhruba Ghosh, Josh Gardner, et al. (34 additional authors not shown)

    Abstract: We introduce DataComp for Language Models (DCLM), a testbed for controlled dataset experiments with the goal of improving language models. As part of DCLM, we provide a standardized corpus of 240T tokens extracted from Common Crawl, effective pretraining recipes based on the OpenLM framework, and a broad suite of 53 downstream evaluations. Participants in the DCLM benchmark can experiment with dat…

    Submitted 20 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Project page: https://www.datacomp.ai/dclm/

  2. arXiv:2404.10177  [pdf, other]

    cs.CV cs.AI cs.LG

    Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data

    Authors: Giannis Daras, Alexandros G. Dimakis, Constantinos Daskalakis

    Abstract: Ambient diffusion is a recently proposed framework for training diffusion models using corrupted data. Both Ambient Diffusion and alternative SURE-based approaches for learning diffusion models from corrupted data resort to approximations which deteriorate performance. We present the first framework for training diffusion models that provably sample from the uncorrupted distribution given only noi…

    Submitted 20 March, 2024; originally announced April 2024.

    Comments: Preprint, work in progress. 19 pages, 9 figures

  3. arXiv:2403.08728  [pdf, other]

    cs.CV cs.AI cs.LG

    Ambient Diffusion Posterior Sampling: Solving Inverse Problems with Diffusion Models trained on Corrupted Data

    Authors: Asad Aali, Giannis Daras, Brett Levac, Sidharth Kumar, Alexandros G. Dimakis, Jonathan I. Tamir

    Abstract: We provide a framework for solving inverse problems with diffusion models learned from linearly corrupted data. Our method, Ambient Diffusion Posterior Sampling (A-DPS), leverages a generative model pre-trained on one type of corruption (e.g. image inpainting) to perform posterior sampling conditioned on measurements from a potentially different forward process (e.g. image blurring). We test the e…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: Pre-print, work in progress

  4. arXiv:2307.00619  [pdf, other]

    cs.LG cs.AI stat.ML

    Solving Linear Inverse Problems Provably via Posterior Sampling with Latent Diffusion Models

    Authors: Litu Rout, Negin Raoof, Giannis Daras, Constantine Caramanis, Alexandros G. Dimakis, Sanjay Shakkottai

    Abstract: We present the first framework to solve linear inverse problems leveraging pre-trained latent diffusion models. Previously proposed algorithms (such as DPS and DDRM) only apply to pixel-space diffusion models. We theoretically analyze our algorithm showing provable sample recovery in a linear model setting. The algorithmic insight obtained from our analysis extends to more general settings often c…

    Submitted 2 July, 2023; originally announced July 2023.

    Comments: Preprint

  5. arXiv:2305.19256  [pdf, other]

    cs.LG cs.AI cs.CV cs.IT

    Ambient Diffusion: Learning Clean Distributions from Corrupted Data

    Authors: Giannis Daras, Kulin Shah, Yuval Dagan, Aravind Gollakota, Alexandros G. Dimakis, Adam Klivans

    Abstract: We present the first diffusion-based framework that can learn an unknown distribution using only highly-corrupted samples. This problem arises in scientific applications where access to uncorrupted samples is impossible or expensive to acquire. Another benefit of our approach is the ability to train generative models that are less likely to memorize individual training samples since they never obs…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: 24 pages, 11 figures

  6. arXiv:2304.14108  [pdf, other]

    cs.CV cs.CL cs.LG

    DataComp: In search of the next generation of multimodal datasets

    Authors: Samir Yitzhak Gadre, Gabriel Ilharco, Alex Fang, Jonathan Hayase, Georgios Smyrnis, Thao Nguyen, Ryan Marten, Mitchell Wortsman, Dhruba Ghosh, Jieyu Zhang, Eyal Orgad, Rahim Entezari, Giannis Daras, Sarah Pratt, Vivek Ramanujan, Yonatan Bitton, Kalyani Marathe, Stephen Mussmann, Richard Vencu, Mehdi Cherti, Ranjay Krishna, Pang Wei Koh, Olga Saukh, Alexander Ratner, Shuran Song, et al. (9 additional authors not shown)

    Abstract: Multimodal datasets are a critical component in recent breakthroughs such as Stable Diffusion and GPT-4, yet their design does not receive the same research attention as model architectures or training algorithms. To address this shortcoming in the ML ecosystem, we introduce DataComp, a testbed for dataset experiments centered around a new candidate pool of 12.8 billion image-text pairs from Commo…

    Submitted 20 October, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: NeurIPS 2023 Datasets and Benchmarks Track

  7. arXiv:2303.03384  [pdf, ps, other]

    cs.LG math.ST stat.ML

    Restoration-Degradation Beyond Linear Diffusions: A Non-Asymptotic Analysis For DDIM-Type Samplers

    Authors: Sitan Chen, Giannis Daras, Alexandros G. Dimakis

    Abstract: We develop a framework for non-asymptotic analysis of deterministic samplers used for diffusion generative modeling. Several recent works have analyzed stochastic samplers using tools like Girsanov's theorem and a chain rule variant of the interpolation argument. Unfortunately, these techniques give vacuous bounds when applied to deterministic samplers. We give a new operational interpretation for…

    Submitted 6 March, 2023; originally announced March 2023.

    Comments: 29 pages

  8. arXiv:2302.09057  [pdf, other]

    cs.LG cs.AI cs.CV cs.IT

    Consistent Diffusion Models: Mitigating Sampling Drift by Learning to be Consistent

    Authors: Giannis Daras, Yuval Dagan, Alexandros G. Dimakis, Constantinos Daskalakis

    Abstract: Imperfect score-matching leads to a shift between the training and the sampling distribution of diffusion models. Due to the recursive nature of the generation process, errors in previous steps yield sampling iterates that drift away from the training distribution. Yet, the standard training objective via Denoising Score Matching (DSM) is only designed to optimize over non-drifted data. To train o…

    Submitted 17 February, 2023; originally announced February 2023.

    Comments: 29 pages, 8 figures

  9. arXiv:2211.17115  [pdf, other]

    cs.CV cs.AI cs.LG

    Multiresolution Textual Inversion

    Authors: Giannis Daras, Alexandros G. Dimakis

    Abstract: We extend Textual Inversion to learn pseudo-words that represent a concept at different resolutions. This allows us to generate images that use the concept with different levels of detail and also to manipulate different resolutions using language. Once learned, the user can generate images at different levels of agreement to the original concept; "A photo of $S^*(0)$" produces the exact object wh…

    Submitted 30 November, 2022; originally announced November 2022.

    Comments: Accepted at NeurIPS 2022 Workshop on Score-Based Methods. 5 pages, 4 Figures, work in progress

  10. arXiv:2210.11618  [pdf, other]

    cs.LG cs.AI cs.CL

    Multitasking Models are Robust to Structural Failure: A Neural Model for Bilingual Cognitive Reserve

    Authors: Giannis Daras, Negin Raoof, Zoi Gkalitsiou, Alexandros G. Dimakis

    Abstract: We find a surprising connection between multitask learning and robustness to neuron failures. Our experiments show that bilingual language models retain higher performance under various neuron perturbations, such as random deletions, magnitude pruning and weight noise compared to equivalent monolingual ones. We provide a theoretical justification for this robustness by mathematically analyzing lin…

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: Accepted at NeurIPS 2022. 22 pages, 11 Figures

  11. arXiv:2209.05442  [pdf, other]

    cs.CV cs.AI cs.LG

    Soft Diffusion: Score Matching for General Corruptions

    Authors: Giannis Daras, Mauricio Delbracio, Hossein Talebi, Alexandros G. Dimakis, Peyman Milanfar

    Abstract: We define a broader family of corruption processes that generalizes previously known diffusion models. To reverse these general diffusions, we propose a new objective called Soft Score Matching that provably learns the score function for any linear corruption process and yields state of the art results for CelebA. Soft Score Matching incorporates the degradation process in the network. Our new los…

    Submitted 4 October, 2022; v1 submitted 12 September, 2022; originally announced September 2022.

    Comments: 21 pages, 12 figures, work in progress

  12. arXiv:2206.09104  [pdf, other]

    cs.LG cs.AI

    Score-Guided Intermediate Layer Optimization: Fast Langevin Mixing for Inverse Problems

    Authors: Giannis Daras, Yuval Dagan, Alexandros G. Dimakis, Constantinos Daskalakis

    Abstract: We prove fast mixing and characterize the stationary distribution of the Langevin Algorithm for inverting random weighted DNN generators. This result extends the work of Hand and Voroninski from efficient inversion to efficient posterior sampling. In practice, to allow for increased expressivity, we propose to do posterior sampling in the latent space of a pre-trained generative model. To achieve…

    Submitted 22 June, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: Accepted to ICML 2022. 32 pages, 9 Figures

  13. arXiv:2206.00169  [pdf, other]

    cs.LG cs.CL cs.CR cs.CV

    Discovering the Hidden Vocabulary of DALLE-2

    Authors: Giannis Daras, Alexandros G. Dimakis

    Abstract: We discover that DALLE-2 seems to have a hidden vocabulary that can be used to generate images with absurd prompts. For example, it seems that \texttt{Apoploe vesrreaitais} means birds and \texttt{Contarra ccetnxniams luryca tanniounons} (sometimes) means bugs or pests. We find that these prompts are often consistent in isolation but also sometimes in combinations. We present our black-box method…

    Submitted 31 May, 2022; originally announced June 2022.

    Comments: 6 pages, 4 figures

  14. arXiv:2112.09061  [pdf, other]

    cs.CV cs.AI cs.LG

    Solving Inverse Problems with NerfGANs

    Authors: Giannis Daras, Wen-Sheng Chu, Abhishek Kumar, Dmitry Lagun, Alexandros G. Dimakis

    Abstract: We introduce a novel framework for solving inverse problems using NeRF-style generative models. We are interested in the problem of 3-D scene reconstruction given a single 2-D image and known camera parameters. We show that naively optimizing the latent space leads to artifacts and poor novel view rendering. We attribute this problem to volume obstructions that are clear in the 3-D geometry and be…

    Submitted 16 December, 2021; originally announced December 2021.

    Comments: 16 pages, 18 figures

  15. arXiv:2108.01368  [pdf, other]

    cs.LG cs.CV cs.IT stat.ML

    Robust Compressed Sensing MRI with Deep Generative Priors

    Authors: Ajil Jalal, Marius Arvinte, Giannis Daras, Eric Price, Alexandros G. Dimakis, Jonathan I. Tamir

    Abstract: The CSGM framework (Bora-Jalal-Price-Dimakis'17) has shown that deep generative priors can be powerful tools for solving inverse problems. However, to date this framework has been empirically successful only on certain datasets (for example, human faces and MNIST digits), and it is known to perform poorly on out-of-distribution samples. In this paper, we present the first successful application of…

    Submitted 6 December, 2021; v1 submitted 3 August, 2021; originally announced August 2021.

  16. arXiv:2102.07364  [pdf, other]

    cs.LG

    Intermediate Layer Optimization for Inverse Problems using Deep Generative Models

    Authors: Giannis Daras, Joseph Dean, Ajil Jalal, Alexandros G. Dimakis

    Abstract: We propose Intermediate Layer Optimization (ILO), a novel optimization algorithm for solving inverse problems with deep generative models. Instead of optimizing only over the initial latent code, we progressively change the input layer obtaining successively more expressive generators. To explore the higher dimensional spaces, our method searches for latent codes that lie within a small $l_1$ ball…

    Submitted 15 February, 2021; originally announced February 2021.

  17. arXiv:2010.05315  [pdf, other]

    cs.LG

    SMYRF: Efficient Attention using Asymmetric Clustering

    Authors: Giannis Daras, Nikita Kitaev, Augustus Odena, Alexandros G. Dimakis

    Abstract: We propose a novel type of balanced clustering algorithm to approximate attention. Attention complexity is reduced from $O(N^2)$ to $O(N \log N)$, where $N$ is the sequence length. Our algorithm, SMYRF, uses Locality Sensitive Hashing (LSH) in a novel way by defining new Asymmetric transformations and an adaptive scheme that produces balanced clusters. The biggest advantage of SMYRF is that it can…

    Submitted 11 October, 2020; originally announced October 2020.

    Comments: 30 pages, 10 figures

  18. arXiv:1911.12287  [pdf, other]

    cs.LG cs.CV stat.ML

    Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models

    Authors: Giannis Daras, Augustus Odena, Han Zhang, Alexandros G. Dimakis

    Abstract: We introduce a new local sparse attention layer that preserves two-dimensional geometry and locality. We show that by just replacing the dense attention layer of SAGAN with our construction, we obtain very significant FID, Inception score and pure visual improvements. FID score is improved from $18.65$ to $15.94$ on ImageNet, keeping all other parameters the same. The sparse attention patterns tha…

    Submitted 2 December, 2019; v1 submitted 27 November, 2019; originally announced November 2019.

    Comments: Added TFRC, tensorflow-gan acknowledgements. Changed "Ablation Study" to "Ablation Studies"