
Showing 1–50 of 137 results for author: Dimakis, A G

Searching in archive cs.
  1. arXiv:2406.11794  [pdf, other]

    cs.LG cs.CL

    DataComp-LM: In search of the next generation of training sets for language models

    Authors: Jeffrey Li, Alex Fang, Georgios Smyrnis, Maor Ivgi, Matt Jordan, Samir Gadre, Hritik Bansal, Etash Guha, Sedrick Keh, Kushal Arora, Saurabh Garg, Rui Xin, Niklas Muennighoff, Reinhard Heckel, Jean Mercat, Mayee Chen, Suchin Gururangan, Mitchell Wortsman, Alon Albalak, Yonatan Bitton, Marianna Nezhurina, Amro Abbas, Cheng-Yu Hsieh, Dhruba Ghosh, Josh Gardner, et al. (34 additional authors not shown)

    Abstract: We introduce DataComp for Language Models (DCLM), a testbed for controlled dataset experiments with the goal of improving language models. As part of DCLM, we provide a standardized corpus of 240T tokens extracted from Common Crawl, effective pretraining recipes based on the OpenLM framework, and a broad suite of 53 downstream evaluations. Participants in the DCLM benchmark can experiment with dat…

    Submitted 20 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Project page: https://www.datacomp.ai/dclm/

  2. arXiv:2404.10917  [pdf, other]

    cs.CL

    Which questions should I answer? Salience Prediction of Inquisitive Questions

    Authors: Yating Wu, Ritika Mangla, Alexandros G. Dimakis, Greg Durrett, Junyi Jessy Li

    Abstract: Inquisitive questions -- open-ended, curiosity-driven questions people ask as they read -- are an integral part of discourse processing (Kehler and Rohde, 2017; Onea, 2016) and comprehension (Prince, 2004). Recent work in NLP has taken advantage of question generation capabilities of LLMs to enhance a wide range of applications. But the space of inquisitive questions is vast: many questions can be…

    Submitted 16 April, 2024; originally announced April 2024.

  3. arXiv:2404.10177  [pdf, other]

    cs.CV cs.AI cs.LG

    Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data

    Authors: Giannis Daras, Alexandros G. Dimakis, Constantinos Daskalakis

    Abstract: Ambient diffusion is a recently proposed framework for training diffusion models using corrupted data. Both Ambient Diffusion and alternative SURE-based approaches for learning diffusion models from corrupted data resort to approximations which deteriorate performance. We present the first framework for training diffusion models that provably sample from the uncorrupted distribution given only noi…

    Submitted 22 July, 2024; v1 submitted 20 March, 2024; originally announced April 2024.

    Comments: Accepted to ICML 2024

  4. arXiv:2404.08634  [pdf, other]

    cs.CL cs.AI cs.LG

    Pre-training Small Base LMs with Fewer Tokens

    Authors: Sunny Sanyal, Sujay Sanghavi, Alexandros G. Dimakis

    Abstract: We study the effectiveness of a simple approach to develop a small base language model (LM) starting from an existing large base LM: first inherit a few transformer blocks from the larger LM, and then train this smaller model on a very small subset (0.1%) of the raw pretraining data of the larger model. We call our simple recipe Inheritune and first demonstrate it for building a small base LM wit…

    Submitted 12 April, 2024; originally announced April 2024.

    Comments: 15 pages, 6 figures, 10 tables

  5. arXiv:2403.08728  [pdf, other]

    cs.CV cs.AI cs.LG

    Ambient Diffusion Posterior Sampling: Solving Inverse Problems with Diffusion Models trained on Corrupted Data

    Authors: Asad Aali, Giannis Daras, Brett Levac, Sidharth Kumar, Alexandros G. Dimakis, Jonathan I. Tamir

    Abstract: We provide a framework for solving inverse problems with diffusion models learned from linearly corrupted data. Our method, Ambient Diffusion Posterior Sampling (A-DPS), leverages a generative model pre-trained on one type of corruption (e.g. image inpainting) to perform posterior sampling conditioned on measurements from a potentially different forward process (e.g. image blurring). We test the e…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: Pre-print, work in progress

  6. arXiv:2403.08540  [pdf, other]

    cs.CL cs.LG

    Language models scale reliably with over-training and on downstream tasks

    Authors: Samir Yitzhak Gadre, Georgios Smyrnis, Vaishaal Shankar, Suchin Gururangan, Mitchell Wortsman, Rulin Shao, Jean Mercat, Alex Fang, Jeffrey Li, Sedrick Keh, Rui Xin, Marianna Nezhurina, Igor Vasiljevic, Jenia Jitsev, Luca Soldaini, Alexandros G. Dimakis, Gabriel Ilharco, Pang Wei Koh, Shuran Song, Thomas Kollar, Yair Carmon, Achal Dave, Reinhard Heckel, Niklas Muennighoff, Ludwig Schmidt

    Abstract: Scaling laws are useful guides for derisking expensive training runs, as they predict performance of large models using cheaper, small-scale experiments. However, there remain gaps between current scaling studies and how language models are ultimately trained and evaluated. For instance, scaling is usually studied in the compute-optimal training regime (i.e., "Chinchilla optimal" regime). In contr…

    Submitted 14 June, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  7. arXiv:2307.00619  [pdf, other]

    cs.LG cs.AI stat.ML

    Solving Linear Inverse Problems Provably via Posterior Sampling with Latent Diffusion Models

    Authors: Litu Rout, Negin Raoof, Giannis Daras, Constantine Caramanis, Alexandros G. Dimakis, Sanjay Shakkottai

    Abstract: We present the first framework to solve linear inverse problems leveraging pre-trained latent diffusion models. Previously proposed algorithms (such as DPS and DDRM) only apply to pixel-space diffusion models. We theoretically analyze our algorithm showing provable sample recovery in a linear model setting. The algorithmic insight obtained from our analysis extends to more general settings often c…

    Submitted 2 July, 2023; originally announced July 2023.

    Comments: Preprint

  8. arXiv:2306.04001  [pdf, other]

    cs.LG cs.AI eess.SP

    One-Dimensional Deep Image Prior for Curve Fitting of S-Parameters from Electromagnetic Solvers

    Authors: Sriram Ravula, Varun Gorti, Bo Deng, Swagato Chakraborty, James Pingenot, Bhyrav Mutnury, Doug Wallace, Doug Winterberg, Adam Klivans, Alexandros G. Dimakis

    Abstract: A key problem when modeling signal integrity for passive filters and interconnects in IC packages is the need for multiple S-parameter measurements within a desired frequency band to obtain adequate resolution. These samples are often computationally expensive to obtain using electromagnetic (EM) field solvers. Therefore, a common approach is to select a small subset of the necessary samples and u…

    Submitted 6 June, 2023; originally announced June 2023.

  9. arXiv:2306.03284  [pdf, other]

    cs.LG eess.IV

    Optimizing Sampling Patterns for Compressed Sensing MRI with Diffusion Generative Models

    Authors: Sriram Ravula, Brett Levac, Ajil Jalal, Jonathan I. Tamir, Alexandros G. Dimakis

    Abstract: Diffusion-based generative models have been used as powerful priors for magnetic resonance imaging (MRI) reconstruction. We present a learning method to optimize sub-sampling patterns for compressed sensing multi-coil MRI that leverages pre-trained diffusion generative models. Crucially, during training we use a single-step reconstruction based on the posterior mean estimate given by the diffusion…

    Submitted 5 June, 2023; originally announced June 2023.

  10. arXiv:2305.19256  [pdf, other]

    cs.LG cs.AI cs.CV cs.IT

    Ambient Diffusion: Learning Clean Distributions from Corrupted Data

    Authors: Giannis Daras, Kulin Shah, Yuval Dagan, Aravind Gollakota, Alexandros G. Dimakis, Adam Klivans

    Abstract: We present the first diffusion-based framework that can learn an unknown distribution using only highly-corrupted samples. This problem arises in scientific applications where access to uncorrupted samples is impossible or expensive to acquire. Another benefit of our approach is the ability to train generative models that are less likely to memorize individual training samples since they never obs…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: 24 pages, 11 figures

  11. arXiv:2303.03384  [pdf, ps, other]

    cs.LG math.ST stat.ML

    Restoration-Degradation Beyond Linear Diffusions: A Non-Asymptotic Analysis For DDIM-Type Samplers

    Authors: Sitan Chen, Giannis Daras, Alexandros G. Dimakis

    Abstract: We develop a framework for non-asymptotic analysis of deterministic samplers used for diffusion generative modeling. Several recent works have analyzed stochastic samplers using tools like Girsanov's theorem and a chain rule variant of the interpolation argument. Unfortunately, these techniques give vacuous bounds when applied to deterministic samplers. We give a new operational interpretation for…

    Submitted 6 March, 2023; originally announced March 2023.

    Comments: 29 pages

  12. arXiv:2302.09057  [pdf, other]

    cs.LG cs.AI cs.CV cs.IT

    Consistent Diffusion Models: Mitigating Sampling Drift by Learning to be Consistent

    Authors: Giannis Daras, Yuval Dagan, Alexandros G. Dimakis, Constantinos Daskalakis

    Abstract: Imperfect score-matching leads to a shift between the training and the sampling distribution of diffusion models. Due to the recursive nature of the generation process, errors in previous steps yield sampling iterates that drift away from the training distribution. Yet, the standard training objective via Denoising Score Matching (DSM) is only designed to optimize over non-drifted data. To train o…

    Submitted 17 February, 2023; originally announced February 2023.

    Comments: 29 pages, 8 figures

  13. arXiv:2211.17115  [pdf, other]

    cs.CV cs.AI cs.LG

    Multiresolution Textual Inversion

    Authors: Giannis Daras, Alexandros G. Dimakis

    Abstract: We extend Textual Inversion to learn pseudo-words that represent a concept at different resolutions. This allows us to generate images that use the concept with different levels of detail and also to manipulate different resolutions using language. Once learned, the user can generate images at different levels of agreement to the original concept; "A photo of $S^*(0)$" produces the exact object wh…

    Submitted 30 November, 2022; originally announced November 2022.

    Comments: Accepted at NeurIPS 2022 Workshop on Score-Based Methods. 5 pages, 4 Figures, work in progress

  14. arXiv:2210.11618  [pdf, other]

    cs.LG cs.AI cs.CL

    Multitasking Models are Robust to Structural Failure: A Neural Model for Bilingual Cognitive Reserve

    Authors: Giannis Daras, Negin Raoof, Zoi Gkalitsiou, Alexandros G. Dimakis

    Abstract: We find a surprising connection between multitask learning and robustness to neuron failures. Our experiments show that bilingual language models retain higher performance under various neuron perturbations, such as random deletions, magnitude pruning and weight noise compared to equivalent monolingual ones. We provide a theoretical justification for this robustness by mathematically analyzing lin…

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: Accepted at NeurIPS 2022. 22 pages, 11 Figures

  15. arXiv:2210.08069  [pdf, ps, other]

    cs.LG stat.ML

    Zonotope Domains for Lagrangian Neural Network Verification

    Authors: Matt Jordan, Jonathan Hayase, Alexandros G. Dimakis, Sewoong Oh

    Abstract: Neural network verification aims to provide provable bounds for the output of a neural network for a given input range. Notable prior works in this domain have either generated bounds using abstract domains, which preserve some dependency between intermediate neurons in the network; or framed verification as an optimization problem and solved a relaxation using Lagrangian methods. A key drawback o…

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: Accepted into NeurIPS 2022. Code: https://github.com/revbucket/dual-verification

  16. arXiv:2209.05442  [pdf, other]

    cs.CV cs.AI cs.LG

    Soft Diffusion: Score Matching for General Corruptions

    Authors: Giannis Daras, Mauricio Delbracio, Hossein Talebi, Alexandros G. Dimakis, Peyman Milanfar

    Abstract: We define a broader family of corruption processes that generalizes previously known diffusion models. To reverse these general diffusions, we propose a new objective called Soft Score Matching that provably learns the score function for any linear corruption process and yields state of the art results for CelebA. Soft Score Matching incorporates the degradation process in the network. Our new los…

    Submitted 4 October, 2022; v1 submitted 12 September, 2022; originally announced September 2022.

    Comments: 21 pages, 12 figures, work in progress

  17. arXiv:2206.09104  [pdf, other]

    cs.LG cs.AI

    Score-Guided Intermediate Layer Optimization: Fast Langevin Mixing for Inverse Problems

    Authors: Giannis Daras, Yuval Dagan, Alexandros G. Dimakis, Constantinos Daskalakis

    Abstract: We prove fast mixing and characterize the stationary distribution of the Langevin Algorithm for inverting random weighted DNN generators. This result extends the work of Hand and Voroninski from efficient inversion to efficient posterior sampling. In practice, to allow for increased expressivity, we propose to do posterior sampling in the latent space of a pre-trained generative model. To achieve…

    Submitted 22 June, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: Accepted to ICML 2022. 32 pages, 9 Figures

  18. arXiv:2206.00169  [pdf, other]

    cs.LG cs.CL cs.CR cs.CV

    Discovering the Hidden Vocabulary of DALLE-2

    Authors: Giannis Daras, Alexandros G. Dimakis

    Abstract: We discover that DALLE-2 seems to have a hidden vocabulary that can be used to generate images with absurd prompts. For example, it seems that \texttt{Apoploe vesrreaitais} means birds and \texttt{Contarra ccetnxniams luryca tanniounons} (sometimes) means bugs or pests. We find that these prompts are often consistent in isolation but also sometimes in combinations. We present our black-box method…

    Submitted 31 May, 2022; originally announced June 2022.

    Comments: 6 pages, 4 figures

  19. arXiv:2112.09061  [pdf, other]

    cs.CV cs.AI cs.LG

    Solving Inverse Problems with NerfGANs

    Authors: Giannis Daras, Wen-Sheng Chu, Abhishek Kumar, Dmitry Lagun, Alexandros G. Dimakis

    Abstract: We introduce a novel framework for solving inverse problems using NeRF-style generative models. We are interested in the problem of 3-D scene reconstruction given a single 2-D image and known camera parameters. We show that naively optimizing the latent space leads to artifacts and poor novel view rendering. We attribute this problem to volume obstructions that are clear in the 3-D geometry and be…

    Submitted 16 December, 2021; originally announced December 2021.

    Comments: 16 pages, 18 figures

  20. arXiv:2112.02475  [pdf, other]

    cs.CV eess.IV

    Deblurring via Stochastic Refinement

    Authors: Jay Whang, Mauricio Delbracio, Hossein Talebi, Chitwan Saharia, Alexandros G. Dimakis, Peyman Milanfar

    Abstract: Image deblurring is an ill-posed problem with multiple plausible solutions for a given input image. However, most existing methods produce a deterministic estimate of the clean image and are trained to minimize pixel-level distortion. These metrics are known to be poorly correlated with human perception, and often lead to unrealistic reconstructions. We present an alternative framework for blind d…

    Submitted 28 December, 2021; v1 submitted 4 December, 2021; originally announced December 2021.

  21. arXiv:2110.07439  [pdf, other]

    cs.LG cs.CV

    Inverse Problems Leveraging Pre-trained Contrastive Representations

    Authors: Sriram Ravula, Georgios Smyrnis, Matt Jordan, Alexandros G. Dimakis

    Abstract: We study a new family of inverse problems for recovering representations of corrupted data. We assume access to a pre-trained representation learning network R(x) that operates on clean images, like CLIP. The problem is to recover the representation of an image R(x), if we are only given a corrupted version A(x), for some known forward operator A. We propose a supervised inversion method that uses…

    Submitted 26 October, 2021; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: Initial version. Final version to appear in Thirty-fifth Conference on Neural Information Processing Systems (NeurIPS 2021)

  22. arXiv:2108.01368  [pdf, other]

    cs.LG cs.CV cs.IT stat.ML

    Robust Compressed Sensing MRI with Deep Generative Priors

    Authors: Ajil Jalal, Marius Arvinte, Giannis Daras, Eric Price, Alexandros G. Dimakis, Jonathan I. Tamir

    Abstract: The CSGM framework (Bora-Jalal-Price-Dimakis'17) has shown that deep generative priors can be powerful tools for solving inverse problems. However, to date this framework has been empirically successful only on certain datasets (for example, human faces and MNIST digits), and it is known to perform poorly on out-of-distribution samples. In this paper, we present the first successful application of…

    Submitted 6 December, 2021; v1 submitted 3 August, 2021; originally announced August 2021.

  23. arXiv:2107.02732  [pdf, other]

    cs.LG stat.ML

    Provable Lipschitz Certification for Generative Models

    Authors: Matt Jordan, Alexandros G. Dimakis

    Abstract: We present a scalable technique for upper bounding the Lipschitz constant of generative models. We relate this quantity to the maximal norm over the set of attainable vector-Jacobian products of a given generative model. We approximate this set by layerwise convex approximations using zonotopes. Our approach generalizes and improves upon prior work using zonotope transformers and we extend to Lips…

    Submitted 6 July, 2021; originally announced July 2021.

    Comments: Accepted into ICML 2021

  24. arXiv:2106.12182  [pdf, other]

    cs.LG cs.CV stat.ML

    Fairness for Image Generation with Uncertain Sensitive Attributes

    Authors: Ajil Jalal, Sushrut Karmalkar, Jessica Hoffmann, Alexandros G. Dimakis, Eric Price

    Abstract: This work tackles the issue of fairness in the context of generative procedures, such as image super-resolution, which entail different definitions from the standard classification setting. Moreover, while traditional group fairness definitions are typically defined with respect to specified protected groups -- camouflaging the fact that these groupings are artificial and carry historical and poli…

    Submitted 2 July, 2021; v1 submitted 23 June, 2021; originally announced June 2021.

  25. arXiv:2106.11438  [pdf, other]

    cs.LG cs.IT stat.ML

    Instance-Optimal Compressed Sensing via Posterior Sampling

    Authors: Ajil Jalal, Sushrut Karmalkar, Alexandros G. Dimakis, Eric Price

    Abstract: We characterize the measurement complexity of compressed sensing of signals drawn from a known prior distribution, even when the support of the prior is the entire space (rather than, say, sparse vectors). We show for Gaussian measurements and any prior distribution on the signal, that the posterior sampling estimator achieves near-optimal recovery guarantees. Moreover, this result is robus…

    Submitted 21 June, 2021; originally announced June 2021.

  26. arXiv:2106.02797  [pdf, other]

    cs.IT cs.LG

    Neural Distributed Source Coding

    Authors: Jay Whang, Alliot Nagle, Anish Acharya, Hyeji Kim, Alexandros G. Dimakis

    Abstract: Distributed source coding (DSC) is the task of encoding an input in the absence of correlated side information that is only available to the decoder. Remarkably, Slepian and Wolf showed in 1973 that an encoder without access to the side information can asymptotically achieve the same compression rate as when the side information is available to it. While there is vast prior work on this topic, pra…

    Submitted 1 July, 2024; v1 submitted 5 June, 2021; originally announced June 2021.

    Comments: To be published in JSAIT

  27. arXiv:2102.07364  [pdf, other]

    cs.LG

    Intermediate Layer Optimization for Inverse Problems using Deep Generative Models

    Authors: Giannis Daras, Joseph Dean, Ajil Jalal, Alexandros G. Dimakis

    Abstract: We propose Intermediate Layer Optimization (ILO), a novel optimization algorithm for solving inverse problems with deep generative models. Instead of optimizing only over the initial latent code, we progressively change the input layer obtaining successively more expressive generators. To explore the higher dimensional spaces, our method searches for latent codes that lie within a small $l_1$ ball…

    Submitted 15 February, 2021; originally announced February 2021.

  28. arXiv:2012.08405  [pdf, other]

    eess.SP cs.LG

    Model-Based Deep Learning

    Authors: Nir Shlezinger, Jay Whang, Yonina C. Eldar, Alexandros G. Dimakis

    Abstract: Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques. Such model-based methods utilize mathematical formulations that represent the underlying physics, prior information and additional domain knowledge. Simple classical models are useful but sensitive to inaccuracies and may lead to poor performance when real systems display complex…

    Submitted 11 September, 2022; v1 submitted 15 December, 2020; originally announced December 2020.

  29. arXiv:2010.05315  [pdf, other]

    cs.LG

    SMYRF: Efficient Attention using Asymmetric Clustering

    Authors: Giannis Daras, Nikita Kitaev, Augustus Odena, Alexandros G. Dimakis

    Abstract: We propose a novel type of balanced clustering algorithm to approximate attention. Attention complexity is reduced from $O(N^2)$ to $O(N \log N)$, where $N$ is the sequence length. Our algorithm, SMYRF, uses Locality Sensitive Hashing (LSH) in a novel way by defining new Asymmetric transformations and an adaptive scheme that produces balanced clusters. The biggest advantage of SMYRF is that it can…

    Submitted 11 October, 2020; originally announced October 2020.

    Comments: 30 pages, 10 figures

  30. arXiv:2006.09461  [pdf, other]

    stat.ML cs.IT cs.LG

    Robust Compressed Sensing using Generative Models

    Authors: Ajil Jalal, Liu Liu, Alexandros G. Dimakis, Constantine Caramanis

    Abstract: The goal of compressed sensing is to estimate a high dimensional vector from an underdetermined system of noisy linear equations. In analogy to classical compressed sensing, here we assume a generative model as a prior, that is, we assume the vector is represented by a deep generative model $G: \mathbb{R}^k \rightarrow \mathbb{R}^n$. Classical recovery approaches such as empirical risk minimizatio…

    Submitted 23 June, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

  31. arXiv:2005.06001  [pdf, other]

    eess.IV cs.LG stat.ML

    Deep Learning Techniques for Inverse Problems in Imaging

    Authors: Gregory Ongie, Ajil Jalal, Christopher A. Metzler, Richard G. Baraniuk, Alexandros G. Dimakis, Rebecca Willett

    Abstract: Recent work in machine learning shows that deep neural networks can be used to solve a wide variety of inverse problems arising in computational imaging. We explore the central prevailing themes of this emerging area and present a taxonomy that can be used to categorize different problems and reconstruction methods. Our taxonomy is organized along two central axes: (1) whether or not a forward mod…

    Submitted 12 May, 2020; originally announced May 2020.

  32. arXiv:2003.08089  [pdf, other]

    cs.LG cs.IT stat.ML

    Solving Inverse Problems with a Flow-based Noise Model

    Authors: Jay Whang, Qi Lei, Alexandros G. Dimakis

    Abstract: We study image inverse problems with a normalizing flow prior. Our formulation views the solution as the maximum a posteriori estimate of the image conditioned on the measurements. This formulation allows us to use noise models with arbitrary dependencies as well as non-linear forward operators. We empirically validate the efficacy of our method on various inverse problems, including compressed se…

    Submitted 1 July, 2021; v1 submitted 18 March, 2020; originally announced March 2020.

  33. arXiv:2003.01219  [pdf, other]

    stat.ML cs.LG

    Exactly Computing the Local Lipschitz Constant of ReLU Networks

    Authors: Matt Jordan, Alexandros G. Dimakis

    Abstract: The local Lipschitz constant of a neural network is a useful metric with applications in robustness, generalization, and fairness evaluation. We provide novel analytic results relating the local Lipschitz constant of nonsmooth vector-valued functions to a maximization over the norm of the generalized Jacobian. We present a sufficient condition for which backpropagation always returns an element of…

    Submitted 10 January, 2021; v1 submitted 2 March, 2020; originally announced March 2020.

    Comments: Accepted into NeurIPS 2020. Code: https://github.com/revbucket/lipMIP

  34. arXiv:2002.11743  [pdf, other]

    stat.ML cs.IT cs.LG

    Composing Normalizing Flows for Inverse Problems

    Authors: Jay Whang, Erik M. Lindgren, Alexandros G. Dimakis

    Abstract: Given an inverse problem with a normalizing flow prior, we wish to estimate the distribution of the underlying signal conditioned on the observations. We approach this problem as a task of conditional inference on the pre-trained unconditional flow model. We first establish that this is computationally hard for a large class of flow models. Motivated by this, we propose a framework for approximate…

    Submitted 14 June, 2021; v1 submitted 26 February, 2020; originally announced February 2020.

  35. arXiv:1911.12287  [pdf, other]

    cs.LG cs.CV stat.ML

    Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models

    Authors: Giannis Daras, Augustus Odena, Han Zhang, Alexandros G. Dimakis

    Abstract: We introduce a new local sparse attention layer that preserves two-dimensional geometry and locality. We show that by just replacing the dense attention layer of SAGAN with our construction, we obtain very significant FID, Inception score and pure visual improvements. FID score is improved from $18.65$ to $15.94$ on ImageNet, keeping all other parameters the same. The sparse attention patterns tha…

    Submitted 2 December, 2019; v1 submitted 27 November, 2019; originally announced November 2019.

    Comments: Added TFRC, tensorflow-gan acknowledgements. Changed "Ablation Study" to "Ablation Studies"

  36. arXiv:1910.07703  [pdf, other]

    cs.LG cs.DC math.NA stat.ML

    Communication-Efficient Asynchronous Stochastic Frank-Wolfe over Nuclear-norm Balls

    Authors: Jiacheng Zhuo, Qi Lei, Alexandros G. Dimakis, Constantine Caramanis

    Abstract: Large-scale machine learning training suffers from two prior challenges, specifically for nuclear-norm constrained problems with distributed systems: the synchronization slowdown due to the straggling workers, and high communication costs. In this work, we propose an asynchronous Stochastic Frank Wolfe (SFW-asyn) method, which, for the first time, solves the two problems simultaneously, while succ…

    Submitted 17 October, 2019; originally announced October 2019.

  37. arXiv:1910.07030  [pdf, other]

    cs.LG stat.ML

    SGD Learns One-Layer Networks in WGANs

    Authors: Qi Lei, Jason D. Lee, Alexandros G. Dimakis, Constantinos Daskalakis

    Abstract: Generative adversarial networks (GANs) are a widely used framework for learning generative models. Wasserstein GANs (WGANs), one of the most successful variants of GANs, require solving a minmax optimization problem to global optimality, but are in practice successfully trained using stochastic gradient descent-ascent. In this paper, we show that, when the generator is a one-layer network, stochas…

    Submitted 1 July, 2020; v1 submitted 15 October, 2019; originally announced October 2019.

    Comments: 24 pages, 4 figures, ICML2020

  38. arXiv:1909.01812  [pdf, other]

    cs.LG cs.DS math.ST stat.ML

    Learning Distributions Generated by One-Layer ReLU Networks

    Authors: Shanshan Wu, Alexandros G. Dimakis, Sujay Sanghavi

    Abstract: We consider the problem of estimating the parameters of a $d$-dimensional rectified Gaussian distribution from i.i.d. samples. A rectified Gaussian distribution is defined by passing a standard Gaussian distribution through a one-layer ReLU neural network. We give a simple algorithm to estimate the parameters (i.e., the weight matrix and bias vector of the ReLU neural network) up to an error…

    Submitted 19 September, 2019; v1 submitted 4 September, 2019; originally announced September 2019.

    Comments: NeurIPS 2019

  39. arXiv:1906.07437  [pdf, other]

    cs.LG stat.ML

    Inverting Deep Generative models, One layer at a time

    Authors: Qi Lei, Ajil Jalal, Inderjit S. Dhillon, Alexandros G. Dimakis

    Abstract: We study the problem of inverting a deep generative model with ReLU activations. Inversion corresponds to finding a latent code vector that explains observed measurements as much as possible. In most prior works this is performed by attempting to solve a non-convex optimization problem involving the generator. In this paper we obtain several novel theoretical results for the inversion problem. W…

    Submitted 19 June, 2019; v1 submitted 18 June, 2019; originally announced June 2019.

  40. arXiv:1906.02436  [pdf, other]

    cs.LG math.OC stat.ML

    Primal-Dual Block Frank-Wolfe

    Authors: Qi Lei, Jiacheng Zhuo, Constantine Caramanis, Inderjit S. Dhillon, Alexandros G. Dimakis

    Abstract: We propose a variant of the Frank-Wolfe algorithm for solving a class of sparse/low-rank optimization problems. Our formulation includes Elastic Net, regularized SVMs and phase retrieval as special cases. The proposed Primal-Dual Block Frank-Wolfe algorithm reduces the per-iteration cost while maintaining linear convergence rate. The per iteration cost of our method depends on the structural compl…

    Submitted 6 June, 2019; originally announced June 2019.

  41. arXiv:1904.08594  [pdf, other]

    cs.LG stat.ML

    One-dimensional Deep Image Prior for Time Series Inverse Problems

    Authors: Sriram Ravula, Alexandros G. Dimakis

    Abstract: We extend the Deep Image Prior (DIP) framework to one-dimensional signals. DIP is using a randomly initialized convolutional neural network (CNN) to solve linear inverse problems by optimizing over weights to fit the observed measurements. Our main finding is that properly tuned one-dimensional convolutional architectures provide an excellent Deep Image Prior for various types of temporal signals…

    Submitted 18 April, 2019; originally announced April 2019.

  42. arXiv:1903.08778  [pdf, other]

    cs.LG cs.CR stat.ML

    Provable Certificates for Adversarial Examples: Fitting a Ball in the Union of Polytopes

    Authors: Matt Jordan, Justin Lewis, Alexandros G. Dimakis

    Abstract: We propose a novel method for computing exact pointwise robustness of deep neural networks for all convex $\ell_p$ norms. Our algorithm, GeoCert, finds the largest $\ell_p$ ball centered at an input point $x_0$, within which the output class of a given neural network with ReLU nonlinearities remains unchanged. We relate the problem of computing pointwise robustness of these networks to that of com…

    Submitted 3 June, 2019; v1 submitted 20 March, 2019; originally announced March 2019.

    Comments: Code can be found here: https://github.com/revbucket/geometric-certificates

  43. arXiv:1902.08265  [pdf, other]

    stat.ML cs.LG

    Quantifying Perceptual Distortion of Adversarial Examples

    Authors: Matt Jordan, Naren Manoj, Surbhi Goel, Alexandros G. Dimakis

    Abstract: Recent work has shown that additive threat models, which only permit the addition of bounded noise to the pixels of an image, are insufficient for fully capturing the space of imperceivable adversarial examples. For example, small rotations and spatial transformations can fool classifiers, remain imperceivable to humans, but have large additive distance from the original images. In this work, we l…

    Submitted 21 February, 2019; originally announced February 2019.

    Comments: 18 pages, codebase/framework available at https://github.com/revbucket/mister_ed

  44. arXiv:1812.00151  [pdf, other]

    cs.LG cs.CR math.OC stat.ML

    Discrete Adversarial Attacks and Submodular Optimization with Applications to Text Classification

    Authors: Qi Lei, Lingfei Wu, Pin-Yu Chen, Alexandros G. Dimakis, Inderjit S. Dhillon, Michael Witbrock

    Abstract: Adversarial examples are carefully constructed modifications to an input that completely change the output of a classifier but are imperceptible to humans. Despite these successful attacks for continuous data (such as image and audio samples), generating adversarial examples for discrete structures such as text has proven significantly more challenging. In this paper we formulate the attacks with…

    Submitted 4 April, 2019; v1 submitted 1 December, 2018; originally announced December 2018.

    Comments: In SysML 2019

  45. arXiv:1811.10673  [pdf, other]

    eess.IV cs.CV

    Adversarial Video Compression Guided by Soft Edge Detection

    Authors: Sungsoo Kim, Jin Soo Park, Christos G. Bampis, Jaeseong Lee, Mia K. Markey, Alexandros G. Dimakis, Alan C. Bovik

    Abstract: We propose a video compression framework using conditional Generative Adversarial Networks (GANs). We rely on two encoders: one that deploys a standard video codec and another which generates low-level maps via a pipeline of down-sampling, a newly devised soft edge detector, and a novel lossless compression scheme. For decoding, we use a standard video decoder as well as a neural network based one…

    Submitted 26 November, 2018; originally announced November 2018.

  46. arXiv:1810.11905  [pdf, other]

    cs.LG cs.DS math.ST stat.ML

    Sparse Logistic Regression Learns All Discrete Pairwise Graphical Models

    Authors: Shanshan Wu, Sujay Sanghavi, Alexandros G. Dimakis

    Abstract: We characterize the effectiveness of a classical algorithm for recovering the Markov graph of a general discrete pairwise graphical model from i.i.d. samples. The algorithm is (appropriately regularized) maximum conditional log-likelihood, which involves solving a convex program for each node; for Ising models this is $\ell_1$-constrained logistic regression, while for more general alphabets an…

    Submitted 18 June, 2019; v1 submitted 28 October, 2018; originally announced October 2018.

    Comments: 30 pages, 3 figures

  47. arXiv:1810.11867  [pdf, other]

    cs.LG cs.DM stat.ML

    Experimental Design for Cost-Aware Learning of Causal Graphs

    Authors: Erik M. Lindgren, Murat Kocaoglu, Alexandros G. Dimakis, Sriram Vishwanath

    Abstract: We consider the minimum cost intervention design problem: Given the essential graph of a causal graph and a cost to intervene on a variable, identify the set of interventions with minimum total cost that can learn any causal graph with the given essential graph. We first show that this problem is NP-hard. We then prove that we can achieve a constant factor approximation to this problem with a gree…

    Submitted 28 October, 2018; originally announced October 2018.

    Comments: In NIPS 2018

  48. arXiv:1807.10399  [pdf, other]

    stat.ML cs.AI cs.IT cs.LG

    Applications of Common Entropy for Causal Inference

    Authors: Murat Kocaoglu, Sanjay Shakkottai, Alexandros G. Dimakis, Constantine Caramanis, Sriram Vishwanath

    Abstract: We study the problem of discovering the simplest latent variable that can make two observed discrete variables conditionally independent. The minimum entropy required for such a latent is known as common entropy in information theory. We extend this notion to Renyi common entropy by minimizing the Renyi entropy of the latent variable. To efficiently compute common entropy, we propose an iterative…

    Submitted 5 December, 2020; v1 submitted 26 July, 2018; originally announced July 2018.

    Comments: In Proceedings of NeurIPS 2020

  49. arXiv:1806.10175  [pdf, other]

    stat.ML cs.IT cs.LG

    Learning a Compressed Sensing Measurement Matrix via Gradient Unrolling

    Authors: Shanshan Wu, Alexandros G. Dimakis, Sujay Sanghavi, Felix X. Yu, Daniel Holtmann-Rice, Dmitry Storcheus, Afshin Rostamizadeh, Sanjiv Kumar

    Abstract: Linear encoding of sparse vectors is widely popular, but is commonly data-independent -- missing any possible extra (but a priori unknown) structure beyond sparsity. In this paper we present a new method to learn linear encoders that adapt to data, while still performing well with the widely used $\ell_1$ decoder. The convex $\ell_1$ decoder prevents gradient propagation as needed in standard grad…

    Submitted 2 July, 2019; v1 submitted 26 June, 2018; originally announced June 2018.

    Comments: 17 pages, 7 tables, 8 figures, published in ICML 2019; part of this work was done while Shanshan was an intern at Google Research, New York

  50. arXiv:1806.06438  [pdf, other]

    stat.ML cs.IT cs.LG

    Compressed Sensing with Deep Image Prior and Learned Regularization

    Authors: Dave Van Veen, Ajil Jalal, Mahdi Soltanolkotabi, Eric Price, Sriram Vishwanath, Alexandros G. Dimakis

    Abstract: We propose a novel method for compressed sensing recovery using untrained deep generative models. Our method is based on the recently proposed Deep Image Prior (DIP), wherein the convolutional weights of the network are optimized to match the observed measurements. We show that this approach can be applied to solve any differentiable linear inverse problem, outperforming previous unlearned methods…

    Submitted 29 October, 2020; v1 submitted 17 June, 2018; originally announced June 2018.