
Showing 1–27 of 27 results for author: Dieng, A

Searching in archive cs.
  1. arXiv:2406.14746  [pdf, other]

    cs.LG cs.RO

    Relational Reasoning On Graphs Using Opinion Dynamics

    Authors: Yulong Yang, Bowen Feng, Keqin Wang, Naomi Leonard, Adji Bousso Dieng, Christine Allen-Blanchette

    Abstract: From pedestrians to Kuramoto oscillators, interactions between agents govern how a multitude of dynamical systems evolve in space and time. Discovering how these agents relate to each other can improve our understanding of the often complex dynamics that underlie these systems. Recent works learn to categorize relationships between agents based on observations of their physical behavior. These app…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 14 pages, 7 figures

  2. arXiv:2406.00990  [pdf, other]

    cs.LG cs.RO

    Constraint-Aware Diffusion Models for Trajectory Optimization

    Authors: Anjian Li, Zihan Ding, Adji Bousso Dieng, Ryne Beeson

    Abstract: The diffusion model has shown success in generating high-quality and diverse solutions to trajectory optimization problems. However, diffusion models with neural networks inevitably make prediction errors, which leads to constraint violations such as unmet goals or collisions. This paper presents a novel constraint-aware diffusion model for trajectory optimization. We introduce a novel hybrid loss…

    Submitted 3 June, 2024; originally announced June 2024.

  3. arXiv:2405.11848  [pdf, other]

    stat.ML cs.AI cs.LG cs.NE physics.ao-ph q-bio.NC

    Alternators For Sequence Modeling

    Authors: Mohammad Reza Rezaei, Adji Bousso Dieng

    Abstract: This paper introduces alternators, a novel family of non-Markovian dynamical models for sequences. An alternator features two neural networks: the observation trajectory network (OTN) and the feature trajectory network (FTN). The OTN and the FTN work in conjunction, alternating between outputting samples in the observation space and some feature space, respectively, over a cycle. The parameters of…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: A new versatile family of sequence models that can be used for both generative modeling and supervised learning. The codebase will be made available upon publication. This paper is dedicated to Thomas Sankara

  4. arXiv:2405.02449  [pdf, other]

    stat.ML cond-mat.mtrl-sci cs.LG q-bio.BM

    Quality-Weighted Vendi Scores And Their Application To Diverse Experimental Design

    Authors: Quan Nguyen, Adji Bousso Dieng

    Abstract: Experimental design techniques such as active search and Bayesian optimization are widely used in the natural sciences for data collection and discovery. However, existing techniques tend to favor exploitation over exploration of the search space, which causes them to get stuck in local optima. This "collapse" problem prevents experimental design algorithms from yielding diverse high-quality data…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: Published in International Conference on Machine Learning, ICML 2024. Code can be found in the Vertaix GitHub: https://github.com/vertaix/Quality-Weighted-Vendi-Score. Paper dedicated to Kwame Nkrumah

  5. arXiv:2403.12025  [pdf, other]

    cs.CY cs.CL cs.LG

    A Toolbox for Surfacing Health Equity Harms and Biases in Large Language Models

    Authors: Stephen R. Pfohl, Heather Cole-Lewis, Rory Sayres, Darlene Neal, Mercy Asiedu, Awa Dieng, Nenad Tomasev, Qazi Mamunur Rashid, Shekoofeh Azizi, Negar Rostamzadeh, Liam G. McCoy, Leo Anthony Celi, Yun Liu, Mike Schaekermann, Alanna Walton, Alicia Parrish, Chirag Nagpal, Preeti Singh, Akeiylah Dewitt, Philip Mansfield, Sushant Prakash, Katherine Heller, Alan Karthikesalingam, Christopher Semturs, Joelle Barral, et al. (5 additional authors not shown)

    Abstract: Large language models (LLMs) hold immense promise to serve complex health information needs but also have the potential to introduce harm and exacerbate health disparities. Reliably evaluating equity-related model failures is a critical step toward developing systems that promote health equity. In this work, we present resources and methodologies for surfacing biases with potential to precipitate…

    Submitted 18 March, 2024; originally announced March 2024.

  6. arXiv:2403.05571  [pdf, other]

    cs.RO cs.LG

    Combining Constrained Diffusion Models and Numerical Solvers for Efficient and Robust Non-Convex Trajectory Optimization

    Authors: Anjian Li, Zihan Ding, Adji Bousso Dieng, Ryne Beeson

    Abstract: Motivated by the need to solve open-loop optimal control problems with computational efficiency and reliable constraint satisfaction, we introduce a general framework that combines diffusion models and numerical optimization solvers. Optimal control problems are rarely solvable in closed form, hence they are often transcribed into numerical trajectory optimization problems, which then require init…

    Submitted 26 May, 2024; v1 submitted 21 February, 2024; originally announced March 2024.

  7. arXiv:2403.03357  [pdf, other]

    cs.AI cs.CY

    The Case for Globalizing Fairness: A Mixed Methods Study on Colonialism, AI, and Health in Africa

    Authors: Mercy Asiedu, Awa Dieng, Iskandar Haykel, Negar Rostamzadeh, Stephen Pfohl, Chirag Nagpal, Maria Nagawa, Abigail Oppong, Sanmi Koyejo, Katherine Heller

    Abstract: With growing application of machine learning (ML) technologies in healthcare, there have been calls for developing techniques to understand and mitigate biases these systems may exhibit. Fairness considerations in the development of ML-based solutions for health have particular implications for Africa, which already faces inequitable power imbalances between the Global North and South. This paper…

    Submitted 11 March, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: 11 pages, 4 figures. arXiv admin note: text overlap with arXiv:2304.02190

  8. arXiv:2311.13028  [pdf, other]

    cs.LG cs.AI cs.DC eess.SP

    DMLR: Data-centric Machine Learning Research -- Past, Present and Future

    Authors: Luis Oala, Manil Maskey, Lilith Bat-Leah, Alicia Parrish, Nezihe Merve Gürel, Tzu-Sheng Kuo, Yang Liu, Rotem Dror, Danilo Brajovic, Xiaozhe Yao, Max Bartolo, William A Gaviria Rojas, Ryan Hileman, Rainier Aliment, Michael W. Mahoney, Meg Risdal, Matthew Lease, Wojciech Samek, Debojyoti Dutta, Curtis G Northcutt, Cody Coleman, Braden Hancock, Bernard Koch, Girmaw Abebe Tadesse, Bojan Karlaš, et al. (13 additional authors not shown)

    Abstract: Drawing from discussions at the inaugural DMLR workshop at ICML 2023 and meetings prior, in this report we outline the relevance of community engagement and infrastructure development for the creation of next-generation public datasets that will advance machine learning science. We chart a path forward as a collective effort to sustain the creation and maintenance of these datasets and methods tow…

    Submitted 1 June, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

    Comments: Published in the Journal of Data-centric Machine Learning Research (DMLR) at https://data.mlr.press/assets/pdf/v01-5.pdf

  9. arXiv:2310.14029  [pdf, other]

    cs.CL cond-mat.mtrl-sci

    LLM-Prop: Predicting Physical And Electronic Properties Of Crystalline Solids From Their Text Descriptions

    Authors: Andre Niyongabo Rubungo, Craig Arnold, Barry P. Rand, Adji Bousso Dieng

    Abstract: The prediction of crystal properties plays a crucial role in the crystal design process. Current methods for predicting crystal properties focus on modeling crystal structures using graph neural networks (GNNs). Although GNNs are powerful, accurately modeling the complex interactions between atoms and molecules within a crystal remains a challenge. Surprisingly, predicting crystal properties from…

    Submitted 21 October, 2023; originally announced October 2023.

    Comments: Code for LLM-Prop can be found at: https://github.com/vertaix/LLM-Prop

  10. arXiv:2310.12952  [pdf, other]

    cs.LG physics.chem-ph q-bio.PE

    Cousins Of The Vendi Score: A Family Of Similarity-Based Diversity Metrics For Science And Machine Learning

    Authors: Amey P. Pasarkar, Adji Bousso Dieng

    Abstract: Measuring diversity accurately is important for many scientific fields, including machine learning (ML), ecology, and chemistry. The Vendi Score was introduced as a generic similarity-based diversity metric that extends the Hill number of order q=1 by leveraging ideas from quantum statistical mechanics. Contrary to many diversity metrics in ecology, the Vendi Score accounts for similarity and does…

    Submitted 4 May, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

    Comments: Published in the proceedings of Artificial Intelligence and Statistics, AISTATS 2024. This paper is dedicated to Aline Sitoe Diatta. The code can be found on Vertaix's GitHub. Code for evaluating diversity using the Vendi scores can be found at https://github.com/vertaix/Vendi-Score. Code for using the scores within Vendi Sampling can be found at https://github.com/vertaix/Vendi-Sampling

  11. arXiv:2304.02190  [pdf, other]

    cs.LG cs.AI cs.CY

    Globalizing Fairness Attributes in Machine Learning: A Case Study on Health in Africa

    Authors: Mercy Nyamewaa Asiedu, Awa Dieng, Abigail Oppong, Maria Nagawa, Sanmi Koyejo, Katherine Heller

    Abstract: With growing machine learning (ML) applications in healthcare, there have been calls for fairness in ML to understand and mitigate ethical concerns these systems may pose. Fairness has implications for global health in Africa, which already has inequitable power imbalances between the Global North and South. This paper seeks to explore fairness for global health, with Africa as a case study. We pr…

    Submitted 4 April, 2023; originally announced April 2023.

  12. arXiv:2210.02410  [pdf, other]

    cs.LG cond-mat.mtrl-sci stat.ML

    The Vendi Score: A Diversity Evaluation Metric for Machine Learning

    Authors: Dan Friedman, Adji Bousso Dieng

    Abstract: Diversity is an important criterion for many areas of machine learning (ML), including generative modeling and dataset curation. However, existing metrics for measuring diversity are often domain-specific and limited in flexibility. In this paper, we address the diversity evaluation problem by proposing the Vendi Score, which connects and extends ideas from ecology and quantum statistical mechanic…

    Submitted 2 July, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: The Vendi Score is available as a pip package at https://github.com/vertaix/Vendi-Score

  13. arXiv:2206.06295  [pdf, other]

    cs.LG cs.AI stat.ML

    Markov Chain Score Ascent: A Unifying Framework of Variational Inference with Markovian Gradients

    Authors: Kyurae Kim, Jisu Oh, Jacob R. Gardner, Adji Bousso Dieng, Hongseok Kim

    Abstract: Minimizing the inclusive Kullback-Leibler (KL) divergence with stochastic gradient descent (SGD) is challenging since its gradient is defined as an integral over the posterior. Recently, multiple methods have been proposed to run SGD with biased gradient estimates obtained from a Markov chain. This paper provides the first non-asymptotic convergence analysis of these methods by establishing their…

    Submitted 13 October, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

    Comments: Accepted to NeurIPS 2022

  14. arXiv:2202.01034  [pdf, other]

    cs.LG cs.CY stat.ML

    Diagnosing failures of fairness transfer across distribution shift in real-world medical settings

    Authors: Jessica Schrouff, Natalie Harris, Oluwasanmi Koyejo, Ibrahim Alabdulmohsin, Eva Schnider, Krista Opsahl-Ong, Alex Brown, Subhrajit Roy, Diana Mincu, Christina Chen, Awa Dieng, Yuan Liu, Vivek Natarajan, Alan Karthikesalingam, Katherine Heller, Silvia Chiappa, Alexander D'Amour

    Abstract: Diagnosing and mitigating changes in model fairness under distribution shift is an important component of the safe deployment of machine learning in healthcare settings. Importantly, the success of any mitigation strategy strongly depends on the structure of the shift. Despite this, there has been little discussion of how to empirically assess the structure of a distribution shift that one is enco…

    Submitted 10 February, 2023; v1 submitted 2 February, 2022; originally announced February 2022.

    Journal ref: Advances in Neural Information Processing Systems 35 (NeurIPS 2022)

  15. arXiv:2105.14859  [pdf, other]

    cs.LG cs.CV

    Consistency Regularization for Variational Auto-Encoders

    Authors: Samarth Sinha, Adji B. Dieng

    Abstract: Variational auto-encoders (VAEs) are a powerful approach to unsupervised learning. They enable scalable approximate posterior inference in latent-variable models using variational inference (VI). A VAE posits a variational family parameterized by a deep neural network called an encoder that takes data as input. This encoder is shared across all the observations, which amortizes the cost of inferen…

    Submitted 6 June, 2022; v1 submitted 31 May, 2021; originally announced May 2021.

    Journal ref: NeurIPS 2021

  16. arXiv:2104.12053  [pdf, other]

    stat.ML cs.LG

    Deep Probabilistic Graphical Modeling

    Authors: Adji B. Dieng

    Abstract: Probabilistic graphical modeling (PGM) provides a framework for formulating an interpretable generative process of data and expressing uncertainty about unknowns, but it lacks flexibility. Deep learning (DL) is an alternative framework for learning from data that has achieved great empirical success in recent years. DL offers great flexibility, but it lacks the interpretability and calibration of…

    Submitted 24 April, 2021; originally announced April 2021.

    Comments: This thesis was defended in April 2020 and accepted without revision. The author received her PhD in Statistics from Columbia University on May 20, 2020

  17. arXiv:1910.04302  [pdf, other]

    stat.ML cs.LG stat.ME

    Prescribed Generative Adversarial Networks

    Authors: Adji B. Dieng, Francisco J. R. Ruiz, David M. Blei, Michalis K. Titsias

    Abstract: Generative adversarial networks (GANs) are a powerful approach to unsupervised learning. They have achieved state-of-the-art performance in the image domain. However, GANs are limited in two ways. They often learn distributions with low support---a phenomenon known as mode collapse---and they do not guarantee the existence of a probability density, which makes evaluating generalization using predi…

    Submitted 9 October, 2019; originally announced October 2019.

    Comments: Code for this paper can be found at https://github.com/adjidieng/PresGANs

  18. arXiv:1907.05545  [pdf, other]

    cs.CL stat.ML

    The Dynamic Embedded Topic Model

    Authors: Adji B. Dieng, Francisco J. R. Ruiz, David M. Blei

    Abstract: Topic modeling analyzes documents to learn meaningful patterns of words. For documents collected in sequence, dynamic topic models capture how these patterns vary over time. We develop the dynamic embedded topic model (D-ETM), a generative model of documents that combines dynamic latent Dirichlet allocation (D-LDA) and word embeddings. The D-ETM models each word with a categorical distribution par…

    Submitted 10 October, 2019; v1 submitted 11 July, 2019; originally announced July 2019.

  19. arXiv:1907.04907  [pdf, other]

    cs.IR cs.CL cs.LG stat.ML

    Topic Modeling in Embedding Spaces

    Authors: Adji B. Dieng, Francisco J. R. Ruiz, David M. Blei

    Abstract: Topic modeling analyzes documents to learn meaningful patterns of words. However, existing topic models fail to learn interpretable topics when working with large and heavy-tailed vocabularies. To this end, we develop the Embedded Topic Model (ETM), a generative model of documents that marries traditional topic models with word embeddings. In particular, it models each word with a categorical dist…

    Submitted 7 July, 2019; originally announced July 2019.

    Comments: Code can be found at https://github.com/adjidieng/ETM

  20. arXiv:1906.05850  [pdf, other]

    stat.ML cs.LG stat.ME

    Reweighted Expectation Maximization

    Authors: Adji B. Dieng, John Paisley

    Abstract: Training deep generative models with maximum likelihood remains a challenge. The typical workaround is to use variational inference (VI) and maximize a lower bound to the log marginal likelihood of the data. Variational auto-encoders (VAEs) adopt this approach. They further amortize the cost of inference by using a recognition network to parameterize the variational family. Amortized VI scales app…

    Submitted 10 August, 2019; v1 submitted 13 June, 2019; originally announced June 2019.

    Comments: Code can be found at https://github.com/adjidieng/REM

  21. arXiv:1807.04863  [pdf, other]

    stat.ML cs.CL cs.LG

    Avoiding Latent Variable Collapse With Generative Skip Models

    Authors: Adji B. Dieng, Yoon Kim, Alexander M. Rush, David M. Blei

    Abstract: Variational autoencoders learn distributions of high-dimensional data. They model data with a deep latent-variable model and then fit the model by maximizing a lower bound of the log marginal likelihood. VAEs can capture complex distributions, but they can also suffer from an issue known as "latent variable collapse," especially if the likelihood model is powerful. Specifically, the lower bound in…

    Submitted 30 January, 2019; v1 submitted 12 July, 2018; originally announced July 2018.

    Comments: In the Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019), Naha, Okinawa, Japan. PMLR: Volume 89. An earlier version of this paper was presented at the Workshop on Theoretical Foundations and Applications of Deep Generative Models, ICML, 2018

  22. arXiv:1806.06802  [pdf, other]

    stat.ML cs.LG stat.ME

    Interpretable Almost Matching Exactly for Causal Inference

    Authors: Yameng Liu, Aw Dieng, Sudeepa Roy, Cynthia Rudin, Alexander Volfovsky

    Abstract: We aim to create the highest possible quality of treatment-control matches for categorical data in the potential outcomes framework. Matching methods are heavily used in the social sciences due to their interpretability, but most matching methods do not pass basic sanity checks: they fail when irrelevant variables are introduced, and tend to be either computationally slow or produce low-quality ma…

    Submitted 8 June, 2019; v1 submitted 18 June, 2018; originally announced June 2018.

    Comments: AISTATS 2019

  23. arXiv:1805.01500  [pdf, other]

    stat.ML cs.LG stat.ME

    Noisin: Unbiased Regularization for Recurrent Neural Networks

    Authors: Adji B. Dieng, Rajesh Ranganath, Jaan Altosaar, David M. Blei

    Abstract: Recurrent neural networks (RNNs) are powerful models of sequential data. They have been successfully used in domains such as text and speech. However, RNNs are susceptible to overfitting; regularization is important. In this paper we develop Noisin, a new method for regularizing RNNs. Noisin injects random noise into the hidden states of the RNN and then maximizes the corresponding marginal likeli…

    Submitted 12 July, 2018; v1 submitted 3 May, 2018; originally announced May 2018.

    Comments: In Proceedings of the International Conference on Machine Learning, 2018

  24. arXiv:1802.04220  [pdf, other]

    stat.ML cs.LG

    Augment and Reduce: Stochastic Inference for Large Categorical Distributions

    Authors: Francisco J. R. Ruiz, Michalis K. Titsias, Adji B. Dieng, David M. Blei

    Abstract: Categorical distributions are ubiquitous in machine learning, e.g., in classification, language models, and recommendation systems. However, when the number of possible outcomes is very large, using categorical distributions becomes computationally expensive, as the complexity scales linearly with the number of outcomes. To address this problem, we propose augment and reduce (A&R), a method to all…

    Submitted 7 June, 2018; v1 submitted 12 February, 2018; originally announced February 2018.

    Comments: 11 pages, 2 figures

    Journal ref: Francisco J. R. Ruiz, Michalis K. Titsias, Adji B. Dieng, and David M. Blei. Augment and Reduce: Stochastic Inference for Large Categorical Distributions. International Conference on Machine Learning. Stockholm (Sweden), July 2018

  25. arXiv:1611.01702  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    TopicRNN: A Recurrent Neural Network with Long-Range Semantic Dependency

    Authors: Adji B. Dieng, Chong Wang, Jianfeng Gao, John Paisley

    Abstract: In this paper, we propose TopicRNN, a recurrent neural network (RNN)-based language model designed to directly capture the global semantic meaning relating words in a document via latent topics. Because of their sequential nature, RNNs are good at capturing the local structure of a word sequence - both semantic and syntactic - but might face difficulty remembering long-range dependencies. Intuitiv…

    Submitted 26 February, 2017; v1 submitted 5 November, 2016; originally announced November 2016.

    Comments: International Conference on Learning Representations

  26. arXiv:1611.00328  [pdf, other]

    stat.ML cs.LG stat.CO stat.ME

    Variational Inference via $χ$-Upper Bound Minimization

    Authors: Adji B. Dieng, Dustin Tran, Rajesh Ranganath, John Paisley, David M. Blei

    Abstract: Variational inference (VI) is widely used as an efficient alternative to Markov chain Monte Carlo. It posits a family of approximating distributions $q$ and finds the closest member to the exact posterior $p$. Closeness is usually measured via a divergence $D(q || p)$ from $q$ to $p$. While successful, this approach also has problems. Notably, it typically leads to underestimation of the posterior…

    Submitted 12 November, 2017; v1 submitted 1 November, 2016; originally announced November 2016.

    Comments: Neural Information Processing Systems, 2017

  27. arXiv:1610.09787  [pdf, other]

    stat.CO cs.AI cs.PL stat.AP stat.ML

    Edward: A library for probabilistic modeling, inference, and criticism

    Authors: Dustin Tran, Alp Kucukelbir, Adji B. Dieng, Maja Rudolph, Dawen Liang, David M. Blei

    Abstract: Probabilistic modeling is a powerful approach for analyzing empirical information. We describe Edward, a library for probabilistic modeling. Edward's design reflects an iterative process pioneered by George Box: build a model of a phenomenon, make inferences about the model given data, and criticize the model's fit to the data. Edward supports a broad class of probabilistic models, efficient algor…

    Submitted 31 January, 2017; v1 submitted 31 October, 2016; originally announced October 2016.