
Showing 1–50 of 80 results for author: Ravikumar, P

Searching in archive stat.
  1. arXiv:2407.02694  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    LLM-Select: Feature Selection with Large Language Models

    Authors: Daniel P. Jeong, Zachary C. Lipton, Pradeep Ravikumar

    Abstract: In this paper, we demonstrate a surprising capability of large language models (LLMs): given only input feature names and a description of a prediction task, they are capable of selecting the most predictive features, with performance rivaling the standard tools of data science. Remarkably, these models exhibit this capacity across various query mechanisms. For example, we zero-shot prompt an LLM…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Preprint

  2. arXiv:2406.18400  [pdf, other]

    cs.CL cs.LG stat.ML

    Do LLMs dream of elephants (when told not to)? Latent concept association and associative memory in transformers

    Authors: Yibo Jiang, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam

    Abstract: Large Language Models (LLMs) have the capacity to store and recall facts. Through experimentation with open-source models, we observe that this ability to retrieve facts can be easily manipulated by changing contexts, even without altering their factual meanings. These findings highlight that LLMs might behave like an associative memory model where certain tokens in the contexts serve as clues to…

    Submitted 26 June, 2024; originally announced June 2024.

  3. arXiv:2403.03867  [pdf, other]

    cs.CL cs.LG stat.ML

    On the Origins of Linear Representations in Large Language Models

    Authors: Yibo Jiang, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam, Victor Veitch

    Abstract: Recent works have argued that high-level semantic concepts are encoded "linearly" in the representation space of large language models. In this work, we study the origins of such linear representations. To that end, we introduce a simple latent variable model to abstract and formalize the concept dynamics of the next token prediction. We use this formalism to show that the next token prediction ob…

    Submitted 6 March, 2024; originally announced March 2024.

  4. arXiv:2402.09236  [pdf, other]

    cs.LG cs.AI math.ST stat.ML

    Learning Interpretable Concepts: Unifying Causal Representation Learning and Foundation Models

    Authors: Goutham Rajendran, Simon Buchholz, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar

    Abstract: To build intelligent machine learning systems, there are two broad approaches. One approach is to build inherently interpretable models, as endeavored by the growing field of causal representation learning. The other approach is to build highly-performant foundation models and then invest efforts into understanding how they work. In this work, we relate these two approaches and study how to learn…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: 36 pages

  5. arXiv:2402.00645  [pdf, other]

    stat.ML cs.LG

    Spectrally Transformed Kernel Regression

    Authors: Runtian Zhai, Rattana Pukdee, Roger Jin, Maria-Florina Balcan, Pradeep Ravikumar

    Abstract: Unlabeled data is a key component of modern machine learning. In general, the role of unlabeled data is to impose a form of smoothness, usually from the similarity information encoded in a base kernel, such as the $ε$-neighbor kernel or the adjacency matrix of a graph. This work revisits the classical idea of spectrally transformed kernel regression (STKR), and provides a new class of general and…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: ICLR 2024 spotlight. 36 pages

  6. arXiv:2311.18048  [pdf, other]

    cs.LG cs.CE eess.SY stat.ME

    An Interventional Perspective on Identifiability in Gaussian LTI Systems with Independent Component Analysis

    Authors: Goutham Rajendran, Patrik Reizinger, Wieland Brendel, Pradeep Ravikumar

    Abstract: We investigate the relationship between system identification and intervention design in dynamical systems. While previous research demonstrated how identifiable representation learning methods, such as Independent Component Analysis (ICA), can reveal cause-effect relationships, it relied on a passive perspective without considering how to collect data. Our work shows that in Gaussian Linear Time-…

    Submitted 16 February, 2024; v1 submitted 29 November, 2023; originally announced November 2023.

    Comments: CLeaR2024 camera ready. Code available at https://github.com/rpatrik96/lti-ica

  7. arXiv:2310.04295  [pdf, other]

    cs.LG cs.AI stat.ML

    Identifying Representations for Intervention Extrapolation

    Authors: Sorawit Saengkyongam, Elan Rosenfeld, Pradeep Ravikumar, Niklas Pfister, Jonas Peters

    Abstract: The premise of identifiable and causal representation learning is to improve the current representation learning paradigm in terms of generalizability or robustness. Despite recent progress in questions of identifiability, more theoretical results demonstrating concrete advantages of these methods for downstream tasks are needed. In this paper, we consider the task of intervention extrapolation: p…

    Submitted 5 March, 2024; v1 submitted 6 October, 2023; originally announced October 2023.

    Comments: Accepted at the International Conference on Learning Representations (ICLR) 2024

  8. arXiv:2306.17378  [pdf, other]

    cs.LG stat.ML

    Global Optimality in Bivariate Gradient-based DAG Learning

    Authors: Chang Deng, Kevin Bello, Bryon Aragam, Pradeep Ravikumar

    Abstract: Recently, a new class of non-convex optimization problems motivated by the statistical problem of learning an acyclic directed graphical model from data has attracted significant interest. While existing work uses standard first-order optimization schemes to solve this problem, proving the global optimality of such approaches has proven elusive. The difficulty lies in the fact that unlike other no…

    Submitted 29 June, 2023; originally announced June 2023.

    Comments: 39 pages, 13 figures

  9. arXiv:2306.17361  [pdf, other]

    cs.LG cs.AI stat.AP stat.ME stat.ML

    iSCAN: Identifying Causal Mechanism Shifts among Nonlinear Additive Noise Models

    Authors: Tianyu Chen, Kevin Bello, Bryon Aragam, Pradeep Ravikumar

    Abstract: Structural causal models (SCMs) are widely used in various disciplines to represent causal relationships among variables in complex systems. Unfortunately, the underlying causal structure is often unknown, and estimating it from data remains a challenging task. In many situations, however, the end goal is to localize the changes (shifts) in the causal mechanisms between related datasets instead of…

    Submitted 12 January, 2024; v1 submitted 29 June, 2023; originally announced June 2023.

    Comments: 36 pages, 18 figures. Published at NeurIPS 2023

  10. arXiv:2306.02235  [pdf, other]

    cs.LG cs.AI math.ST stat.ME stat.ML

    Learning Linear Causal Representations from Interventions under General Nonlinear Mixing

    Authors: Simon Buchholz, Goutham Rajendran, Elan Rosenfeld, Bryon Aragam, Bernhard Schölkopf, Pradeep Ravikumar

    Abstract: We study the problem of learning causal representations from unknown, latent interventions in a general setting, where the latent distribution is Gaussian but the mixing function is completely general. We prove strong identifiability results given unknown single-node interventions, i.e., without having access to the intervention targets. This generalizes prior works which have focused on weaker cl…

    Submitted 18 December, 2023; v1 submitted 3 June, 2023; originally announced June 2023.

    Comments: Accepted as Oral paper at NeurIPS 2023

  11. arXiv:2306.00788  [pdf, other]

    cs.LG stat.ML

    Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression

    Authors: Runtian Zhai, Bingbin Liu, Andrej Risteski, Zico Kolter, Pradeep Ravikumar

    Abstract: Data augmentation is critical to the empirical success of modern self-supervised representation learning, such as contrastive learning and masked language modeling. However, a theoretical understanding of the exact role of augmentation remains limited. Recent work has built the connection between self-supervised learning and the approximation of the top eigenspace of a graph Laplacian operator, su…

    Submitted 18 January, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: ICLR 2024 spotlight. 34 pages

  12. arXiv:2305.17277  [pdf, other]

    stat.ML cs.LG

    Optimizing NOTEARS Objectives via Topological Swaps

    Authors: Chang Deng, Kevin Bello, Bryon Aragam, Pradeep Ravikumar

    Abstract: Recently, an intriguing class of non-convex optimization problems has emerged in the context of learning directed acyclic graphs (DAGs). These problems involve minimizing a given loss or score function, subject to a non-convex continuous constraint that penalizes the presence of cycles in a graph. In this work, we delve into the optimization challenges associated with this class of non-convex prog…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: 39 pages, 12 figures, ICML 2023

  13. arXiv:2303.14496  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning with Explanation Constraints

    Authors: Rattana Pukdee, Dylan Sam, J. Zico Kolter, Maria-Florina Balcan, Pradeep Ravikumar

    Abstract: As larger deep learning models are hard to interpret, there has been a recent focus on generating explanations of these black-box models. In contrast, we may have a priori explanations of how models should behave. In this paper, we formalize this notion as learning from explanation constraints and provide a learning theoretic framework to analyze how such explanations can improve the learning of ou…

    Submitted 22 December, 2023; v1 submitted 25 March, 2023; originally announced March 2023.

    Comments: NeurIPS 2023

  14. arXiv:2210.03594  [pdf, other]

    cs.LG stat.ML

    Label Propagation with Weak Supervision

    Authors: Rattana Pukdee, Dylan Sam, Maria-Florina Balcan, Pradeep Ravikumar

    Abstract: Semi-supervised learning and weakly supervised learning are important paradigms that aim to reduce the growing demand for labeled data in current machine learning applications. In this paper, we introduce a novel analysis of the classical label propagation algorithm (LPA) (Zhu & Ghahramani, 2002) that moreover takes advantage of useful prior information, specifically probabilistic hypothesized lab…

    Submitted 9 April, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

    Comments: ICLR 2023, 26 pages, 2 figures

  15. arXiv:2209.08037  [pdf, other]

    cs.LG stat.ME stat.ML

    DAGMA: Learning DAGs via M-matrices and a Log-Determinant Acyclicity Characterization

    Authors: Kevin Bello, Bryon Aragam, Pradeep Ravikumar

    Abstract: The combinatorial problem of learning directed acyclic graphs (DAGs) from data was recently framed as a purely continuous optimization problem by leveraging a differentiable acyclicity characterization of DAGs based on the trace of a matrix exponential function. Existing acyclicity characterizations are based on the idea that powers of an adjacency matrix contain information about walks and cycles…

    Submitted 15 January, 2023; v1 submitted 16 September, 2022; originally announced September 2022.

    Comments: 28 pages, 13 figures, published at NeurIPS 2022

  16. arXiv:2206.10044  [pdf, other]

    cs.LG cs.AI math.ST stat.ML

    Identifiability of deep generative models without auxiliary information

    Authors: Bohdan Kivva, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam

    Abstract: We prove identifiability of a broad class of deep latent variable models that (a) have universal approximation capabilities and (b) are the decoders of variational autoencoders that are commonly used in practice. Unlike existing work, our analysis does not require weak supervision, auxiliary information, or conditioning in the latent space. Specifically, we show that for a broad class of generativ…

    Submitted 18 October, 2022; v1 submitted 20 June, 2022; originally announced June 2022.

    Comments: 34 pages, 9 figures, to appear in NeurIPS 2022

  17. arXiv:2206.03362  [pdf, other]

    cs.LG cs.AI cs.CR stat.ME stat.ML

    Building Robust Ensembles via Margin Boosting

    Authors: Dinghuai Zhang, Hongyang Zhang, Aaron Courville, Yoshua Bengio, Pradeep Ravikumar, Arun Sai Suggala

    Abstract: In the context of adversarial robustness, a single model does not usually have enough power to defend against all possible adversarial attacks, and as a result, has sub-optimal robustness. Consequently, an emerging line of work has focused on learning an ensemble of neural networks to defend against adversarial attacks. In this work, we take a principled approach towards building robust ensembles.…

    Submitted 7 June, 2022; originally announced June 2022.

    Comments: Accepted by ICML 2022

  18. arXiv:2202.09305  [pdf, other]

    cs.LG stat.ML

    Masked prediction tasks: a parameter identifiability view

    Authors: Bingbin Liu, Daniel Hsu, Pradeep Ravikumar, Andrej Risteski

    Abstract: The vast majority of work in self-supervised learning, both theoretical and empirical (though mostly the latter), has largely focused on recovering good features for downstream tasks, with the definition of "good" often being intricately tied to the downstream task itself. This lens is undoubtedly very interesting, but suffers from the problem that there isn't a "canonical" set of downstream task…

    Submitted 18 February, 2022; originally announced February 2022.

  19. arXiv:2201.12293  [pdf, other]

    cs.LG stat.ML

    Understanding Why Generalized Reweighting Does Not Improve Over ERM

    Authors: Runtian Zhai, Chen Dan, Zico Kolter, Pradeep Ravikumar

    Abstract: Empirical risk minimization (ERM) is known in practice to be non-robust to distributional shift where the training and the test distributions are different. A suite of approaches, such as importance weighting, and variants of distributionally robust optimization (DRO), have been proposed to solve this problem. But a line of recent work has empirically shown that these approaches do not significant…

    Submitted 7 February, 2023; v1 submitted 28 January, 2022; originally announced January 2022.

    Comments: ICLR 2023. 40 pages, 3 figures

  20. arXiv:2110.13948  [pdf, other]

    cs.LG stat.ML

    Boosted CVaR Classification

    Authors: Runtian Zhai, Chen Dan, Arun Sai Suggala, Zico Kolter, Pradeep Ravikumar

    Abstract: Many modern machine learning tasks require models with high tail performance, i.e. high performance over the worst-off samples in the dataset. This problem has been widely studied in fields such as algorithmic fairness, class imbalance, and risk-sensitive decision making. A popular approach to maximize the model's tail performance is to minimize the CVaR (Conditional Value at Risk) loss, which com…

    Submitted 10 November, 2021; v1 submitted 26 October, 2021; originally announced October 2021.

    Comments: NeurIPS 2021. 16 pages, 4 figures

  21. arXiv:2110.11271  [pdf, other]

    cs.LG stat.ML

    Analyzing and Improving the Optimization Landscape of Noise-Contrastive Estimation

    Authors: Bingbin Liu, Elan Rosenfeld, Pradeep Ravikumar, Andrej Risteski

    Abstract: Noise-contrastive estimation (NCE) is a statistically consistent method for learning unnormalized probabilistic models. It has been empirically observed that the choice of the noise distribution is crucial for NCE's performance. However, such observations have never been made formal or quantitative. In fact, it is not even clear whether the difficulties arising from a poorly chosen noise distribut…

    Submitted 21 October, 2021; originally announced October 2021.

  22. arXiv:2108.11483  [pdf, other]

    cs.LG math.OC stat.ML

    Heavy-tailed Streaming Statistical Estimation

    Authors: Che-Ping Tsai, Adarsh Prasad, Sivaraman Balakrishnan, Pradeep Ravikumar

    Abstract: We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples. This could also be viewed as stochastic optimization under heavy-tailed distributions, with an additional $O(p)$ space complexity constraint. We design a clipped stochastic gradient descent algorithm and provide an improved analysis, under a more nuanced condition on the noise of the stochastic gra…

    Submitted 25 February, 2022; v1 submitted 25 August, 2021; originally announced August 2021.

  23. arXiv:2106.15563  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning latent causal graphs via mixture oracles

    Authors: Bohdan Kivva, Goutham Rajendran, Pradeep Ravikumar, Bryon Aragam

    Abstract: We study the problem of reconstructing a causal graphical model from data in the presence of latent variables. The main problem of interest is recovering the causal structure over the latent variables while allowing for general, potentially nonlinear dependence between the variables. In many practical problems, the dependence between raw observations (e.g. pixels in an image) is much less relevant…

    Submitted 21 November, 2021; v1 submitted 29 June, 2021; originally announced June 2021.

    Comments: To appear at NeurIPS 2021. 41 pages

  24. arXiv:2106.06142  [pdf, ps, other]

    cs.LG stat.ML

    DORO: Distributional and Outlier Robust Optimization

    Authors: Runtian Zhai, Chen Dan, J. Zico Kolter, Pradeep Ravikumar

    Abstract: Many machine learning tasks involve subpopulation shift where the testing data distribution is a subpopulation of the training distribution. For such settings, a line of recent work has proposed the use of a variant of empirical risk minimization (ERM) known as distributionally robust optimization (DRO). In this work, we apply DRO to real, large-scale tasks with subpopulation shift, and observe tha…

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: ICML 2021. Codes: https://github.com/RuntianZ/doro

  25. arXiv:2104.07232  [pdf, other]

    cs.LG stat.ML

    Iterative Alignment Flows

    Authors: Zeyu Zhou, Ziyu Gong, Pradeep Ravikumar, David I. Inouye

    Abstract: The unsupervised task of aligning two or more distributions in a shared latent space has many applications including fair representations, batch effect mitigation, and unsupervised domain adaptation. Existing flow-based approaches estimate multiple flows independently, which is equivalent to learning multiple full generative models. Other approaches require adversarial learning, which can be compu…

    Submitted 15 March, 2022; v1 submitted 15 April, 2021; originally announced April 2021.

  26. arXiv:2103.02740  [pdf, ps, other]

    stat.ML cs.LG

    Contrastive learning of strong-mixing continuous-time stochastic processes

    Authors: Bingbin Liu, Pradeep Ravikumar, Andrej Risteski

    Abstract: Contrastive learning is a family of self-supervised methods where a model is trained to solve a classification task constructed from unlabeled data. It has recently emerged as one of the leading learning paradigms in the absence of labels across many different domains (e.g. brain imaging, text, images). However, theoretical understanding of many aspects of training, both statistical and algorithmi…

    Submitted 3 March, 2021; originally announced March 2021.

    Comments: Appearing in AISTATS 2021

  27. arXiv:2102.13128  [pdf, other]

    cs.LG cs.AI cs.GT stat.ML

    An Online Learning Approach to Interpolation and Extrapolation in Domain Generalization

    Authors: Elan Rosenfeld, Pradeep Ravikumar, Andrej Risteski

    Abstract: A popular assumption for out-of-distribution generalization is that the training data comprises sub-datasets, each drawn from a distinct distribution; the goal is then to "interpolate" these distributions and "extrapolate" beyond them -- this objective is broadly known as domain generalization. A common belief is that ERM can interpolate but not extrapolate and that the latter is considerably more…

    Submitted 18 November, 2021; v1 submitted 25 February, 2021; originally announced February 2021.

  28. arXiv:2102.10264  [pdf, other]

    cs.LG cs.RO stat.ML

    On Proximal Policy Optimization's Heavy-tailed Gradients

    Authors: Saurabh Garg, Joshua Zhanson, Emilio Parisotto, Adarsh Prasad, J. Zico Kolter, Zachary C. Lipton, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Pradeep Ravikumar

    Abstract: Modern policy gradient algorithms such as Proximal Policy Optimization (PPO) rely on an arsenal of heuristics, including loss clipping and gradient clipping, to ensure successful learning. These heuristics are reminiscent of techniques from robust statistics, commonly used for estimation in outlier-rich ("heavy-tailed") regimes. In this paper, we present a detailed empirical study to characteriz…

    Submitted 12 July, 2021; v1 submitted 20 February, 2021; originally announced February 2021.

    Comments: ICML 2021

  29. arXiv:2101.00300  [pdf, ps, other]

    cs.LG cs.AI stat.ML

    When Is Generalizable Reinforcement Learning Tractable?

    Authors: Dhruv Malik, Yuanzhi Li, Pradeep Ravikumar

    Abstract: Agents trained by reinforcement learning (RL) often fail to generalize beyond the environment they were trained in, even when presented with new scenarios that seem similar to the training environment. We study the query complexity required to train RL agents that generalize to multiple environments. Intuitively, tractable generalization is only possible when the environments are similar or close…

    Submitted 25 October, 2021; v1 submitted 1 January, 2021; originally announced January 2021.

    Comments: NeurIPS 2021, v3 fixes minor typos

  30. arXiv:2012.10713  [pdf, other]

    cs.LG cs.AI stat.ML

    Fundamental Limits and Tradeoffs in Invariant Representation Learning

    Authors: Han Zhao, Chen Dan, Bryon Aragam, Tommi S. Jaakkola, Geoffrey J. Gordon, Pradeep Ravikumar

    Abstract: A wide range of machine learning applications such as privacy-preserving learning, algorithmic fairness, and domain adaptation/generalization among others, involve learning invariant representations of the data that aim to achieve two competing goals: (a) maximize information or accuracy with respect to a target response, and (b) maximize invariance or independence with respect to a set of protect…

    Submitted 23 November, 2022; v1 submitted 19 December, 2020; originally announced December 2020.

    Comments: JMLR camera-ready version

  31. arXiv:2010.05761  [pdf, other]

    cs.LG cs.AI stat.ML

    The Risks of Invariant Risk Minimization

    Authors: Elan Rosenfeld, Pradeep Ravikumar, Andrej Risteski

    Abstract: Invariant Causal Prediction (Peters et al., 2016) is a technique for out-of-distribution generalization which assumes that some aspects of the data distribution vary across the training set but that the underlying causal mechanisms remain constant. Recently, Arjovsky et al. (2019) proposed Invariant Risk Minimization (IRM), an objective based on this idea for learning deep, invariant features of d…

    Submitted 27 March, 2021; v1 submitted 12 October, 2020; originally announced October 2020.

    Comments: ICLR 2021 Camera-Ready

  32. arXiv:2006.16384  [pdf, other]

    stat.ML cs.LG

    Sharp Statistical Guarantees for Adversarially Robust Gaussian Classification

    Authors: Chen Dan, Yuting Wei, Pradeep Ravikumar

    Abstract: Adversarial robustness has become a fundamental requirement in modern machine learning applications. Yet, there has been surprisingly little statistical understanding so far. In this paper, we provide the first result of the optimal minimax guarantees for the excess risk for adversarially robust classification, under the Gaussian mixture model proposed by Schmidt et al. (2018). The results a…

    Submitted 29 June, 2020; originally announced June 2020.

    Comments: 25 pages, 1 figure. Accepted by ICML 2020

  33. arXiv:2006.11430  [pdf, ps, other]

    stat.ML cs.LG stat.ME

    Learning Minimax Estimators via Online Learning

    Authors: Kartik Gupta, Arun Sai Suggala, Adarsh Prasad, Praneeth Netrapalli, Pradeep Ravikumar

    Abstract: We consider the problem of designing minimax estimators for estimating the parameters of a probability distribution. Unlike classical approaches such as the MLE and minimum distance estimators, we consider an algorithmic approach for constructing such estimators. We view the problem of designing minimax estimators as finding a mixed strategy Nash equilibrium of a zero-sum game. By leveraging recen…

    Submitted 19 June, 2020; originally announced June 2020.

    Comments: 60 pages. Under review

  34. arXiv:2006.07972  [pdf, other]

    cs.LG stat.ML

    Sub-Seasonal Climate Forecasting via Machine Learning: Challenges, Analysis, and Advances

    Authors: Sijie He, Xinyan Li, Timothy DelSole, Pradeep Ravikumar, Arindam Banerjee

    Abstract: Sub-seasonal climate forecasting (SSF) focuses on predicting key climate variables such as temperature and precipitation in the 2-week to 2-month time scales. Skillful SSF would have immense societal value, in areas such as agricultural productivity, water resource management, transportation and aviation systems, and emergency planning for extreme weather events. However, SSF is considered more ch…

    Submitted 24 June, 2020; v1 submitted 14 June, 2020; originally announced June 2020.

  35. arXiv:2006.00442  [pdf, other]

    cs.LG stat.ML

    Evaluations and Methods for Explanation through Robustness Analysis

    Authors: Cheng-Yu Hsieh, Chih-Kuan Yeh, Xuanqing Liu, Pradeep Ravikumar, Seungyeon Kim, Sanjiv Kumar, Cho-Jui Hsieh

    Abstract: Feature-based explanations, which provide the importance of each feature towards the model prediction, are arguably one of the most intuitive ways to explain a model. In this paper, we establish a novel set of evaluation criteria for such feature-based explanations by robustness analysis. In contrast to existing evaluations which require us to specify some way to "remove" features that could inevitably…

    Submitted 8 April, 2021; v1 submitted 31 May, 2020; originally announced June 2020.

    Comments: To appear in ICLR 2021

  36. arXiv:2005.12914  [pdf, other]

    stat.ML cs.LG

    Class-Weighted Classification: Trade-offs and Robust Approaches

    Authors: Ziyu Xu, Chen Dan, Justin Khim, Pradeep Ravikumar

    Abstract: We address imbalanced classification, the problem in which a label may have low marginal probability relative to other labels, by weighting losses according to the correct class. First, we examine the convergence rates of the expected excess weighted risk of plug-in classifiers where the weighting for the plug-in classifier and the risk may be different. This leads to irreducible errors that do no…

    Submitted 26 May, 2020; originally announced May 2020.

    Comments: 28 pages, 4 figures

  37. arXiv:2004.05665  [pdf, other]

    cs.LG stat.ML

    Minimizing FLOPs to Learn Efficient Sparse Representations

    Authors: Biswajit Paria, Chih-Kuan Yeh, Ian E. H. Yen, Ning Xu, Pradeep Ravikumar, Barnabás Póczos

    Abstract: Deep representation learning has become one of the most widely adopted approaches for visual search, recommendation, and identification. Retrieval of such representations from a large database is, however, computationally challenging. Approximate methods based on learning compact representations have been widely explored for this problem, such as locality sensitive hashing, product quantization, an…

    Submitted 12 April, 2020; originally announced April 2020.

    Comments: Published at ICLR 2020

  38. arXiv:2002.03018  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Certified Robustness to Label-Flipping Attacks via Randomized Smoothing

    Authors: Elan Rosenfeld, Ezra Winston, Pradeep Ravikumar, J. Zico Kolter

    Abstract: Machine learning algorithms are known to be susceptible to data poisoning attacks, where an adversary manipulates the training data to degrade performance of the resulting classifier. In this work, we present a unifying view of randomized smoothing over arbitrary functions, and we leverage this novel characterization to propose a new strategy for building classifiers that are pointwise-certifiably…

    Submitted 11 August, 2020; v1 submitted 7 February, 2020; originally announced February 2020.

    Comments: ICML 2020

  39. arXiv:2001.02378  [pdf, other]

    cs.LG cs.CR stat.ML

    MACER: Attack-free and Scalable Robust Training via Maximizing Certified Radius

    Authors: Runtian Zhai, Chen Dan, Di He, Huan Zhang, Boqing Gong, Pradeep Ravikumar, Cho-Jui Hsieh, Liwei Wang

    Abstract: Adversarial training is one of the most popular ways to learn robust models but is usually attack-dependent and time costly. In this paper, we propose the MACER algorithm, which learns robust models without using adversarial training but performs better than all existing provable l2-defenses. Recent work shows that randomized smoothing can be used to provide a certified l2 radius to smoothed class…

    Submitted 14 March, 2022; v1 submitted 8 January, 2020; originally announced January 2020.

    Comments: Published in ICLR 2020. 20 Pages

  40. arXiv:1912.06074  [pdf, other]

    cs.LG cs.AI stat.ML

    Game Design for Eliciting Distinguishable Behavior

    Authors: Fan Yang, Liu Leqi, Yifan Wu, Zachary C. Lipton, Pradeep Ravikumar, William W. Cohen, Tom Mitchell

    Abstract: The ability to infer latent psychological traits from human behavior is key to developing personalized human-interacting machine learning systems. Approaches to infer such traits range from surveys to manually-constructed experiments and games. However, these traditional games are limited because they are typically designed based on heuristics. In this paper, we formulate the task of designing…

    Submitted 12 December, 2019; originally announced December 2019.

    Comments: 33rd Conference on Neural Information Processing Systems (NeurIPS 2019)

  41. arXiv:1912.01108  [pdf, other]

    cs.LG stat.ML

    Automated Dependence Plots

    Authors: David I. Inouye, Liu Leqi, Joon Sik Kim, Bryon Aragam, Pradeep Ravikumar

    Abstract: In practical applications of machine learning, it is necessary to look beyond standard metrics such as test accuracy in order to validate various qualitative properties of a model. Partial dependence plots (PDP), including instance-specific PDPs (i.e., ICE plots), have been widely used as a visual tool to understand or validate a model. Yet, current PDPs suffer from two main drawbacks: (1) a user…

    Submitted 29 July, 2020; v1 submitted 2 December, 2019; originally announced December 2019.

    Comments: In Uncertainty in Artificial Intelligence (UAI 2020). Camera-ready version. Code is available at https://github.com/davidinouye/adp

  42. arXiv:1910.13618  [pdf, ps, other]

    cs.LG cs.DS stat.ML

    Optimal Analysis of Subset-Selection Based L_p Low Rank Approximation

    Authors: Chen Dan, Hong Wang, Hongyang Zhang, Yuchen Zhou, Pradeep Ravikumar

    Abstract: We study the low rank approximation problem of any given matrix $A$ over $\mathbb{R}^{n\times m}$ and $\mathbb{C}^{n\times m}$ in entry-wise $\ell_p$ loss, that is, finding a rank-$k$ matrix $X$ such that $\|A-X\|_p$ is minimized. Unlike the traditional $\ell_2$ setting, this particular variant is NP-Hard. We show that the algorithm of column subset selection, which was an algorithmic foundation o…

    Submitted 29 October, 2019; originally announced October 2019.

    Comments: 20 pages, accepted by NeurIPS 2019

  43. arXiv:1910.07969  [pdf, other]

    cs.LG stat.ML

    On Completeness-aware Concept-Based Explanations in Deep Neural Networks

    Authors: Chih-Kuan Yeh, Been Kim, Sercan O. Arik, Chun-Liang Li, Tomas Pfister, Pradeep Ravikumar

    Abstract: Human explanations of high-level decisions are often expressed in terms of key concepts the decisions are based on. In this paper, we study such concept-based explainability for Deep Neural Networks (DNNs). First, we define the notion of completeness, which quantifies how sufficient a particular set of concepts is in explaining a model's prediction behavior based on the assumption that complete co…

    Submitted 7 February, 2022; v1 submitted 17 October, 2019; originally announced October 2019.

    Comments: Updated supplementary

  44. arXiv:1909.13189  [pdf, other]

    stat.ML cs.LG stat.ME

    Learning Sparse Nonparametric DAGs

    Authors: Xun Zheng, Chen Dan, Bryon Aragam, Pradeep Ravikumar, Eric P. Xing

    Abstract: We develop a framework for learning sparse nonparametric directed acyclic graphs (DAGs) from data. Our approach is based on a recent algebraic characterization of DAGs that led to a fully continuous program for score-based learning of DAG models parametrized by a linear structural equation model (SEM). We extend this algebraic characterization to nonparametric SEM by leveraging nonparametric spars…

    Submitted 23 March, 2020; v1 submitted 28 September, 2019; originally announced September 2019.

    Comments: To appear in AISTATS 2020

  45. arXiv:1907.00927  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    A Unified Approach to Robust Mean Estimation

    Authors: Adarsh Prasad, Sivaraman Balakrishnan, Pradeep Ravikumar

    Abstract: In this paper, we develop connections between two seemingly disparate, but central, models in robust statistics: Huber's epsilon-contamination model and the heavy-tailed noise model. We provide conditions under which this connection provides near-statistically-optimal estimators. Building on this connection, we provide a simple variant of recent computationally-efficient algorithms for mean estima…

    Submitted 1 July, 2019; originally announced July 2019.

    Comments: 51 pages, 6 figures

  46. arXiv:1903.08192  [pdf, ps, other]

    cs.LG stat.ML

    Adaptive Hard Thresholding for Near-optimal Consistent Robust Regression

    Authors: Arun Sai Suggala, Kush Bhatia, Pradeep Ravikumar, Prateek Jain

    Abstract: We study the problem of robust linear regression with response variable corruptions. We consider the oblivious adversary model, where the adversary corrupts a fraction of the responses in complete ignorance of the data. We provide a nearly linear time estimator which consistently estimates the true regression vector, even with $1-o(1)$ fraction of corruptions. Existing results in this setting eith…

    Submitted 19 March, 2019; originally announced March 2019.

  47. arXiv:1901.10040  [pdf]

    cs.LG cs.AI stat.ML

    Towards Aggregating Weighted Feature Attributions

    Authors: Umang Bhatt, Pradeep Ravikumar, Jose M. F. Moura

    Abstract: Current approaches for explaining machine learning models fall into two distinct classes: antecedent event influence and value attribution. The former leverages training instances to describe how much influence a training point exerts on a test point, while the latter attempts to attribute value to the features most pertinent to a given prediction. In this work, we discuss an algorithm, AVA: Aggre…

    Submitted 20 January, 2019; originally announced January 2019.

    Comments: In AAAI-19 Workshop on Network Interpretability for Deep Learning

  48. arXiv:1901.09392  [pdf, other]

    cs.LG stat.ML

    On the (In)fidelity and Sensitivity for Explanations

    Authors: Chih-Kuan Yeh, Cheng-Yu Hsieh, Arun Sai Suggala, David I. Inouye, Pradeep Ravikumar

    Abstract: We consider objective evaluation measures of saliency explanations for complex black-box machine learning models. We propose simple robust variants of two notions that have been considered in recent literature: (in)fidelity, and sensitivity. We analyze optimal explanations with respect to both these measures, and while the optimal explanation for sensitivity is a vacuous constant explanation, the…

    Submitted 3 November, 2019; v1 submitted 27 January, 2019; originally announced January 2019.

    Comments: NeurIPS 2019 camera ready, previous version on arXiv: "How Sensitive are Sensitivity-Based Explanations"

  49. arXiv:1811.09720  [pdf, other]

    cs.LG stat.ML

    Representer Point Selection for Explaining Deep Neural Networks

    Authors: Chih-Kuan Yeh, Joon Sik Kim, Ian E. H. Yen, Pradeep Ravikumar

    Abstract: We propose to explain the predictions of a deep neural network, by pointing to the set of what we call representer points in the training set, for a given test point prediction. Specifically, we show that we can decompose the pre-activation prediction of a neural network into a linear combination of activations of training points, with the weights corresponding to what we call representer values,…

    Submitted 23 November, 2018; originally announced November 2018.

    Comments: NIPS 2018

  50. arXiv:1811.01713  [pdf, other]

    cs.CL cs.AI cs.LG stat.ML

    Word Mover's Embedding: From Word2Vec to Document Embedding

    Authors: Lingfei Wu, Ian E. H. Yen, Kun Xu, Fangli Xu, Avinash Balakrishnan, Pin-Yu Chen, Pradeep Ravikumar, Michael J. Witbrock

    Abstract: While the celebrated Word2Vec technique yields semantically rich representations for individual words, there has been relatively less success in extending it to generate unsupervised sentence or document embeddings. Recent work has demonstrated that a distance measure between documents called \emph{Word Mover's Distance} (WMD) that aligns semantically similar words, yields unprecedented KNN classif…

    Submitted 30 October, 2018; originally announced November 2018.

    Comments: EMNLP'18 Camera-Ready Version