
Showing 1–50 of 195 results for author: Zhang, K

Searching in archive stat. Search in all archives.
  1. arXiv:2407.00529  [pdf, other]

    cs.LG cs.SD eess.AS math.ST stat.ML

    Detecting and Identifying Selection Structure in Sequential Data

    Authors: Yujia Zheng, Zeyu Tang, Yiwen Qiu, Bernhard Schölkopf, Kun Zhang

    Abstract: We argue that the selective inclusion of data points based on latent objectives is common in practical situations, such as music sequences. Since this selection process often distorts statistical analysis, previous work primarily views it as a bias to be corrected and proposes various methods to mitigate its effect. However, while controlling this bias is crucial, selection also offers an opportun…

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: ICML 2024

  2. arXiv:2406.06838  [pdf, other]

    cs.LG cs.AI stat.ML

    Stable Minima Cannot Overfit in Univariate ReLU Networks: Generalization by Large Step Sizes

    Authors: Dan Qiao, Kaiqi Zhang, Esha Singh, Daniel Soudry, Yu-Xiang Wang

    Abstract: We study the generalization of two-layer ReLU neural networks in a univariate nonparametric regression problem with noisy labels. This is a problem where kernels (e.g. NTK) are provably sub-optimal and benign overfitting does not happen, thus disqualifying existing theory for interpolating (0-loss, global optimal) solutions. We present a new theory of generalization for local minima that gr…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 51 pages

  3. arXiv:2406.02191  [pdf, other]

    stat.ML cs.LG

    On the Recoverability of Causal Relations from Temporally Aggregated I.I.D. Data

    Authors: Shunxing Fan, Mingming Gong, Kun Zhang

    Abstract: We consider the effect of temporal aggregation on instantaneous (non-temporal) causal discovery in a general setting. This is motivated by the observation that the true causal time lag is often considerably shorter than the observational interval. This discrepancy leads to high aggregation, causing time-delay causality to vanish and instantaneous dependence to manifest. Although we expect such insta…

    Submitted 11 June, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

    Comments: ICML 2024

  4. arXiv:2406.00519  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning Discrete Concepts in Latent Hierarchical Models

    Authors: Lingjing Kong, Guangyi Chen, Biwei Huang, Eric P. Xing, Yuejie Chi, Kun Zhang

    Abstract: Learning concepts from natural high-dimensional data (e.g., images) holds potential in building human-aligned and interpretable machine learning models. Despite its encouraging prospect, formalization and theoretical insights into this crucial task are still lacking. In this work, we formalize concepts as discrete latent causal variables that are related via a hierarchical causal model that encode…

    Submitted 1 June, 2024; originally announced June 2024.

  5. arXiv:2405.19466  [pdf, other]

    cs.LG stat.ML

    Posterior Sampling via Autoregressive Generation

    Authors: Kelly W Zhang, Tiffany Cai, Hongseok Namkoong, Daniel Russo

    Abstract: Real-world decision-making requires grappling with a perpetual lack of data as environments change; intelligent agents must comprehend uncertainty and actively gather information to resolve it. We propose a new framework for learning bandit algorithms from massive historical data, which we demonstrate in a cold-start recommendation problem. First, we use historical data to pretrain an autoregressi…

    Submitted 29 May, 2024; originally announced May 2024.

  6. arXiv:2405.15325  [pdf, other]

    cs.LG stat.ML

    On the Identification of Temporally Causal Representation with Instantaneous Dependence

    Authors: Zijian Li, Yifan Shen, Kaitao Zheng, Ruichu Cai, Xiangchen Song, Mingming Gong, Zhengmao Zhu, Guangyi Chen, Kun Zhang

    Abstract: Temporally causal representation learning aims to identify the latent causal process from time series observations, but most methods require the assumption that the latent causal processes do not have instantaneous relations. Although some recent methods achieve identifiability in the instantaneous causality case, they require either interventions on the latent variables or grouping of the observa…

    Submitted 7 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

  7. arXiv:2405.05638  [pdf, ps, other]

    stat.ME cs.LG math.NA math.OC

    An Efficient Finite Difference Approximation via a Double Sample-Recycling Approach

    Authors: Guo Liang, Guangwu Liu, Kun Zhang

    Abstract: Estimating stochastic gradients is pivotal in fields like service systems within operations research. The classical method for this estimation is the finite difference approximation, which entails generating samples at perturbed inputs. Nonetheless, practical challenges persist in determining the perturbation and obtaining an optimal finite difference estimator in the sense of possessing the small…

    Submitted 9 May, 2024; originally announced May 2024.

  8. arXiv:2405.03664  [pdf, other]

    cs.LG stat.ML

    A New Robust Partial $p$-Wasserstein-Based Metric for Comparing Distributions

    Authors: Sharath Raghvendra, Pouyan Shirzadian, Kaiyi Zhang

    Abstract: The $2$-Wasserstein distance is sensitive to minor geometric differences between distributions, making it a very powerful dissimilarity metric. However, due to this sensitivity, a small outlier mass can also cause a significant increase in the $2$-Wasserstein distance between two similar distributions. Similarly, sampling discrepancy can cause the empirical $2$-Wasserstein distance on $n$ samples…

    Submitted 2 June, 2024; v1 submitted 6 May, 2024; originally announced May 2024.

  9. arXiv:2404.17644  [pdf, other]

    stat.ML cs.AI cs.LG

    A Conditional Independence Test in the Presence of Discretization

    Authors: Boyang Sun, Yu Yao, Huangyuan Hao, Yumou Qiu, Kun Zhang

    Abstract: Testing conditional independence has many applications, such as in Bayesian network learning and causal discovery. Different test methods have been proposed. However, existing methods generally cannot work when only discretized observations are available. Specifically, consider $X_1$, $\tilde{X}_2$ and $X_3$ are observed variables, where $\tilde{X}_2$ is a discretization of latent variables…

    Submitted 3 May, 2024; v1 submitted 26 April, 2024; originally announced April 2024.

  10. arXiv:2403.15711  [pdf, other]

    cs.LG stat.ME stat.ML

    Identifiable Latent Neural Causal Models

    Authors: Yuhang Liu, Zhen Zhang, Dong Gong, Mingming Gong, Biwei Huang, Anton van den Hengel, Kun Zhang, Javen Qinfeng Shi

    Abstract: Causal representation learning seeks to uncover latent, high-level causal representations from low-level observed data. It is particularly good at predictions under unseen distribution shifts, because these shifts can generally be interpreted as consequences of interventions. Hence leveraging seen distribution shifts becomes a natural strategy to help identifying causal representations, which in…

    Submitted 23 March, 2024; originally announced March 2024.

  11. arXiv:2403.10946  [pdf, other]

    stat.ML cs.LG

    The Fallacy of Minimizing Local Regret in the Sequential Task Setting

    Authors: Ziping Xu, Kelly W. Zhang, Susan A. Murphy

    Abstract: In the realm of Reinforcement Learning (RL), online RL is often conceptualized as an optimization problem, where an algorithm interacts with an unknown environment to minimize cumulative regret. In a stationary setting, strong theoretical guarantees, like a sublinear ($\sqrt{T}$) regret bound, can be obtained, which typically implies the convergence to an optimal policy and the cessation of explor…

    Submitted 16 March, 2024; originally announced March 2024.

  12. arXiv:2402.17070  [pdf, other]

    stat.ME math.ST

    Dempster-Shafer P-values: Thoughts on an Alternative Approach for Multinomial Inference

    Authors: Kentaro Hoffman, Kai Zhang, Tyler McCormick, Jan Hannig

    Abstract: In this paper, we demonstrate that a new measure of evidence we developed, called the Dempster-Shafer p-value, allows for insights and interpretations which retain most of the structure of the p-value while covering for some of the disadvantages that traditional p-values face. Moreover, we show through classical large-sample bounds and simulations that there exists a close connection between o…

    Submitted 26 February, 2024; originally announced February 2024.

  13. arXiv:2402.15602  [pdf, other]

    math.ST cs.LG stat.ML

    Minimax Optimality of Score-based Diffusion Models: Beyond the Density Lower Bound Assumptions

    Authors: Kaihong Zhang, Heqi Yin, Feng Liang, Jingbo Liu

    Abstract: We study the asymptotic error of score-based diffusion model sampling in large-sample scenarios from a non-parametric statistics perspective. We show that a kernel-based score estimator achieves an optimal mean square error of $\widetilde{O}\left(n^{-1} t^{-\frac{d+2}{2}}(t^{\frac{d}{2}} \vee 1)\right)$ for the score function of $p_0*\mathcal{N}(0,t\boldsymbol{I}_d)$, where $n$ and $d$ represent t…

    Submitted 23 February, 2024; originally announced February 2024.

  14. arXiv:2402.08018  [pdf, other]

    cs.LG cs.CV stat.ML

    Nearest Neighbour Score Estimators for Diffusion Generative Models

    Authors: Matthew Niedoba, Dylan Green, Saeid Naderiparizi, Vasileios Lioutas, Jonathan Wilder Lavington, Xiaoxuan Liang, Yunpeng Liu, Ke Zhang, Setareh Dabiri, Adam Ścibior, Berend Zwartsenberg, Frank Wood

    Abstract: Score function estimation is the cornerstone of both training and sampling from diffusion generative models. Despite this fact, the most commonly used estimators are either biased neural network approximations or high variance Monte Carlo estimators based on the conditional score. We introduce a novel nearest neighbour score function estimator which utilizes multiple samples from the training set…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: 25 pages, 9 figures

  15. arXiv:2402.06223  [pdf, other]

    cs.LG cs.CV stat.ML

    Revealing Multimodal Contrastive Representation Learning through Latent Partial Causal Models

    Authors: Yuhang Liu, Zhen Zhang, Dong Gong, Biwei Huang, Mingming Gong, Anton van den Hengel, Kun Zhang, Javen Qinfeng Shi

    Abstract: Multimodal contrastive representation learning methods have proven successful across a range of domains, partly due to their ability to generate meaningful shared representations of complex phenomena. To enhance the depth of analysis and understanding of these acquired representations, we introduce a unified causal model specifically designed for multimodal data. By examining this model, we show t…

    Submitted 9 February, 2024; originally announced February 2024.

  16. arXiv:2402.05052  [pdf, other]

    cs.LG stat.ML

    Causal Representation Learning from Multiple Distributions: A General Setting

    Authors: Kun Zhang, Shaoan Xie, Ignavier Ng, Yujia Zheng

    Abstract: In many problems, the measured variables (e.g., image pixels) are just mathematical functions of the hidden causal variables (e.g., the underlying concepts or objects). For the purpose of making predictions in changing environments or making proper changes to the system, it is helpful to recover the hidden causal variables $Z_i$ and their causal relations represented by graph $\mathcal{G}_Z$. This…

    Submitted 9 April, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

  17. arXiv:2402.03941  [pdf, other]

    cs.LG cs.AI stat.ME

    Discovery of the Hidden World with Large Language Models

    Authors: Chenxi Liu, Yongqiang Chen, Tongliang Liu, Mingming Gong, James Cheng, Bo Han, Kun Zhang

    Abstract: Science originates with discovering new causal knowledge from a combination of known facts and observations. Traditional causal discovery approaches mainly rely on high-quality measured variables, usually given by human experts, to find causal relations. However, the causal variables are usually unavailable in a wide range of real-world applications. The rise of large language models (LLMs) that a…

    Submitted 6 February, 2024; originally announced February 2024.

    Comments: Preliminary version of an ongoing project; Chenxi and Yongqiang contributed equally; 26 pages, 41 figures; Project page: https://causalcoat.github.io/

  18. arXiv:2402.01607  [pdf, other]

    cs.AI cs.CV cs.LG cs.NE stat.ME

    Natural Counterfactuals With Necessary Backtracking

    Authors: Guang-Yuan Hao, Jiji Zhang, Biwei Huang, Hao Wang, Kun Zhang

    Abstract: Counterfactual reasoning is pivotal in human cognition and especially important for providing explanations and making decisions. While Judea Pearl's influential approach is theoretically elegant, its generation of a counterfactual scenario often requires interventions that are too detached from the real scenarios to be feasible. In response, we propose a framework of natural counterfactuals and a…

    Submitted 20 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

  19. arXiv:2401.14535  [pdf, other]

    cs.LG cs.CV stat.ME

    CaRiNG: Learning Temporal Causal Representation under Non-Invertible Generation Process

    Authors: Guangyi Chen, Yifan Shen, Zhenhao Chen, Xiangchen Song, Yuewen Sun, Weiran Yao, Xiao Liu, Kun Zhang

    Abstract: Identifying the underlying time-delayed latent causal processes in sequential data is vital for grasping temporal dynamics and making downstream reasoning. While some recent methods can robustly identify these latent causal variables, they rely on strict assumptions about the invertible generation process from latent variables to observed data. However, these assumptions are often hard to satisfy…

    Submitted 30 May, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: To appear at ICML 2024, 24 pages

  20. arXiv:2401.09641  [pdf, ps, other]

    cs.LG math.ST q-bio.NC stat.ME

    Functional Linear Non-Gaussian Acyclic Model for Causal Discovery

    Authors: Tian-Le Yang, Kuang-Yao Lee, Kun Zhang, Joe Suzuki

    Abstract: In causal discovery, non-Gaussianity has been used to characterize the complete configuration of a Linear Non-Gaussian Acyclic Model (LiNGAM), encompassing both the causal ordering of variables and their respective connection strengths. However, LiNGAM can only deal with the finite-dimensional case. To expand this concept, we extend the notion of variables to encompass vectors and even functions,…

    Submitted 17 January, 2024; originally announced January 2024.

  21. arXiv:2401.05414  [pdf, other]

    q-fin.ST cs.LG stat.ME

    On the Three Demons in Causality in Finance: Time Resolution, Nonstationarity, and Latent Factors

    Authors: Xinshuai Dong, Haoyue Dai, Yewen Fan, Songyao Jin, Sathyamoorthy Rajendran, Kun Zhang

    Abstract: Financial data is generally time series in essence and thus suffers from three fundamental issues: the mismatch in time resolution, the time-varying property of the distribution - nonstationarity, and causal factors that are important but unknown/unobserved. In this paper, we follow a causal perspective to systematically look into these three demons in finance. Specifically, we reexamine these iss…

    Submitted 12 January, 2024; v1 submitted 28 December, 2023; originally announced January 2024.

  22. arXiv:2312.11934  [pdf, other]

    cs.LG cs.AI stat.ME

    Identification of Causal Structure with Latent Variables Based on Higher Order Cumulants

    Authors: Wei Chen, Zhiyi Huang, Ruichu Cai, Zhifeng Hao, Kun Zhang

    Abstract: Causal discovery with latent variables is a crucial but challenging task. Despite the emergence of numerous methods aimed at addressing this challenge, they are not fully identified to the structure that two observed variables are influenced by one latent variable and there might be a directed edge in between. Interestingly, we notice that this structure can be identified through the utilization o…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI 2024

  23. arXiv:2312.11001  [pdf, other]

    cs.LG stat.ME

    A Versatile Causal Discovery Framework to Allow Causally-Related Hidden Variables

    Authors: Xinshuai Dong, Biwei Huang, Ignavier Ng, Xiangchen Song, Yujia Zheng, Songyao Jin, Roberto Legaspi, Peter Spirtes, Kun Zhang

    Abstract: Most existing causal discovery methods rely on the assumption of no latent confounders, limiting their applicability in solving real-life problems. In this paper, we introduce a novel, versatile framework for causal discovery that accommodates the presence of causally-related hidden variables almost everywhere in the causal network (for instance, they can be effects of observed variables), based o…

    Submitted 18 December, 2023; originally announced December 2023.

  24. arXiv:2311.00866  [pdf, other]

    cs.LG eess.SP stat.ML

    Generalizing Nonlinear ICA Beyond Structural Sparsity

    Authors: Yujia Zheng, Kun Zhang

    Abstract: Nonlinear independent component analysis (ICA) aims to uncover the true latent sources from their observable nonlinear mixtures. Despite its significance, the identifiability of nonlinear ICA is known to be impossible without additional assumptions. Recent advances have proposed conditions on the connective structure from sources to observed variables, known as Structural Sparsity, to achieve iden…

    Submitted 1 November, 2023; originally announced November 2023.

  25. arXiv:2310.18615  [pdf, other]

    cs.LG stat.ML

    Temporally Disentangled Representation Learning under Unknown Nonstationarity

    Authors: Xiangchen Song, Weiran Yao, Yewen Fan, Xinshuai Dong, Guangyi Chen, Juan Carlos Niebles, Eric Xing, Kun Zhang

    Abstract: In unsupervised causal representation learning for sequential data with time-delayed latent causal influences, strong identifiability results for the disentanglement of causally-related latent variables have been established in stationary settings by leveraging temporal structure. However, in the nonstationary setting, existing work only partially addressed the problem by either utilizing observed aux…

    Submitted 28 October, 2023; originally announced October 2023.

    Comments: NeurIPS 2023

  26. arXiv:2310.04723  [pdf, other]

    cs.LG stat.ML

    Subspace Identification for Multi-Source Domain Adaptation

    Authors: Zijian Li, Ruichu Cai, Guangyi Chen, Boyang Sun, Zhifeng Hao, Kun Zhang

    Abstract: Multi-source domain adaptation (MSDA) methods aim to transfer knowledge from multiple labeled source domains to an unlabeled target domain. Although current methods achieve target joint distribution identifiability by enforcing minimal changes across domains, they often necessitate stringent conditions, such as an adequate number of domains, monotonic transformation of latent variables, and invari…

    Submitted 14 December, 2023; v1 submitted 7 October, 2023; originally announced October 2023.

    Comments: NeurIPS2023 Spotlight

  27. arXiv:2308.08148  [pdf, other]

    cs.LG stat.ME

    Hierarchical Topological Ordering with Conditional Independence Test for Limited Time Series

    Authors: Anpeng Wu, Haoxuan Li, Kun Kuang, Keli Zhang, Fei Wu

    Abstract: Learning directed acyclic graphs (DAGs) to identify causal relations underlying observational data is crucial but also poses significant challenges. Recently, topology-based methods have emerged as a two-step approach to discovering DAGs by first learning the topological ordering of variables and then eliminating redundant edges, while ensuring that the graph remains acyclic. However, one limitati…

    Submitted 16 August, 2023; originally announced August 2023.

  28. arXiv:2308.06718  [pdf, other]

    cs.LG cs.AI stat.ME

    Generalized Independent Noise Condition for Estimating Causal Structure with Latent Variables

    Authors: Feng Xie, Biwei Huang, Zhengming Chen, Ruichu Cai, Clark Glymour, Zhi Geng, Kun Zhang

    Abstract: We investigate the task of learning causal structure in the presence of latent variables, including locating latent variables and determining their quantity, and identifying causal relationships among both latent and observed variables. To this end, we propose a Generalized Independent Noise (GIN) condition for linear non-Gaussian acyclic causal models that incorporate latent variables, which esta…

    Submitted 9 June, 2024; v1 submitted 13 August, 2023; originally announced August 2023.

  29. arXiv:2308.04428  [pdf, other]

    stat.ML cs.LG eess.SY

    Meta-Learning Operators to Optimality from Multi-Task Non-IID Data

    Authors: Thomas T. C. K. Zhang, Leonardo F. Toso, James Anderson, Nikolai Matni

    Abstract: A powerful concept behind much of the recent progress in machine learning is the extraction of common features across data from heterogeneous sources or tasks. Intuitively, using all of one's data to learn a common representation function benefits both computational effort and statistical generalization by leaving a smaller number of parameters to fine-tune on a given task. Toward theoretically gr…

    Submitted 8 August, 2023; originally announced August 2023.

  30. arXiv:2307.16405  [pdf, other]

    cs.LG stat.ME stat.ML

    Causal-learn: Causal Discovery in Python

    Authors: Yujia Zheng, Biwei Huang, Wei Chen, Joseph Ramsey, Mingming Gong, Ruichu Cai, Shohei Shimizu, Peter Spirtes, Kun Zhang

    Abstract: Causal discovery aims at revealing causal relations from observational data, which is a fundamental task in science and engineering. We describe causal-learn, an open-source Python library for causal discovery. This library focuses on bringing a comprehensive collection of causal discovery methods to both practitioners and researchers. It provides easy-to-use APIs for non-specialists, m…

    Submitted 31 July, 2023; originally announced July 2023.

    Journal ref: Journal of Machine Learning Research 25 (2024)

  31. arXiv:2307.06457  [pdf, other]

    cs.LG cs.DS stat.ML

    Tackling Combinatorial Distribution Shift: A Matrix Completion Perspective

    Authors: Max Simchowitz, Abhishek Gupta, Kaiqing Zhang

    Abstract: Obtaining rigorous statistical guarantees for generalization under distribution shift remains an open and active research area. We study a setting we call combinatorial distribution shift, where (a) under the test- and training-distributions, the labels $z$ are determined by pairs of features $(x,y)$, (b) the training distribution has coverage of certain marginal distributions over $x$ and $y$ sep…

    Submitted 28 July, 2023; v1 submitted 12 July, 2023; originally announced July 2023.

    Comments: The 36th Annual Conference on Learning Theory (COLT 2023)

  32. arXiv:2306.17052  [pdf, other]

    cs.LG cs.AI cs.MA stat.ML

    Safe Model-Based Multi-Agent Mean-Field Reinforcement Learning

    Authors: Matej Jusup, Barna Pásztor, Tadeusz Janik, Kenan Zhang, Francesco Corman, Andreas Krause, Ilija Bogunovic

    Abstract: Many applications, e.g., in shared mobility, require coordinating a large number of agents. Mean-field reinforcement learning addresses the resulting scalability challenge by optimizing the policy of a representative agent interacting with the infinite population of identical agents instead of considering individual pairwise interactions. In this paper, we address an important generalization where…

    Submitted 27 December, 2023; v1 submitted 29 June, 2023; originally announced June 2023.

    Comments: 23 pages, 26 figures, 6 tables

  33. arXiv:2306.10125  [pdf, other]

    cs.LG cs.AI eess.SP stat.AP

    Self-Supervised Learning for Time Series Analysis: Taxonomy, Progress, and Prospects

    Authors: Kexin Zhang, Qingsong Wen, Chaoli Zhang, Rongyao Cai, Ming Jin, Yong Liu, James Zhang, Yuxuan Liang, Guansong Pang, Dongjin Song, Shirui Pan

    Abstract: Self-supervised learning (SSL) has recently achieved impressive performance on various time series tasks. The most prominent advantage of SSL is that it reduces the dependence on labeled data. Based on the pre-training and fine-tuning strategy, even a small amount of labeled data can achieve high performance. Compared with many published self-supervised surveys on computer vision and natural langu…

    Submitted 8 April, 2024; v1 submitted 16 June, 2023; originally announced June 2023.

    Comments: Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI); 26 pages, 200+ references; the first work to comprehensively and systematically summarize self-supervised learning for time series analysis (SSL4TS). The GitHub repository is https://github.com/qingsongedu/Awesome-SSL4TS

  34. arXiv:2306.07916  [pdf, other]

    cs.LG cs.AI stat.ML

    Identification of Nonlinear Latent Hierarchical Models

    Authors: Lingjing Kong, Biwei Huang, Feng Xie, Eric Xing, Yuejie Chi, Kun Zhang

    Abstract: Identifying latent variables and causal structures from observational data is essential to many real-world applications involving biological data, medical data, and unstructured data such as images and languages. However, this task can be highly challenging, especially when observed variables are generated by causally related latent variables and the relationships are nonlinear. In this work, we i…

    Submitted 31 October, 2023; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023

  35. arXiv:2306.06510  [pdf, other]

    cs.LG stat.ML

    Partial Identifiability for Domain Adaptation

    Authors: Lingjing Kong, Shaoan Xie, Weiran Yao, Yujia Zheng, Guangyi Chen, Petar Stojanov, Victor Akinwande, Kun Zhang

    Abstract: Unsupervised domain adaptation is critical to many real-world applications where label information is unavailable in the target domain. In general, without further assumptions, the joint distribution of the features and the label is not identifiable in the target domain. To address this issue, we rely on the property of minimal changes of causal mechanisms across domains to minimize unnecessary in…

    Submitted 10 June, 2023; originally announced June 2023.

    Comments: ICML 2022

  36. arXiv:2306.05751  [pdf, other]

    cs.LG stat.ME

    Advancing Counterfactual Inference through Nonlinear Quantile Regression

    Authors: Shaoan Xie, Biwei Huang, Bin Gu, Tongliang Liu, Kun Zhang

    Abstract: The capacity to address counterfactual "what if" inquiries is crucial for understanding and making use of causal influences. Traditional counterfactual inference, under Pearl's counterfactual framework, typically depends on having access to or estimating a structural causal model. Yet, in practice, this causal model is often unknown and might be challenging to identify. Hence, this paper aims to p…

    Submitted 27 February, 2024; v1 submitted 9 June, 2023; originally announced June 2023.

  37. arXiv:2305.19582  [pdf, ps, other]

    cs.LG cs.AI stat.ME

    Causal Discovery with Latent Confounders Based on Higher-Order Cumulants

    Authors: Ruichu Cai, Zhiyi Huang, Wei Chen, Zhifeng Hao, Kun Zhang

    Abstract: Causal discovery with latent confounders is an important but challenging task in many scientific areas. Despite the success of some overcomplete independent component analysis (OICA) based methods in certain domains, they are computationally expensive and can easily get stuck into local optima. We notice that interestingly, by making use of higher-order cumulants, there exists a closed-form soluti…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: Accepted by ICML 2023

  38. arXiv:2305.18410  [pdf, other]

    cs.LG cs.CL q-bio.GN stat.ME

    Understanding Breast Cancer Survival: Using Causality and Language Models on Multi-omics Data

    Authors: Mugariya Farooq, Shahad Hardan, Aigerim Zhumbhayeva, Yujia Zheng, Preslav Nakov, Kun Zhang

    Abstract: The need for more usable and explainable machine learning models in healthcare increases the importance of developing and utilizing causal discovery algorithms, which aim to discover causal relations by analyzing observational data. Explainable approaches aid clinicians and biologists in predicting the prognosis of diseases and suggesting proper treatments. However, very little research has been c…

    Submitted 28 May, 2023; originally announced May 2023.

  39. arXiv:2305.11379  [pdf, other]

    cs.LG stat.ML

    Generalized Precision Matrix for Scalable Estimation of Nonparametric Markov Networks

    Authors: Yujia Zheng, Ignavier Ng, Yewen Fan, Kun Zhang

    Abstract: A Markov network characterizes the conditional independence structure, or Markov property, among a set of random variables. Existing work focuses on specific families of distributions (e.g., exponential families) and/or certain structures of graphs, and most of them can only handle variables of a single data type (continuous or discrete). In this work, we characterize the conditional independence…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: ICLR 2023

  40. arXiv:2305.05986  [pdf, other]

    cs.LG cs.AI stat.ME

    Structural Hawkes Processes for Learning Causal Structure from Discrete-Time Event Sequences

    Authors: Jie Qiao, Ruichu Cai, Siyu Wu, Yu Xiang, Keli Zhang, Zhifeng Hao

    Abstract: Learning causal structure among event types from discrete-time event sequences is a particularly important but challenging task. Existing methods, such as the multivariate Hawkes processes based methods, mostly boil down to learning the so-called Granger causality which assumes that the cause event happens strictly prior to its effect event. Such an assumption is often untenable beyond application…

    Submitted 10 May, 2023; originally announced May 2023.

    Comments: Accepted by IJCAI 2023

  41. arXiv:2305.01667  [pdf, other]

    cs.LG cs.CV stat.AP stat.CO

    Predict NAS Multi-Task by Stacking Ensemble Models using GP-NAS

    Authors: Ke Zhang

    Abstract: Accurately predicting the performance of architecture with small sample training is an important but not easy task. How to analyze and train on the dataset to overcome overfitting is the core problem we should deal with. Meanwhile, if there is a multi-task problem, we should also think about whether we can take advantage of the tasks' correlation and estimate as fast as we can. In this track, Super Network builds…

    Submitted 2 May, 2023; originally announced May 2023.

    Comments: Ranked 1st in CVPR 2022 Track 2 Challenge, GP-NAS, Stacking Model, Ensemble Model

  42. arXiv:2304.05365  [pdf, other]

    cs.LG stat.AP stat.ME stat.ML

    Did we personalize? Assessing personalization by an online reinforcement learning algorithm using resampling

    Authors: Susobhan Ghosh, Raphael Kim, Prasidh Chhabria, Raaz Dwivedi, Predrag Klasnja, Peng Liao, Kelly Zhang, Susan Murphy

    Abstract: There is a growing interest in using reinforcement learning (RL) to personalize sequences of treatments in digital health to support users in adopting healthier behaviors. Such sequential decision-making problems involve decisions about when to treat and how to treat based on the user's context (e.g., prior activity level, location, etc.). Online RL is a promising data-driven approach for this pro…

    Submitted 7 August, 2023; v1 submitted 11 April, 2023; originally announced April 2023.

    Comments: The first two authors contributed equally

  43. arXiv:2304.03382  [pdf, other]

    cs.LG stat.ML

    Scalable Causal Discovery with Score Matching

    Authors: Francesco Montagna, Nicoletta Noceti, Lorenzo Rosasco, Kun Zhang, Francesco Locatello

    Abstract: This paper demonstrates how to discover the whole causal graph from the second derivative of the log-likelihood in observational non-linear additive Gaussian noise models. Leveraging scalable machine learning approaches to approximate the score function $\nabla \log p(\mathbf{X})$, we extend the work of Rolland et al. (2022) that only recovers the topological order from the score and requires an e…

    Submitted 6 April, 2023; originally announced April 2023.

    Journal ref: 2nd Conference on Causal Learning and Reasoning (CLeaR 2023)

  44. arXiv:2304.03265  [pdf, other]

    cs.LG stat.ME

    Causal Discovery with Score Matching on Additive Models with Arbitrary Noise

    Authors: Francesco Montagna, Nicoletta Noceti, Lorenzo Rosasco, Kun Zhang, Francesco Locatello

    Abstract: Causal discovery methods are intrinsically constrained by the set of assumptions needed to ensure structure identifiability. Moreover additional restrictions are often imposed in order to simplify the inference task: this is the case for the Gaussian noise assumption on additive non-linear models, which is common to many causal discovery approaches. In this paper we show the shortcomings of infere…

    Submitted 6 April, 2023; originally announced April 2023.

    Journal ref: 2nd Conference on Causal Learning and Reasoning (CLeaR 2023)

  45. arXiv:2304.02146  [pdf, other]

    cs.LG stat.ML

    Structure Learning with Continuous Optimization: A Sober Look and Beyond

    Authors: Ignavier Ng, Biwei Huang, Kun Zhang

    Abstract: This paper investigates in which cases continuous optimization for directed acyclic graph (DAG) structure learning can and cannot perform well and why this happens, and suggests possible directions to make the search procedure more reliable. Reisach et al. (2021) suggested that the remarkable performance of several continuous structure learning approaches is primarily driven by a high agreement be…

    Submitted 4 April, 2023; originally announced April 2023.

  46. arXiv:2302.03673  [pdf, ps, other]

    cs.LG cs.AI cs.GT cs.MA stat.ML

    Breaking the Curse of Multiagents in a Large State Space: RL in Markov Games with Independent Linear Function Approximation

    Authors: Qiwen Cui, Kaiqing Zhang, Simon S. Du

    Abstract: We propose a new model, independent linear Markov game, for multi-agent reinforcement learning with a large state space and a large number of agents. This is a class of Markov games with independent linear function approximation, where each agent has its own function approximation for the state-action value functions that are marginalized by other players' policies. We design new algorithms for le…

    Submitted 21 June, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

    Comments: 51 pages. Update: Accepted for presentation at the Conference on Learning Theory (COLT) 2023

  47. arXiv:2301.08987  [pdf, ps, other]

    cs.LG stat.ML

    Tier Balancing: Towards Dynamic Fairness over Underlying Causal Factors

    Authors: Zeyu Tang, Yatong Chen, Yang Liu, Kun Zhang

    Abstract: The pursuit of long-term fairness involves the interplay between decision-making and the underlying data generating process. In this paper, through causal modeling with a directed acyclic graph (DAG) on the decision-distribution interplay, we investigate the possibility of achieving long-term fairness from a dynamic perspective. We propose Tier Balancing, a technically more challenging but more na…

    Submitted 6 June, 2023; v1 submitted 21 January, 2023; originally announced January 2023.

    Journal ref: The 11th International Conference on Learning Representations (ICLR 2023)

  48. arXiv:2212.14511  [pdf, other]

    cs.LG eess.SY math.OC stat.ML

    Can Direct Latent Model Learning Solve Linear Quadratic Gaussian Control?

    Authors: Yi Tian, Kaiqing Zhang, Russ Tedrake, Suvrit Sra

    Abstract: We study the task of learning state representations from potentially high-dimensional observations, with the goal of controlling an unknown partially observable system. We pursue a direct latent model learning approach, where a dynamic model in some latent state space is learned by predicting quantities directly related to planning (e.g., costs) without reconstructing the observations. In particul…

    Submitted 13 March, 2024; v1 submitted 29 December, 2022; originally announced December 2022.

    Comments: 37 pages; Updated structure and proofs

  49. arXiv:2212.13861  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Revisiting the Linear-Programming Framework for Offline RL with General Function Approximation

    Authors: Asuman Ozdaglar, Sarath Pattathil, Jiawei Zhang, Kaiqing Zhang

    Abstract: Offline reinforcement learning (RL) aims to find an optimal policy for sequential decision-making using a pre-collected dataset, without further interaction with the environment. Recent theoretical progress has focused on developing sample-efficient offline RL algorithms with various relaxed assumptions on data coverage and function approximators, especially to handle the case with excessively lar…

    Submitted 8 February, 2023; v1 submitted 28 December, 2022; originally announced December 2022.

    Comments: 35 pages

  50. arXiv:2212.12658  [pdf, other]

    cs.LG stat.ML

    Improving Uncertainty Quantification of Variance Networks by Tree-Structured Learning

    Authors: Wenxuan Ma, Xing Yan, Kun Zhang

    Abstract: To improve the uncertainty quantification of variance networks, we propose a novel tree-structured local neural network model that partitions the feature space into multiple regions based on uncertainty heterogeneity. A tree is built upon giving the training data, whose leaf nodes represent different regions where region-specific neural networks are trained to predict both the mean and the varianc…

    Submitted 19 July, 2023; v1 submitted 24 December, 2022; originally announced December 2022.