-
Robust inference with GhostKnockoffs in genome-wide association studies
Authors:
Xinran Qi,
Michael E. Belloy,
Jiaqi Gu,
Xiaoxia Liu,
Hua Tang,
Zihuai He
Abstract:
Genome-wide association studies (GWASs) have been extensively adopted to depict the underlying genetic architecture of complex diseases. Motivated by GWASs' limitations in identifying small-effect loci to understand complex traits' polygenicity and in fine-mapping putative causal variants from proxy ones, we propose a knockoff-based method which only requires summary statistics from GWASs and demonstrate its validity in the presence of relatedness. We show that GhostKnockoffs inference is robust to its input Z-scores as long as they are from valid marginal association tests and their correlations are consistent with the correlations among the corresponding genetic variants. This property generalizes GhostKnockoffs to other GWAS settings, such as the meta-analysis of multiple overlapping studies and studies based on association test statistics that deviate from score tests. We demonstrate GhostKnockoffs' performance using empirical simulation and a meta-analysis of nine European ancestral genome-wide association studies and whole exome/genome sequencing studies. Both results demonstrate that GhostKnockoffs identify more putative causal variants with weak genotype-phenotype associations that are missed by conventional GWASs.
Submitted 6 October, 2023;
originally announced October 2023.
-
Understanding Optimization of Deep Learning via Jacobian Matrix and Lipschitz Constant
Authors:
Xianbiao Qi,
Jianan Wang,
Lei Zhang
Abstract:
This article provides a comprehensive understanding of optimization in deep learning, with a primary focus on the challenges of gradient vanishing and gradient exploding, which normally lead to diminished model representational ability and training instability, respectively. We analyze these two challenges through several strategic measures, including the improvement of gradient flow and the imposition of constraints on a network's Lipschitz constant. To help understand the current optimization methodologies, we categorize them into two classes: explicit optimization and implicit optimization. Explicit optimization methods involve direct manipulation of optimizer parameters, including weight, gradient, learning rate, and weight decay. Implicit optimization methods, by contrast, focus on improving the overall landscape of a network by enhancing its modules, such as residual shortcuts, normalization methods, attention mechanisms, and activations. In this article, we provide an in-depth analysis of these two optimization classes and undertake a thorough examination of the Jacobian matrices and the Lipschitz constants of many widely used deep learning modules, highlighting existing issues as well as potential improvements. Moreover, we conduct a series of analytical experiments to substantiate our theoretical discussions. This article does not aim to propose a new optimizer or network. Rather, our intention is to present a comprehensive understanding of optimization in deep learning. We hope that this article will assist readers in gaining deeper insight into this field and encourage the development of more robust, efficient, and high-performing models.
Submitted 12 November, 2023; v1 submitted 15 June, 2023;
originally announced June 2023.
-
A Review: Random Walk in Graph Sampling
Authors:
Xiao Qi
Abstract:
Graph sampling is a technique for picking a subset of vertices and/or edges from an original graph. Among various graph sampling approaches, Traversal Based Sampling (TBS) methods are widely used due to their low cost and feasibility in many cases, and Simple Random Walk (SRW) and its variants account for a large proportion of TBS. We illustrate the foundations of SRW and present its problems. Based on these problems, we provide a taxonomy of different Random Walk (RW) based graph sampling methods and give insight into why and how they revise SRW. Our summary includes classical methods and state-of-the-art RW-based methods. There are three ways to propose new algorithms based on SRW: SRW and its combinations, modified selection mechanisms, and graph topology modification. We explain the ideas behind those algorithms and present detailed pseudocode. In addition, we add the mathematics behind random walk and the essence of random walk variants, which are not discussed in detail in many research papers and literature reviews. Apart from RW-based methods, SRW is also related to non-RW and non-TBS methods; we discuss the relationships between SRW and non-RW methods, and between SRW and non-TBS methods. The relations between these approaches are formally argued, and a general framework to bridge theoretical analysis and practical implementation is provided.
Submitted 26 September, 2022;
originally announced September 2022.
-
Efficient Random Walk based Sampling with Inverse Degree
Authors:
Xiao Qi
Abstract:
Random walk sampling methods have been widely used in graph sampling in recent years, but they are biased towards higher-degree nodes in the sample. To overcome this deficiency, classical methods such as MHRW design weighted walks that repeat low-degree nodes while rejecting high-degree nodes, so that the long-term behavior of the Markov chain can achieve a uniform distribution. This modification, however, may make the sampler stay at the same node several times, leading to undersampling. To address this issue, we propose a sampling framework that needs only the degrees of the current and candidate nodes to improve the performance of graph sampling methods. We also extend our original idea to a more general framework. Our extended IDRW method finds a balance between the large deviation problem of SRW and the sample rejection problem of MHRW. We evaluate our technique in simulation by running extensive experiments on various real-world datasets, and the results show that our method improves accuracy compared with state-of-the-art techniques. We also investigate the effect of the parameter and give a suggested range for better use in applications.
Submitted 26 September, 2022;
originally announced September 2022.
-
Weighted Jump in Random Walk Graph Sampling
Authors:
Xiao Qi
Abstract:
Random walk based sampling methods have been widely used in graph sampling in recent years, but they are biased towards higher-degree nodes in the sample. To overcome this deficiency, classical methods such as GMD modify the topology of the target graph so that the long-term behavior of the Markov chain can achieve a uniform distribution. This modification, however, reduces the conductance of the graph and thus makes the sampler stay at the same node for a long time, resulting in undersampling. To address this issue, we propose a new way of modifying the target graph, which yields the Weighted Jump Random Walk (WJRW) with parameter C, to improve performance. We prove that WJRW can unify Simple Random Walk and uniform distribution through C, and we also conduct extensive experiments on real-world data. The experimental results show that WJRW can improve accuracy significantly under the same budget. We also investigate the effect of the parameter C and give a suggested range for better use in applications.
Submitted 26 September, 2022;
originally announced September 2022.
-
Sampling Online Social Networks: Metropolis Hastings Random Walk and Random Walk
Authors:
Xiao Qi
Abstract:
Social network analysis (SNA) has drawn much attention in recent years, but one bottleneck of SNA is that network data are too massive to handle. Furthermore, some network data are not accessible due to privacy problems. Therefore, we have to develop sampling methods to draw representative sample graphs from the population graph. In this paper, Metropolis-Hastings Random Walk (MHRW) and Random Walk with Jumps (RWwJ) sampling strategies are introduced, including the procedure of collecting nodes, the underlying mathematical theory, and corresponding estimators. We compare our methods with existing research outcomes and find that MHRW performs better when estimating degree distribution (61% less error than RWwJ) and graph order (0.69% less error than RWwJ), while RWwJ gives better estimates of the average follower and following ratio and of the proportion of mutual relationships among adjacent relationships, with 13% and 6% less error than MHRW, respectively. We analyze the reasons for the outcomes and give possible future work directions.
Submitted 12 May, 2022;
originally announced May 2022.
-
Bayesian Knockoff Generators for Robust Inference Under Complex Data Structure
Authors:
Michael J. Martens,
Anjishnu Banerjee,
Xinran Qi,
Yushu Shi
Abstract:
The recent proliferation of medical data, such as genetics and electronic health records (EHR), offers new opportunities to find novel predictors of health outcomes. Presented with a large set of candidate features, interest often lies in selecting the ones most likely to be predictive of an outcome for further study, such that the goal is to control the false discovery rate (FDR) at a specified level. Knockoff filtering is an innovative strategy for FDR-controlled feature selection. However, existing knockoff methods make strong distributional assumptions that hinder their applicability to real-world data. We propose Bayesian models for generating high-quality knockoff copies that utilize available knowledge about the data structure, thus improving the resolution of prognostic features. Applications to two feature sets are considered: those with categorical and/or continuous variables possibly having a population substructure, such as in EHR; and those with microbiome features having a compositional constraint and phylogenetic relatedness. Through simulations and real data applications, these methods are shown to identify important features with good FDR control and power.
Submitted 12 November, 2021;
originally announced November 2021.
-
Kernel regression for cause-specific hazard models with time-dependent coefficients
Authors:
Xiaomeng Qi,
Zhangsheng Yu
Abstract:
Competing risk data appear widely in modern biomedical research. Cause-specific hazard models have often been used to deal with competing risk data over the past two decades. There is currently no study on the kernel likelihood method for the cause-specific hazard model with time-varying coefficients. We propose to use the local partial log-likelihood approach for nonparametric time-varying coefficient estimation. Simulation studies demonstrate that our proposed nonparametric kernel estimator performs well under the assumed finite-sample settings. Finally, we apply the proposed method to analyze a diabetes dialysis study with competing death causes.
Submitted 11 September, 2021; v1 submitted 23 July, 2021;
originally announced July 2021.
-
Meta-analysis of Censored Adverse Events
Authors:
Xinyue Qi,
Shouhao Zhou,
Christine B. Peterson,
Yucai Wang,
Xinying Fang,
Michael L. Wang,
Chan Shen
Abstract:
Meta-analysis is a powerful tool for assessing drug safety by combining treatment-related toxicological findings across multiple studies, as clinical trials are typically underpowered for detecting adverse drug effects. However, incomplete reporting of adverse events (AEs) in published clinical studies is a frequent issue, especially if the observed number of AEs is below a pre-specified study-dependent threshold. Ignoring the censored AE information, often found in lower frequency, can significantly bias the estimated incidence rate of AEs. Despite its importance, this common meta-analysis problem has received little statistical or analytic attention in the literature. To address this challenge, we propose a Bayesian approach to accommodating the censored and possibly rare AEs for meta-analysis of safety data. Through simulation studies, we demonstrate that the proposed method improves accuracy in point and interval estimation of incidence probabilities, particularly in the presence of censored data. Overall, the proposed method provides a practical solution that can facilitate better-informed decisions regarding drug safety.
Submitted 8 February, 2024; v1 submitted 19 January, 2021;
originally announced January 2021.
-
A Note on Bayesian Modeling Specification of Censored Data in JAGS
Authors:
Xinyue Qi,
Shouhao Zhou,
Martyn Plummer
Abstract:
Just Another Gibbs Sampler (JAGS) is a convenient tool for drawing posterior samples using Markov Chain Monte Carlo for Bayesian modeling. However, the built-in function dinterval() for modeling censored data misspecifies the computation of the deviance function, which may limit its use for likelihood-based model comparison. To establish an automatic approach to specifying the correct deviance function in JAGS, we propose a simple alternative modeling strategy to implement Bayesian model selection for the analysis of censored outcomes. The proposed approach is applicable to a broad spectrum of data types, which include survival data and many other right-, left- and interval-censored Bayesian model structures.
Submitted 3 December, 2020;
originally announced December 2020.
-
Gödel's Sentence Is An Adversarial Example But Unsolvable
Authors:
Xiaodong Qi,
Lansheng Han
Abstract:
In recent years, different types of adversarial examples from different fields have kept emerging, including purely natural ones without perturbations. A variety of defenses are proposed and then broken quickly. Two fundamental questions need to be asked: What is the reason for the existence of adversarial examples, and are adversarial examples unsolvable? In this paper, we show that the reason for the existence of adversarial examples is that there are non-isomorphic natural explanations that can all explain the data set. Specifically, for the two natural explanations of being true and being provable, Gödel's sentence is an adversarial example that cannot be eliminated. It cannot be solved by re-accumulating the data set or re-improving the learning algorithm. Finally, from the perspective of computability, we prove the incomputability of adversarial examples, which are unrecognizable.
Submitted 25 February, 2020;
originally announced February 2020.
-
Mode Collapse and Regularity of Optimal Transportation Maps
Authors:
Na Lei,
Yang Guo,
Dongsheng An,
Xin Qi,
Zhongxuan Luo,
Shing-Tung Yau,
Xianfeng Gu
Abstract:
This work builds the connection between the regularity theory of optimal transportation maps, the Monge-Ampère equation, and GANs, which gives a theoretical understanding of the major drawbacks of GANs: convergence difficulty and mode collapse.
According to the regularity theory of the Monge-Ampère equation, if the support of the target measure is disconnected or just non-convex, the optimal transportation mapping is discontinuous. General DNNs can only approximate continuous mappings. This intrinsic conflict leads to the convergence difficulty and mode collapse in GANs.
Using an Autoencoder combined with a discrete optimal transportation map (AE-OT framework) on the CelebA data set, we test our hypothesis that the supports of real data distributions are in general non-convex and that the discontinuity is therefore unavoidable. The testing result is positive. Furthermore, we propose to approximate the continuous Brenier potential directly based on discrete Brenier theory to tackle mode collapse. Compared with existing methods, this method is more accurate and effective.
Submitted 7 February, 2019;
originally announced February 2019.
-
Deep Reinforcement Learning for Imbalanced Classification
Authors:
Enlu Lin,
Qiong Chen,
Xiaoming Qi
Abstract:
Data in real-world applications often exhibit a skewed class distribution, which poses an intense challenge for machine learning. Conventional classification algorithms are not effective in the case of imbalanced data distribution, and may fail when the data distribution is highly imbalanced. To address this issue, we propose a general imbalanced classification model based on deep reinforcement learning. We formulate the classification problem as a sequential decision-making process and solve it with a deep Q-learning network. The agent performs a classification action on one sample at each time step, and the environment evaluates the classification action and returns a reward to the agent. The reward from a minority-class sample is larger, so the agent is more sensitive to the minority class. The agent finally finds an optimal classification policy in imbalanced data under the guidance of a specific reward function and a beneficial learning environment. Experiments show that our proposed model outperforms the other imbalanced classification algorithms, identifying more minority samples while achieving strong classification performance.
Submitted 5 January, 2019;
originally announced January 2019.
-
Sparse Fisher's discriminant analysis with thresholded linear constraints
Authors:
Ruiyan Luo,
Xin Qi
Abstract:
Various regularized linear discriminant analysis (LDA) methods have been proposed to address the problems of the classic methods in high-dimensional settings. Asymptotic optimality has been established for some of these methods in high dimensions when there are only two classes. A major difficulty in proving asymptotic optimality for multiclass classification is that the classification boundary is typically complicated and no explicit formula for classification error generally exists when the number of classes is greater than two. For Fisher's LDA, one additional difficulty is that the covariance matrix is also involved in the linear constraints. The main purpose of this paper is to establish asymptotic consistency and asymptotic optimality for our sparse Fisher's LDA with thresholded linear constraints in high-dimensional settings for an arbitrary number of classes. To address the first difficulty above, we provide asymptotic optimality and the corresponding convergence rates in high-dimensional settings for a large family of linear classification rules with an arbitrary number of classes, and apply them to our method. To overcome the second difficulty, we propose a thresholding approach to avoid estimating the covariance matrix. We apply the method to classification problems for multivariate functional data through wavelet transformations.
Submitted 5 August, 2015;
originally announced August 2015.
-
Signal extraction approach for sparse multivariate response regression
Authors:
Ruiyan Luo,
Xin Qi
Abstract:
In this paper, we consider multivariate response regression models with high-dimensional predictor variables. One way to model the correlation among the response variables is through the low-rank decomposition of the coefficient matrix, which has been considered by several papers for high-dimensional predictors. However, all these papers focus on the singular value decomposition of the coefficient matrix. Our target is the decomposition of the coefficient matrix which leads to the best lower-rank approximation to the regression function, the signal part in the response. Given any rank, this decomposition has nearly the smallest expected prediction error among all approximations to the coefficient matrix with the same rank. To estimate the decomposition, we formulate a penalized generalized eigenvalue problem to obtain the first matrix in the decomposition and then obtain the second one by a least squares method. In the high-dimensional setting, we establish the oracle inequalities for the estimates. Compared to the existing theoretical results, we have fewer restrictions on the distribution of the noise vector in each observation and allow correlations among its coordinates. Our theoretical results do not depend on the dimension of the multivariate response. Therefore, the dimension is arbitrary and can be larger than the sample size and the dimension of the predictor. Simulation studies and application to real data show that the proposed method has good prediction performance and is efficient in dimension reduction for various reduced-rank models.
Submitted 5 August, 2015;
originally announced August 2015.
-
Demixed principal component analysis of population activity in higher cortical areas reveals independent representation of task parameters
Authors:
Dmitry Kobak,
Wieland Brendel,
Christos Constantinidis,
Claudia E. Feierstein,
Adam Kepecs,
Zachary F. Mainen,
Ranulfo Romo,
Xue-Lian Qi,
Naoshige Uchida,
Christian K. Machens
Abstract:
Neurons in higher cortical areas, such as the prefrontal cortex, are known to be tuned to a variety of sensory and motor variables. The resulting diversity of neural tuning often obscures the represented information. Here we introduce a novel dimensionality reduction technique, demixed principal component analysis (dPCA), which automatically discovers and highlights the essential features in complex population activities. We reanalyze population data from the prefrontal areas of rats and monkeys performing a variety of working memory and decision-making tasks. In each case, dPCA summarizes the relevant features of the population response in a single figure. The population activity is decomposed into a few demixed components that capture most of the variance in the data and that highlight dynamic tuning of the population to various task parameters, such as stimuli, decisions, rewards, etc. Moreover, dPCA reveals strong, condition-independent components of the population activity that remain unnoticed with conventional approaches.
Submitted 22 October, 2014;
originally announced October 2014.