We propose a flexible regression model for multivariate mixed responses, where a different number of locations is used for each margin and the margins are joined by a full association structure, which can be simplified by using a Parafac-based representation. This dependence structure is more general and properly nests the independence model, but it comes at a cost: the number of parameters can be shown to increase exponentially with the number of analysed outcomes. Therefore, we propose a parsimonious representation of this multi-way array through the Parafac model (Harshman, 1970; Kapteyn et al., 1986). This helps us define a flexible model that can account for profile-specific heterogeneity and a general form of dependence between profiles, with a parsimonious representation of the latter.
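To make the cost argument concrete, the following minimal sketch compares the number of cells in a full multi-way association array with the number of entries in the factor matrices of a Parafac decomposition. The function names, the rank and the number of locations per margin are illustrative assumptions, not quantities taken from the paper, and sum-to-one or identifiability constraints are ignored.

```python
from math import prod


def full_array_size(locations):
    """Cells in the full joint array: one per combination of locations."""
    return prod(locations)


def parafac_size(locations, rank):
    """Entries in the factor matrices of a rank-`rank` Parafac decomposition.

    Each margin contributes one (number of locations x rank) factor matrix.
    """
    return rank * sum(locations)


if __name__ == "__main__":
    rank = 2  # hypothetical number of Parafac components
    for n_outcomes in (2, 4, 8):
        locations = [3] * n_outcomes  # hypothetical: 3 locations per margin
        print(n_outcomes, full_array_size(locations), parafac_size(locations, rank))
```

The full array grows as a product over the margins, while the Parafac count grows as a sum, so the saving only becomes visible once several outcomes are analysed jointly.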
Dropout is a typical issue in longitudinal studies. If the mechanism leading to missing information is non-ignorable, inference based on the observed data only may be severely biased. A frequent strategy to obtain reliable parameter estimates is based on the use of individual-specific random coefficients that help capture sources of unobserved heterogeneity and, at the same time, define a reasonable structure of dependence between the longitudinal and the missing data process. We refer to elements in this class as random coefficient based dropout models (RCBDMs). We propose a dynamic, semi-parametric version of the standard RCBDM to deal with discrete time to event. Time-varying random coefficients that evolve over time according to a non-homogeneous hidden Markov chain are considered to model dependence between longitudinal responses recorded from the same subject. A separate set of random coefficients is considered to model dependence betwee...
Longitudinal data are characterized by the dependence between observations coming from the same individual. In a regression perspective, such dependence can be usefully ascribed to unobserved features (covariates) specific to each individual. On these grounds, random parameter models with time-constant or time-varying structure are well established in the generalized linear model context. In the quantile regression framework, specifications based on random parameters have only recently attracted growing interest. We start from the recent proposal by Farcomeni (2012) on longitudinal quantile hidden Markov models, and extend it to handle potentially informative missing data mechanisms. In particular, we focus on monotone missingness, which may lead to selection bias and, therefore, to unreliable inferences on model parameters. We detail the proposed approach by re-analyzing a well-known dataset on the dynamics of CD4 cell counts in HIV seroconverters and by means of a simulation study.
In longitudinal studies, subjects may be lost to follow-up (a phenomenon which is often referred to as attrition) or miss some of the planned visits, thus generating incomplete responses. When the probability of nonresponse, once conditioned on observed covariates and responses, still depends on the unobserved responses, the dropout mechanism is known to be informative. A common objective in these studies is to build a general, reliable association structure to account for dependence between the longitudinal and the dropout processes. Starting from the existing literature, we introduce a random coefficient based dropout model where the association between outcomes is modeled through discrete latent effects; these latent effects are outcome-specific and account for heterogeneity in the univariate profiles. Dependence between profiles is introduced by using a bidimensional representation for the corresponding distribution. In this way, we define a flexible latent class structure, with possibly d...
In longitudinal studies, subjects may be lost to follow-up and, thus, present incomplete response sequences. When the mechanism underlying the dropout is nonignorable, we need to account for dependence between the longitudinal and the dropout process. We propose to model such dependence through discrete latent effects, which are outcome-specific and account for heterogeneity in the univariate profiles. Dependence between profiles is introduced by using a probability matrix to describe the corresponding joint distribution. In this way, we separately model dependence within each outcome and dependence between outcomes. The major feature of this proposal, when compared with standard finite mixture models, is that it allows the nonignorable dropout model to properly nest its ignorable counterpart. We also discuss the use of an index of (local) sensitivity to nonignorability to investigate the effects that assumptions about the dropout process may have on model parameter estimates. The...
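The nesting property can be illustrated with a small sketch: when the probability matrix joining the latent classes of the two processes equals the outer product of its margins, the latent effects are independent and the dropout model reduces to its ignorable counterpart. The matrices and function names below are hypothetical and serve only as an illustration under that reading of the abstract.

```python
def margins(joint):
    """Row and column marginal probabilities of a joint probability matrix."""
    row = [sum(r) for r in joint]
    col = [sum(c) for c in zip(*joint)]
    return row, col


def is_independent(joint, tol=1e-12):
    """True when every cell equals the product of its two margins."""
    row, col = margins(joint)
    return all(
        abs(joint[g][l] - row[g] * col[l]) <= tol
        for g in range(len(row))
        for l in range(len(col))
    )


# Hypothetical joint distributions over (longitudinal class, dropout class).
ignorable = [[0.12, 0.28], [0.18, 0.42]]     # outer product of (0.4, 0.6) and (0.3, 0.7)
nonignorable = [[0.30, 0.10], [0.10, 0.50]]  # classes are associated

if __name__ == "__main__":
    print(is_independent(ignorable), is_independent(nonignorable))
```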
Linear quantile regression models represent a general class of models that aim at providing a detailed and robust picture of the (conditional) response distribution as a function of a set of observed covariates. Longitudinal data represent an interesting potential field of application for such models; due to their peculiar features, they pose a substantial challenge, in that the standard, cross-sectional model representation needs to be extended to deal with this kind of data. In fact, repeated observations from the same statistical unit pose a problem of dependence; in a conditional perspective, this dependence can be ascribed to sources of unobserved, individual-specific heterogeneity. Along these lines, quantile regression models have recently been extended to the analysis of longitudinal, continuous responses by modelling dependence via time-constant (Geraci and Bottai, 2007) or time-varying (Farcomeni, 2012) random effects. In this manuscript, we introduce a general quantile regression model for longitudinal, continuous responses where time-varying and time-constant random parameters with unspecified distribution are jointly taken into account. A further feature of longitudinal designs is the presence of partially incomplete sequences, due to some individuals leaving the study before its designed end. The missing data generating process may produce a selection of units which can be informative with respect to the parameters of the longitudinal data model. Therefore, a further extension is needed. To deal with the case of irretrievable drop-out, we introduce a pattern mixture version of the linear quantile hidden Markov model, where we account for time-varying heterogeneity and for changes in the fixed effect vector due to differential propensities to stay in the study.
The proposed models are illustrated using a well-known benchmark dataset on longitudinal dynamics of CD4 cells and by means of a large-scale simulation study, entailing different quantiles and both complete and partially complete (i.e. subject to drop-out) individual sequences.
Advances in Latent Variables Methods Models and Applications, Jun 11, 2013
A generalized linear mixed model with a nonparametric distribution for the random effect is proposed. In the context of nonparametric graphical models, we take advantage of the nonparanormal approach to build a flexible latent, individual-specific structure for the longitudinal profiles. The nonparanormal method is particularly appealing since it acts on transformations of multivariate non-Gaussian random variables, and assumes that these transformations are multivariate Gaussian. Moreover, it is particularly convenient for handling the joint distribution of high-dimensional variables. In the case of generalized linear mixed models, the normality assumption for the random effects may be too restrictive to represent the between-subject distribution, especially when the longitudinal response is non-Gaussian.
A vast recent literature concerns the measurement of quality dimensions such as access, effectiveness, performance and outcome of health services supplied by national health care providers. The main concern is to achieve a classification of administrative areas with respect to observed quality indicators. We describe a simple and effective procedure to achieve this goal, which allows powerful testing of the hypothesized cluster structure. We describe the performance of this method on a dataset on preventable hospitalizations (PPH) in Italy during 1998, in order to highlight clusters of regions with homogeneous relative risk.
Quantile regression provides a detailed and robust picture of the distribution of a response variable, conditional on a set of observed covariates. Recently, it has been extended to the analysis of longitudinal continuous outcomes using either time-constant or time-varying random parameters. However, in real-life data, we frequently observe both temporal shocks in the overall trend and individual-specific heterogeneity in model parameters. A benchmark dataset on HIV progression gives a clear example. Here, the evolution of the CD4 log counts exhibits both sudden temporal changes in the overall trend and heterogeneity in the effect of the time since seroconversion on the response dynamics. To accommodate such situations, we propose a quantile regression model where time-varying and time-constant random coefficients are jointly considered. Since observed data may be incomplete due to early drop-out, we also extend the proposed model in a pattern mixture perspective. We assess the performance of the proposals via a large-scale simulation study and the analysis of the CD4 count data.
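As background for the quantile regression abstracts above, the sketch below shows the standard check (pinball) loss that quantile regression minimises, in the simplest covariate-free case where the minimiser is a sample quantile. This is a generic textbook illustration, not the random coefficient model proposed in the paper; the function names and the toy data are assumptions.

```python
def check_loss(u, tau):
    """Check (pinball) loss: tau * u for u >= 0, (tau - 1) * u for u < 0."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def quantile_by_check_loss(y, tau):
    """Return the observed value that minimises the summed check loss.

    With no covariates, a minimiser can always be found among the data
    points; ties are resolved in favour of the smallest candidate.
    """
    return min(sorted(y), key=lambda q: sum(check_loss(v - q, tau) for v in y))


if __name__ == "__main__":
    toy = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical responses
    print(quantile_by_check_loss(toy, 0.5), quantile_by_check_loss(toy, 0.25))
```

The asymmetry of the loss is what targets a given quantile level: for tau below 0.5, over-predictions are penalised more heavily than under-predictions, pulling the minimiser towards the lower tail.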
Papers by Marco Alfo'