In disease control or elimination programs, diagnostics are essential for assessing the impact of interventions, refining treatment strategies, and minimizing the waste of scarce resources. Although high-performance tests are desirable, increased accuracy is frequently accompanied by a requirement for more elaborate infrastructure, which is often not feasible in the developing world. These challenges are pertinent to mapping, impact monitoring, and surveillance in trachoma elimination programs. To help inform rational design of diagnostics for trachoma elimination, we outline a nonparametric multilevel latent Markov modeling approach and apply it to 2 longitudinal cohort studies of trachoma-endemic communities in Tanzania (2000–2002) and The Gambia (2001–2002) to provide simultaneous inferences about the true population prevalence of Chlamydia trachomatis infection and disease and the sensitivity, specificity, and predictive values of 3 diagnostic tests for C. trachomatis infection...
Various global health initiatives are currently advocating the elimination of schistosomiasis within the next decade. Schistosomiasis is a highly debilitating tropical infectious disease with a severe burden of morbidity, and thus operational research that accurately evaluates diagnostics quantifying the epidemic status, in order to guide effective strategies, is essential. Latent class models (LCMs) have been generally considered in epidemiology, and in particular in recent schistosomiasis diagnostic studies, as a flexible tool for evaluating diagnostics because assessing the true infection status (via a gold standard) is not possible. However, within the biostatistics literature, classical LCMs have already been criticised for real-life problems under violation of the conditional independence (CI) assumption and when applied to a small number of diagnostics (i.e. most often 3-5 diagnostic tests). Solutions of relaxing the CI assumption and accounting for zero-inflation, as well as collecting part...
Metron-International Journal of Statistics, Jul 30, 2015
Composite likelihood estimation has been proposed in the literature for handling intractable likelihoods. In particular, pairwise likelihood estimation has been recently proposed to estimate models with latent variables and random effects that involve high dimensional integrals. Pairwise estimators are asymptotically consistent and normally distributed but not the most efficient among consistent estimators. Vasdekis et al. (Biostatistics 15:677–689, 2014) proposed a weighted estimator that is found to be more efficient than the unweighted pairwise estimator produced by separate maximizations of pairwise likelihoods. In this paper, we propose a modification to that weighted estimator that leads to simpler computations and study its performance through simulations and a real application.
Journal of the Royal Statistical Society Series A: Statistics in Society, 2020
Summary. We propose a multilevel structural equation model to investigate the interrelationships between childhood socio-economic circumstances, partnership formation and stability, and mid-life health, using data from the 1958 British birth cohort. The structural equation model comprises latent class models that characterize the patterns of change in four dimensions of childhood socio-economic circumstances and a joint regression model that relates these categorical latent variables to partnership transitions in adulthood and mid-life health, while allowing for informative dropout. The model can be extended to handle multiple outcomes of mixed types and at different levels in a hierarchical data structure.
When missing data are produced by a non-ignorable nonresponse mechanism, analysis of the observed data should include a model for the probabilities of responding. In this paper we propose such models for nonresponse in survey questions which are treated as multiple-item measures of latent constructs and analysed using latent variable models. The nonresponse models that we describe include additional latent variables (latent response propensities) which determine the response probabilities. We argue that this model should be specified as flexibly as possible, and propose models where the response propensity is a categorical variable (a latent response class). This can be combined with any latent variable model for the survey items themselves, and an association between the latent variables measured by the items and the latent response propensities implies a model with non-ignorable nonresponse. We consider in particular the analysis of data from cross-national surveys, where the nonresponse model may also vary across the countries. The models are applied to analyse data on welfare attitudes in 29 countries in the European Social Survey.
This article studies the Type I error, false positive rates, and power of four versions of the Lagrange multiplier test to detect measurement noninvariance in item response theory (IRT) models for binary data under model misspecification. The tests considered are the Lagrange multiplier test computed with the Hessian and cross-product approach, the generalized Lagrange multiplier test, and the generalized jackknife score test. The two model misspecifications are those of local dependence among items and nonnormal distribution of the latent variable. The power of the tests is computed in two ways: empirically, through Monte Carlo simulation methods, and asymptotically, using the asymptotic distribution of each test under the alternative hypothesis. The performance of these tests is evaluated by means of a simulation study. The results highlight that, under mild model misspecification, all tests have good performance while, under strong model misspecification, the tests' performance deter...
When four of the leading researchers in the field of quantitative social sciences team up to write a book together, you can expect nothing less than a brilliant work. That is what the first edition of “Analysis of Multivariate Social Science Data” from 2002 was, and that’s what the current second edition is. This new edition contains additional chapters on regression analysis, confirmatory factor analysis including structural equation models, and multilevel models. The strength of this book lies in the right mixture of simple mathematical expressions, comprehensive non-mathematical descriptions of various multivariate approaches, numerous interesting real-life data examples (almost half of each chapter is dedicated to examples), and, last but not least, detailed interpretation of the results. As in other Bartholomew books, well-known methods or certain parts of them are, in many cases, presented from a slightly different angle. This makes this book also interesting for experience...
Tests are a building block of our modern education system. Many tests are high-stakes, such as admission, licensing, and certification tests, that can significantly change one's life trajectory. For this reason, ensuring fairness in educational tests is becoming an increasingly important problem. This paper concerns the issue of item preknowledge in educational tests due to item leakage. That is, a proportion of test takers have access to leaked items before a test is administered, which leads to inflated performance on the set of leaked items. We develop methods for the simultaneous detection of cheating test takers and compromised items based on data from a single test administration, when both sets are completely unknown. Latent variable models are proposed for the modelling of (1) data consisting only of item-level binary scores and (2) data consisting of both item-level binary scores and response time, where the former is commonly available in paper-and-pencil tests and the...
Latent class models are used in social sciences for classifying individuals or objects into distinct groups/classes based on responses to a set of observed indicators. The latent class model for mixed binary and metric variables (Br. J. Math. Statist. Psych. 49 (1996) 313) is extended to accommodate any type of data (including ordinal and nominal) and its use in Archaeometry for classifying archaeological findings/objects into groups is discussed. The models proposed are estimated using full maximum likelihood with the EM algorithm. Two data sets from archaeological findings are used to illustrate the methodology.
Papers by Irini Moustaki