Josine Verhagen

Kidaptive, Psychometrics, Department Member
Replication attempts are essential to the empirical sciences. Successful replication attempts increase researchers’ confidence in the presence of an effect, whereas failed replication attempts induce skepticism and doubt. However, it is often unclear to what extent a replication attempt results in success or failure. To quantify replication outcomes we propose a novel Bayesian replication test that compares the adequacy of 2 competing hypotheses. The 1st hypothesis is that of the skeptic and holds that the effect is spurious; this is the null hypothesis that postulates a zero effect size, H0: δ = 0. The 2nd hypothesis is that of the proponent and holds that the effect is consistent with the one found in the original study, an effect that can be quantified by a posterior distribution. Hence, the 2nd hypothesis, the replication hypothesis, is given by Hr: δ ~ “posterior distribution from original study.” The weighted-likelihood ratio between H0 and Hr quantifies the evidence that the data provide for replication success and failure. In addition to the new test, we present several other Bayesian tests that address different but related questions concerning a replication study. These tests pertain to the independent conclusions of the separate experiments, the difference in effect size between the original experiment and the replication attempt, and the overall conclusion based on the pooled results. Together, this suite of Bayesian tests allows a relatively complete formalization of the way in which the result of a replication attempt alters our knowledge of the phenomenon at hand. The use of all Bayesian replication tests is illustrated with 3 examples from the literature. For experiments analyzed using the t test, computation of the new replication test only requires the t values and the numbers of participants from the original study and the replication study.
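As a rough illustration of the weighted-likelihood ratio described in this abstract, the sketch below computes an approximate replication Bayes factor from the two t values and sample sizes. It is not the paper's exact computation. It assumes a one-sample design, a flat prior on the effect size in the original study, and a normal approximation in which the observed effect size t / sqrt(n) is distributed around the true effect size with variance 1 / n. The function names are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def replication_bf_normal_approx(t_orig, n_orig, t_rep, n_rep):
    """Approximate Bayes factor for Hr (replication) over H0 (zero effect).

    Assumptions (not taken from the paper): one-sample design, flat prior
    in the original study, and d = t / sqrt(n) ~ Normal(delta, 1 / n).
    """
    d_orig = t_orig / math.sqrt(n_orig)  # observed effect size, original study
    d_rep = t_rep / math.sqrt(n_rep)     # observed effect size, replication
    var_orig = 1.0 / n_orig
    var_rep = 1.0 / n_rep

    # Under Hr, delta ~ Normal(d_orig, var_orig) (the approximate posterior
    # from the original study); averaging the replication likelihood over it
    # gives a normal density with the two variances added.
    marginal_hr = normal_pdf(d_rep, d_orig, var_orig + var_rep)

    # Under H0, delta is exactly zero.
    likelihood_h0 = normal_pdf(d_rep, 0.0, var_rep)

    return marginal_hr / likelihood_h0


if __name__ == "__main__":
    # A replication that matches the original favors Hr (value above 1).
    print(replication_bf_normal_approx(3.0, 50, 3.0, 50))
    # A replication with no observed effect favors H0 (value below 1).
    print(replication_bf_normal_approx(3.0, 50, 0.0, 50))
```

Values above 1 indicate that the replication data are better predicted by the original study's posterior than by a zero effect; values below 1 indicate the reverse. An exact treatment would replace the normal approximation with the noncentral t likelihood.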
Random item effects models provide a natural framework for the exploration of violations of measurement invariance without the need for anchor items. Within the random item effects modelling framework, Bayesian tests (Bayes factor, deviance information criterion) are proposed which enable multiple marginal invariance hypotheses to be tested simultaneously. The performance of the tests is evaluated with a simulation study which shows that the tests have high power and low Type I error rate. Data from the European Social Survey are used to test for measurement invariance of attitude towards immigrant items and to show that background information can be used to explain cross-national variation in item functioning.
Longitudinal surveys measuring physical or mental health status are a common method to evaluate treatments. Multiple items are administered repeatedly to assess changes in the underlying health status of the patient. Traditional models to analyze the resulting data assume that the characteristics of at least some items are identical over measurement occasions. When this assumption is not met, this can result in ambiguous latent health status estimates. Changes in item characteristics over occasions are allowed in the proposed measurement model, which includes truncated and correlated random effects and a growth model for item parameters. In a joint estimation procedure adopting MCMC methods, both item and latent health status parameters are modeled as longitudinal random effects. Simulation study results show accurate parameter recovery. Data from a randomized clinical trial concerning the treatment of depression by increasing psychological acceptance showed significant item parameter shifts. For some items, the probability of responding in the middle category versus the highest or lowest category increased significantly over time. The resulting latent depression scores decreased more over time for the experimental group than for the control group and the amount of decrease was related to the increase in acceptance level.
Mega- or meta-analytic studies (e.g. genome-wide association studies) are increasingly used in behavior genetics. An issue in such studies is that phenotypes are often measured by different instruments across study cohorts, requiring harmonization of measures so that more powerful fixed effect meta-analyses can be employed. Within the Genetics of Personality Consortium, we demonstrate for two clinically relevant personality traits, Neuroticism and Extraversion, how Item-Response Theory (IRT) can be applied to map item data from different inventories to the same underlying constructs. Personality item data were analyzed in >160,000 individuals from 23 cohorts across Europe, USA and Australia in which Neuroticism and Extraversion were assessed by nine different personality inventories. Results showed that harmonization was very successful for most personality inventories and moderately successful for some. Neuroticism and Extraversion inventories were largely measurement invariant ...