Minimum Bayes factors are commonly used to transform two-sided p-values to lower bounds on the posterior probability of the null hypothesis, in particular the bound −e·p·log(p). This bound is easy to compute and explain; however, it does not behave as a Bayes factor. For example, it does not change with the sample size. This is a very serious defect, particularly for moderate to large sample sizes, which is precisely the situation in which p-values are the most problematic. In this article, we propose adjusting this minimum Bayes factor with the information to approximate an exact Bayes factor, not only when p is a p-value but also when p is a pseudo-p-value. Additionally, we develop a version of the adjustment for linear models using the recent refinement of the Prior-Based BIC.
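As an illustration of the bound discussed above, the −e·p·log(p) minimum Bayes factor and the posterior probability of the null it implies can be computed directly. This is a minimal sketch under equal prior odds, showing the unadjusted bound only, not the paper's sample-size-adjusted version:

```python
import math

def min_bayes_factor(p):
    """Lower bound -e * p * log(p) on the Bayes factor in favour of
    the null hypothesis, valid for 0 < p < 1/e."""
    if not 0 < p < 1 / math.e:
        raise ValueError("bound requires 0 < p < 1/e")
    return -math.e * p * math.log(p)

def posterior_null_lower_bound(p, prior_null=0.5):
    """Lower bound on P(H0 | data) implied by the minimum Bayes factor,
    assuming prior probability prior_null for the null."""
    bf = min_bayes_factor(p)
    prior_odds = prior_null / (1 - prior_null)
    post_odds = prior_odds * bf
    return post_odds / (1 + post_odds)

print(round(min_bayes_factor(0.05), 3))           # → 0.407
print(round(posterior_null_lower_bound(0.05), 3)) # → 0.289
```

Note the well-known consequence: a "significant" p = 0.05 still leaves the posterior probability of the null above 0.28, regardless of sample size — which is exactly the defect the adjustment in the article targets.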
Bayesian multilevel models—also known as hierarchical or mixed models—are used in situations in which the aim is to model the random effect of groups or levels. In this paper, we conduct a simulation study to compare the predictive ability of 1-level Bayesian multilevel logistic regression models with that of 2-level Bayesian multilevel logistic regression models by using the prior Scaled Beta2 and inverse-gamma distributions to model the standard deviation at the second level. Then, these models are employed to estimate the correct answers in two questionnaires administered to university students throughout the first academic semester of 2018. The results show that 2-level models have a better predictive ability and provide more precise probability intervals than 1-level models, particularly when the prior Scaled Beta2 distribution is used to model the standard deviation at the second level. Moreover, the probability intervals of 1-level Bayesian multilevel logistic regression models pr...
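For reference, the Scaled Beta2 prior on a scale parameter is commonly taken to be a beta-prime distribution with an extra scale b; the sketch below assumes that parameterisation — density (z/b)^(p−1)·(1+z/b)^−(p+q)/(b·B(p,q)) — and only checks normalisation numerically. Treat the exact form as an assumption of this sketch, not the paper's definition:

```python
import math

def scaled_beta2_pdf(z, p, q, b):
    """Density of a Scaled Beta2(p, q, b) distribution, assumed here to be
    a beta-prime distribution scaled by b (check the source paper's
    parameterisation before relying on this)."""
    if z <= 0:
        return 0.0
    log_beta = math.lgamma(p) + math.lgamma(q) - math.lgamma(p + q)
    x = z / b
    return math.exp((p - 1) * math.log(x) - (p + q) * math.log(1 + x) - log_beta) / b

# Crude left-endpoint Riemann sum to verify the density integrates to ~1
# (p = q = 1, b = 1, where the density reduces to 1 / (1 + z)^2)
step = 0.01
total = sum(scaled_beta2_pdf(i * step, 1, 1, 1.0) * step
            for i in range(1, 500_000))
print(round(total, 2))  # close to 1
```

The heavy right tail of this family (polynomial, not exponential) is what allows the second-level standard deviation to escape the prior when the data demand it.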
Objective: Although contemporary mortality data are important for health assessment and planning purposes, their availability lags several years behind. Statistical projection techniques can be employed to obtain current estimates. This study aimed to assess annual trends of mortality in Puerto Rico due to cancer and Ischemic Heart Disease (IHD), and to predict short-term and long-term cancer and IHD mortality figures. Methods: Age-adjusted mortality per 100,000 population projections with a 50% probability interval were calculated utilizing a Bayesian Age-Period-Cohort dynamic model. Multiple cause-of-death annual files for years 1994-2010 for Puerto Rico were used to calculate short-term (2011-2012) predictions. Long-term (2013-2022) predictions were based on quinquennial data. We also calculated gender differences in rates (men-women) for each study period. Results: Mortality rates for women were similar for cancer and IHD in the 1994-1998 period, but changed substanti...
Abstract. A simple and quick general test to screen for numerical anomalies is presented. It can be applied, for example, to electoral processes, both electronic and manual. It uses vote counts in officially published voting units, which are typically widely available and institutionally backed. The test examines the frequencies of digits in vote counts and rests on the First (NBL1) and Second Digit Newcomb–Benford Law (NBL2), and on a novel generalization of the law under restrictions on the maximum number of voters per unit (RNBL2). We apply the test to the 2004 USA presidential elections, the Puerto Rico (1996, 2000 and 2004) governor elections, the 2004 Venezuelan presidential recall referendum (RRP) and the previous 2000 Venezuelan presidential election. The NBL2 is compellingly rejected only in the Venezuelan referendum and only for electronic voting units. Our original suggestion on the RRP (Pericchi and Torres, 2004) was criticized by The Carter Center report (2005). A...
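The NBL2 reference frequencies such a test compares against are easy to tabulate. The sketch below (my illustration of the plain second-digit law, not the authors' RNBL2 generalization) computes them and a simple chi-square-type discrepancy for a list of observed vote counts:

```python
import math
from collections import Counter

def nbl2_probs():
    """Second-digit Newcomb-Benford probabilities:
    P(d2 = d) = sum over first digits d1 = 1..9 of log10(1 + 1/(10*d1 + d))."""
    return {d: sum(math.log10(1 + 1 / (10 * d1 + d)) for d1 in range(1, 10))
            for d in range(10)}

def second_digits(counts):
    """Second digit of each vote count that has at least two digits."""
    return [int(str(c)[1]) for c in counts if c >= 10]

def chi_square_nbl2(counts):
    """Chi-square discrepancy between observed second-digit frequencies
    and the NBL2 reference (statistic only; illustrative, no p-value)."""
    digits = second_digits(counts)
    observed = Counter(digits)
    n = len(digits)
    return sum((observed.get(d, 0) - n * p) ** 2 / (n * p)
               for d, p in nbl2_probs().items())

probs = nbl2_probs()
print(round(probs[0], 4))  # → 0.1197: '0' is the most likely second digit
```

Unlike the first-digit law, the second-digit frequencies are nearly uniform (ranging from about 0.120 for '0' down to about 0.085 for '9'), which is what makes NBL2 a conservative screen.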
... connections between FBF's and IBF's. 7.3.4 The Intrinsic Prior Approach Methods closely related to the IBF approach are the intrinsic prior and the Empirical Expected-Posterior Prior (EP) approach. The IBF approach can be ...
Modelling outliers and structural breaks in dynamic linear models with a novel use of a heavy tailed prior for the variances: An alternative to the Inverted Gamma
In this paper, we investigate the reasons why the Bayesian estimator of the tail probability is always higher than the frequentist estimator. Sufficient conditions for this phenomenon are established both by using Jensen's Inequality and by looking at Taylor series approximations, both of which point to the convexity of the distribution function.
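The Jensen's-inequality direction can be sketched in one line (a reconstruction under the stated convexity assumption, not the paper's precise conditions): writing $S(\theta) = P(X > c \mid \theta)$ for the tail probability, if $S$ is convex in $\theta$ then

```latex
\mathbb{E}\left[ S(\theta) \mid \text{data} \right] \;\ge\; S\!\left( \mathbb{E}[\theta \mid \text{data}] \right),
```

so the Bayesian estimator (the posterior expectation on the left) exceeds the plug-in estimator on the right whenever the frequentist point estimate agrees with the posterior mean.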
Bayesian analysis is frequently confused with conjugate Bayesian analysis. This is particularly the case in the analysis of clinical trial data. Even though conjugate analysis is perceived to be computationally simpler (but see below, Berger’s prior), the price to be paid is high: such analysis is not robust with respect to the prior, i.e., changing the prior may affect the conclusions without bound. Furthermore, conjugate Bayesian analysis is blind with respect to the potential conflict between the prior and the data. On the other hand, robust priors have bounded influence. The prior is discounted automatically when there are conflicts between prior information and data. In other words, conjugate priors may lead to a dogmatic analysis while robust priors promote self-criticism since prior and sample information are not on equal footing. The original proposal of robust priors was made by de Finetti in the 1960s. However, the practice has not taken hold in important areas where the Ba...
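The bounded-influence behaviour described here can be seen numerically in a toy setting. This is a minimal sketch assuming a normal likelihood with known variance; the priors and numbers are my illustration, not from any specific trial:

```python
import math

def posterior_mean(prior_logpdf, xbar, n, sigma=1.0,
                   grid_lo=-10.0, grid_hi=20.0, m=4001):
    """Grid approximation of the posterior mean of a normal mean theta,
    given sample mean xbar of n observations with known sd sigma."""
    step = (grid_hi - grid_lo) / (m - 1)
    thetas = [grid_lo + i * step for i in range(m)]
    se2 = sigma ** 2 / n
    log_post = [prior_logpdf(t) - (xbar - t) ** 2 / (2 * se2) for t in thetas]
    mx = max(log_post)
    weights = [math.exp(lp - mx) for lp in log_post]
    return sum(t * w for t, w in zip(thetas, weights)) / sum(weights)

normal_prior = lambda t: -t ** 2 / 2             # conjugate N(0, 1) prior
cauchy_prior = lambda t: -math.log(1 + t ** 2)   # robust Cauchy(0, 1) prior

# Single observation x = 10, in sharp conflict with the prior centre 0
print(round(posterior_mean(normal_prior, 10, 1), 2))  # → 5.0 (conjugate prior pulls halfway)
print(round(posterior_mean(cauchy_prior, 10, 1), 2))  # ≈ 9.8 (robust prior is discounted)
```

Under conflict, the conjugate normal prior exerts unbounded influence and drags the estimate to the midpoint, while the heavy-tailed Cauchy prior is automatically discounted and the posterior stays near the data.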
In this paper we analyze the effect of four possible alternatives regarding the prior distributions in a linear model with autoregressive errors to predict piped water consumption: Normal-Gamma, Normal-Scaled Beta2, Studentized-Gamma and Student's t-Scaled Beta2. We show the effects of these prior distributions on the posterior distributions under different assumptions about the coefficient of variation of the prior hyperparameters, in a context where there is a conflict between the sample information and the elicited hyperparameters. We show that the posterior parameters are less affected by the prior hyperparameters when the Studentized-Gamma and Student's t-Scaled Beta2 models are used. We show that the Normal-Gamma model obtains sensible predictions when the sample size is small. However, this property is lost when the experts overestimate the certainty of their knowledge. When the experts greatly trust their beliefs, it is a good idea to use the Student's t distribution as the prior, because it yields small posterior predictive errors. In addition, we find that the posterior predictive distributions using one of the versions of Student's t as the prior are robust to the coefficient of variation of the prior parameters. Finally, it is shown that the Normal-Gamma model has a posterior distribution of the variance concentrated near zero when there is a high level of confidence in the experts' knowledge: this implies a narrow posterior predictive credibility interval, especially with small sample sizes.
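The dogmatism of an over-confident Gamma prior on the precision (equivalently, inverse-gamma on the variance) can be seen in a conjugate toy example — i.i.d. normal data with known mean, far simpler than the paper's autoregressive-error model, and with numbers of my own choosing:

```python
def ig_posterior(a, b, residuals):
    """Conjugate inverse-gamma IG(a, b) update for a normal variance with
    known mean: posterior is IG(a + n/2, b + SS/2)."""
    n = len(residuals)
    ss = sum(r * r for r in residuals)
    return a + n / 2, b + ss / 2

def ig_mean(a, b):
    """Mean of IG(a, b), defined for a > 1."""
    return b / (a - 1)

data = [3.0, -2.5, 2.8, -3.2, 2.9]  # empirical variance around 8.3

# Vague prior: the posterior variance tracks the data
print(round(ig_mean(*ig_posterior(2, 1, data)), 2))     # → 6.25

# Confident prior centred near 0.1: the posterior variance stays pinned low
print(round(ig_mean(*ig_posterior(200, 20, data)), 2))  # → 0.2
```

The confident prior leaves the posterior variance near 0.2 despite residuals whose variance is roughly forty times larger — exactly the mechanism behind the misleadingly narrow predictive intervals noted in the abstract.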
Papers by Luis Pericchi