Model Validation-Tutorial

The document discusses various methods for validating regression models, including split-sample validation, cross-validation, and bootstrap validation. Bootstrap validation involves building models on resampled datasets and helps correct for optimism in apparent performance estimates. The document recommends bootstrap for validating models built on smaller sample sizes.

• Introduction: Model validation
• Bootstrap method
• Predictive performance
• Use bootstrap and other methods for model validation
• Demonstrate association: evaluate the relationship between an outcome and covariates,
e.g., Association Between Helicopter vs Ground Emergency Medical Services and Survival for Adults With Major Trauma. JAMA. 2012;307(15):1602-1610.
• We are interested in the beta coefficient of the regression model, e.g.,
In the multivariable regression model, for patients transported to level I trauma centers, helicopter transport was associated with an improved odds of survival compared with ground transport (odds ratio [OR], 1.16; 95% CI, 1.14-1.17; P < .001).
• Prediction and forecasting:
e.g., Risk Stratification for In-Hospital Mortality in Acutely Decompensated Heart Failure: Classification and Regression Tree Analysis. JAMA. 2005;293(5):572-580.
• Predictive score construction:
e.g., a score (H) is generally based on the results of a regression model:
H = (β1 × covariate A) + (β2 × covariate B) + (β3 × covariate C), and so on,
where β1, β2, and β3 denote the estimated beta coefficients for covariates A, B, and C, obtained by fitting the regression model for the outcome of interest.
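As a minimal sketch of the score construction, the snippet below computes H as the sum of coefficient times covariate value. The coefficient values and the patient record are hypothetical, chosen only for illustration; they do not come from any fitted model in this tutorial.

```python
# Hypothetical coefficients and covariate values, for illustration only.
betas = {"A": 0.8, "B": -0.3, "C": 1.2}
patient = {"A": 1.0, "B": 2.0, "C": 0.0}


def predictive_score(betas, covariates):
    """Return H = sum over covariates of beta_k * covariate_k."""
    return sum(betas[name] * covariates[name] for name in betas)


# H = 0.8*1.0 + (-0.3)*2.0 + 1.2*0.0, approximately 0.2
H = predictive_score(betas, patient)
```

In practice the coefficients would be taken from the fitted regression model rather than typed in by hand.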
• Model validation is applied to regression models built for prediction purposes.
MODEL VALIDATION in general has at least two parts:

1. Model selection: to choose the best model based on model performance.
2. Model assessment: to estimate the performance of the final chosen model.
• Here we study various methods for model assessment (how well does the model predict a future outcome?)

Internal validation of predictive models: Efficiency of some procedures for logistic regression analysis
E. W. Steyerberg et al.
Journal of Clinical Epidemiology 54 (2001) 774–781
• Bootstrap: randomly sampling, with replacement, from an original dataset in order to obtain statistical inference.
• Bootstrap theory says that the distance between the population mean and the sample mean is similar to the distance between the sample mean and the bootstrap 'subsample' mean.
Example uses: a bootstrap 95% CI for:
• Correlation coefficient
• CV = SD/mean
• AUC of ROC
• Median
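A minimal sketch of a bootstrap 95% CI for two of the statistics listed above (the median and CV = SD/mean). It uses the simple percentile method, which is one common choice and an assumption here; the function names and the simulated sample are hypothetical.

```python
import random
import statistics


def bootstrap_ci(data, stat, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(data)."""
    rng = random.Random(seed)
    n = len(data)
    # Resample n observations with replacement, n_boot times.
    estimates = sorted(
        stat([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def coef_variation(xs):
    """CV = SD / mean."""
    return statistics.stdev(xs) / statistics.mean(xs)


# Hypothetical sample: 200 draws from a normal distribution.
data_rng = random.Random(1)
sample = [data_rng.gauss(50, 10) for _ in range(200)]

median_ci = bootstrap_ci(sample, statistics.median)
cv_ci = bootstrap_ci(sample, coef_variation)
```

The same function works for any statistic that can be computed from a resampled list, which is why the bootstrap is convenient for quantities without a simple standard-error formula.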
• External validation: use a training (derivation) dataset to build the model and a separate test (validation) dataset to validate it.
• Examples: old vs new patients, one dataset vs another.
• Internal validation: use the same dataset for model building and validation.
1. We use regression analysis to construct the predictive model, which provides an estimate of patient outcome.
2. The apparent performance of the model on this training set will be better than its performance in another data set, even if the latter test set consists of patients from the same population. (This is called optimism.)
Procedure                      Population       Sample          Inference
Regular statistical procedure  Population mean  Sample mean     Standard error
Bootstrap                      Sample mean      Subsample mean  Bootstrap standard deviation
Model validation               Testing data     Training data   Optimism

1. Data: the GUSTO-I data give 30-day mortality in patients with acute myocardial infarction. This data set consists of 40,830 patients, of whom 2851 (7.0%) had died at 30 days.
• Response (Y): 30-day mortality
• Predictors (X): age > 65 years, high risk (anterior infarct location or previous MI), diabetes, shock, hypotension (systolic blood pressure < 100 mmHg), tachycardia (pulse > 80), relief of chest pain > 1 hr, female gender.
• Produce a training set and a test set from the GUSTO-I data (EPV: events per variable)
• Example: EPV = 5, 7% event rate
=> training data set: 5 × 8 = 40 deaths out of 571 patients
=> test data set: 2811 deaths out of 40,259 patients
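The arithmetic behind the EPV example can be sketched as follows. The function name is hypothetical, and the rounded 7% event rate is used because that is the figure the example works with.

```python
N_TOTAL = 40830      # patients in GUSTO-I
N_DEATHS = 2851      # deaths at 30 days
N_PREDICTORS = 8     # predictors listed above


def training_size(epv, n_predictors=N_PREDICTORS, event_rate=0.07):
    """Events and patients needed in the training set for a given EPV."""
    events = epv * n_predictors            # EPV = events / number of predictors
    n_train = round(events / event_rate)   # patients needed to observe that many events
    return events, n_train


events, n_train = training_size(5)   # 40 deaths, 571 patients
test_deaths = N_DEATHS - events      # 2811
n_test = N_TOTAL - n_train           # 40259
```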
Measures of predictive performance:
1. Concordance: the c statistic. For binary outcomes, c is identical to the area under the receiver operating characteristic (ROC) curve; c varies between 0.5 and 1.0 for sensible models (the higher the better).
2. Calibration slope: the regression coefficient b in a logistic model with the predictive score as the only covariate: logit(mortality) = a + b * predictive score. Well-calibrated models have a slope of 1, while models providing too extreme predictions have a slope less than 1.
3. The Brier score (or average prediction error) is calculated as Sum((y_i - p_i)^2)/n, where y_i denotes the observed outcome and p_i the prediction for subject i in the data set of n subjects.
4. D is a scaled version of the model chi-square, which is a function of the log-likelihood.
5. R^2 as a measure of explained variation.
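The first three measures can be sketched in plain Python as below. This is an illustrative sketch, not the implementation used in the paper: the function names are hypothetical, the logistic model is fitted with a small hand-written Newton-Raphson routine, and the data are simulated.

```python
import math
import random


def c_statistic(y, p):
    """Share of (event, non-event) pairs ranked correctly; ties count 1/2."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    total = 0.0
    for pe in events:
        for pn in nonevents:
            total += 1.0 if pe > pn else 0.5 if pe == pn else 0.0
    return total / (len(events) * len(nonevents))


def brier_score(y, p):
    """Sum((y_i - p_i)^2) / n."""
    return sum((yi - pi) ** 2 for yi, pi in zip(y, p)) / len(y)


def fit_logistic(x, y, n_iter=25):
    """Newton-Raphson fit of logit(P(y=1)) = a + b*x; returns (a, b)."""
    a = b = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            mu = 1.0 / (1.0 + math.exp(-(a + b * xi)))
            w = mu * (1.0 - mu)
            g0 += yi - mu            # gradient, intercept
            g1 += (yi - mu) * xi     # gradient, slope
            h00 += w                 # information matrix entries
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def calibration_slope(y, score):
    """Coefficient b in logit(outcome) = a + b * predictive score."""
    return fit_logistic(score, y)[1]


# Toy check: 3 of the 4 (event, non-event) pairs are concordant.
y_toy = [0, 0, 1, 1]
p_toy = [0.1, 0.4, 0.35, 0.8]
c_toy = c_statistic(y_toy, p_toy)        # 0.75
brier_toy = brier_score(y_toy, p_toy)

# Simulated outcomes generated from a known score (hypothetical data).
rng = random.Random(0)
score = [rng.gauss(-1.0, 1.0) for _ in range(500)]
outcome = [1 if rng.random() < 1 / (1 + math.exp(-s)) else 0 for s in score]
slope = calibration_slope(outcome, score)
# Doubling the score makes the predictions too extreme and halves the slope.
slope_extreme = calibration_slope(outcome, [2 * s for s in score])
```

Because the outcomes are simulated from the score itself, `slope` should come out near 1, while the deliberately exaggerated score gives a slope of about half that.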
5. A few methods to estimate model performance (Table 1)
1) Split sample: randomly split the training data into two parts, one to develop the model and another to measure its performance. The split is made once and at random.
2) Cross-validation: with split-half cross-validation, the model is developed on one randomly drawn half and tested on the other, and vice versa. The average is taken as the estimate of performance. Other fractions of subjects may be left out (e.g., 10% to test a model developed on 90% of the sample). This procedure is repeated 10 times, such that every subject has served once to test the model.
• To improve the stability of the cross-validation, the whole procedure can be repeated several times, taking new random subsamples. The most extreme cross-validation procedure is to leave one subject out at a time, which is equivalent to the jack-knife technique.
3) Bootstrapping replicates the process of sample generation from an underlying population by drawing samples with replacement from the original data set, of the same size as the original data set. Models may be developed in bootstrap samples and tested in the original sample.
a) Regular bootstrap: the model as estimated in the bootstrap sample is evaluated in the bootstrap sample and in the original sample. The performance in the bootstrap sample represents an estimate of the apparent performance, and the performance in the original sample represents test performance. The difference between these performances is an estimate of the optimism in the apparent performance.
• This difference is averaged over the bootstrap samples to obtain a stable estimate of the optimism. The optimism is subtracted from the apparent performance to estimate the internally validated performance.
• Optimism = average(bootstrap performance - test performance).
• Estimated performance = apparent performance - optimism.
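The regular bootstrap procedure above can be sketched as follows. This is a minimal illustration under stated assumptions: a one-predictor logistic model fitted by a hand-written Newton-Raphson routine, the Brier score as the performance measure, and simulated data. All function names are hypothetical.

```python
import math
import random


def sigmoid(z):
    # Clamp to avoid overflow in math.exp for extreme linear predictors.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(x, y, n_iter=20):
    """Newton-Raphson fit of logit(P(y=1)) = a + b*x; returns (a, b)."""
    a = b = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            mu = sigmoid(a + b * xi)
            w = mu * (1.0 - mu)
            g0 += yi - mu
            g1 += (yi - mu) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def brier(model, x, y):
    """Brier score of a fitted (a, b) model on data (x, y)."""
    a, b = model
    return sum((yi - sigmoid(a + b * xi)) ** 2 for xi, yi in zip(x, y)) / len(y)


def bootstrap_validate(x, y, n_boot=200, seed=0):
    """Return (apparent, optimism, optimism-corrected) performance."""
    rng = random.Random(seed)
    n = len(y)
    apparent = brier(fit_logistic(x, y), x, y)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]   # sample with replacement
        xb = [x[i] for i in idx]
        yb = [y[i] for i in idx]
        model_b = fit_logistic(xb, yb)
        boot_perf = brier(model_b, xb, yb)   # performance in the bootstrap sample
        test_perf = brier(model_b, x, y)     # performance in the original sample
        diffs.append(boot_perf - test_perf)
    optimism = sum(diffs) / n_boot
    return apparent, optimism, apparent - optimism


# Simulated data (hypothetical): one predictor, roughly 30% events.
data_rng = random.Random(42)
x = [data_rng.gauss(0, 1) for _ in range(150)]
y = [1 if data_rng.random() < sigmoid(-1.0 + xi) else 0 for xi in x]

apparent, optimism, corrected = bootstrap_validate(x, y)
```

Note that a lower Brier score is better, so with this measure the optimism is typically negative and the corrected score is typically slightly worse (higher) than the apparent one. For a measure where higher is better, such as the c statistic, the signs reverse but the same two formulas apply.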
6. Performance: (Fig. 2 and Table 2)
7. Conclusion:
1) The split-sample approach tends to produce a larger difference between estimated performance and test performance, unless a very large sample is available.
2) However, with a large sample size (e.g., EPV > 40), optimism is small and the apparent estimates of model performance are attractive because of their stability.
3) Regular bootstrapping provides better estimates of the internal validity of logistic regression models constructed in smaller samples (e.g., EPV10)
• What is bootstrap?
• Model validation
• Internal vs external model validation
• Optimism in internal validation
• Using bootstrap and other methods to correct optimism
1. Efron and Tibshirani (1993), An Introduction to the Bootstrap, Chapman & Hall/CRC
2. E. W. Steyerberg et al., Internal validation of predictive models: Efficiency of some procedures for logistic regression analysis, Journal of Clinical Epidemiology 54 (2001) 774–781
