Exact Logistic

Paper P254-25
Performing Exact Logistic Regression with the SAS System
Robert E. Derr, SAS Institute Inc., Cary, NC

The METHODOLOGY section in this paper presents
the logistic regression model and the different likelihoods, then explains how the exact analysis algorithm
implemented in PROC LOGISTIC works; details on
the reported statistics are available in the appendix.
The SYNTAX section describes the new statements
and options in the LOGISTIC procedure for the exact
methods. The EXAMPLES section provides several
examples to illustrate the syntax and the usefulness
of the method.
ABSTRACT
Exact logistic regression has become an important analytical technique, especially in the pharmaceutical industry, since the usual asymptotic methods for analyzing small, skewed, or sparse data sets are unreliable. Inference based on enumerating the exact distributions of sufficient statistics for parameters of interest in a logistic regression model, conditional on the remaining parameters, is computationally infeasible for many problems. Hirji, Mehta, and Patel (1987) developed an efficient algorithm for generating the required conditional distributions, thus making these methods computationally available. This paper discusses the theory and methods for exact logistic regression and illustrates their application in Version 8 of the SAS System with new facilities in the LOGISTIC procedure.
Dose-Response Study
First, consider a small dose-response study to motivate the usefulness of exact logistic regression. Researchers are interested in analyzing how mortality rates change with respect to dosage of a drug. The dose data set contains life/death outcomes for six levels of drug dosage (0 to 5). Three subjects are given each specific dose of the drug, and the number of deaths is recorded.
INTRODUCTION
data dose;
input Dose Deaths Total @@;
datalines;
0 0 3
1 0 3
2 0 3
3 0 3
4 1 3
5 2 3
;
run;
Many clinical trials deal with the comparison of populations of subjects with categorical responses. Historically, statistical inference for such studies involved large-sample approximations, and fitting logistic regression models to such data is performed through the unconditional likelihood function. However, asymptotic methods may be inadequate when sample sizes are small or the data are sparse, skewed, or heavily tied. Exact conditional inference remains valid in such situations.
The LOGISTIC, GENMOD, PROBIT, and CATMOD procedures perform unconditional likelihood inference for logit models, and the PHREG procedure can perform asymptotic conditional likelihood inference for logit models. SAS users have requested the ability to perform exact tests for logistic regression modeling. Many exact statistical tests have already been added to the FREQ and NPAR1WAY procedures, and in Release 8.1, SAS/STAT software includes exact logistic regression for binary (dichotomous) response variables in the LOGISTIC procedure.

All of the cells in the dose data set have counts that are less than 5, which makes the applicability of large-sample theory questionable. For each subject receiving dosage $x_i$, $i = 1, \ldots, 18$, let $y_i = 1$ if the subject died and $y_i = 0$ otherwise, and let $\pi_i = \Pr(y_i = 1 \mid x_i)$. Then the linear logistic model for this problem is $\mathrm{logit}(\pi_i) = \alpha + \beta x_i$, which fits a common intercept and slope for the subjects. In the PROC LOGISTIC invocation below, the EXACT statement requests an exact analysis and the ESTIMATE option produces exact parameter estimates.
proc logistic data=dose descending;
model Deaths/Total = Dose;
exact Dose / estimate=both;
run;
Figure 1 displays some of the unconditional asymptotic results that are produced by default. The likelihood ratio and score tests reject the null hypothesis that $\beta$ is zero. However, the Wald test does not reject this null hypothesis. The seemingly conflicting conclusions of these tests are a telltale sign that the large-sample approximation is unreliable. The estimates for the intercept and the slope both have p-values greater than 0.05, indicating marginal influence. The confidence limits for the odds ratio of the dose parameter contain the value 1, from which you could conclude, if you accept the model, that there is no change in mortality with a change in dosage.
Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio        8.1478     1        0.0043
Score                   5.7943     1        0.0161
Wald                    2.7249     1        0.0988

Analysis of Maximum Likelihood Estimates

                              Standard
Parameter    DF    Estimate      Error    Chi-Square    Pr > ChiSq
Intercept     1     -9.4745     5.5677        2.8958        0.0888
Dose          1      2.0804     1.2603        2.7249        0.0988

Odds Ratio Estimates

             Point          95% Wald
Effect    Estimate     Confidence Limits
Dose         8.007      0.677     94.679

Figure 1. Output from Asymptotic Analysis

Figure 2 shows the results from the EXACT statement. The p-values in the Conditional Exact Tests table lead to rejecting the null hypothesis that $\beta$ is zero (no conclusions can be made about $\alpha$ since it is conditioned away). Note that the p-values for the asymptotic estimates are larger than those for the exact estimates; however, Stokes, Davis, and Koch (1995) observe that, in general, the exact methods tend to produce more conservative results. The Exact Parameter Estimates table shows that the slope is estimated to be $\hat\beta = 1.8000$, and since the confidence interval for the odds ratio of $\beta$ does not contain 1, the odds of death increase significantly with dosage. Note that the exact tests do not produce standard errors for the estimates.

Exact Conditional Analysis

Conditional Exact Tests

                                       --- p-Value ---
Effect    Test           Statistic     Exact       Mid
Dose      Score             5.4724    0.0245    0.0190
          Probability       0.0110    0.0245    0.0190

Exact Parameter Estimates

                           95% Confidence
Parameter    Estimate          Limits          p-Value
Dose           1.8000     0.1157     5.8665     0.0245

Figure 2. Output from EXACT Analysis

Exact Odds Ratios

                           95% Confidence
Parameter    Estimate          Limits          p-Value
Dose            6.049      1.123    353.000     0.0245

Figure 2. (continued)

The unconditional asymptotic and conditional exact results produce somewhat conflicting conclusions for this example. Stokes, Davis, and Koch (1995) recommend looking at the exact results when sample sizes are small and the approximate p-values are less than 0.10. For this example, the small sample size and the conflicting results for the asymptotic hypothesis tests indicate that an exact analysis would be more appropriate.

METHODOLOGY

The theory of exact conditional logistic regression analysis was originally laid out by Cox (1970), and the computational method employed in PROC LOGISTIC is described in Hirji, Mehta, and Patel (1987). Other references that provide useful summaries of the derivations include Cox and Snell (1989), Agresti (1990), and Mehta and Patel (1995).

This section summarizes the methodology behind logistic regression and explains how the algorithm for exact computations works.

Logistic Regression

Consider $n$ independent Bernoulli random variables $Y = (Y_1, \ldots, Y_n)'$ having observed values $y = (y_1, \ldots, y_n)'$. For each observation $i = 1, \ldots, n$, let $x_i = (x_{i1}, \ldots, x_{ip})'$ be a vector of explanatory variables, and denote $X = (x_1, \ldots, x_n)'$. Let $\pi_i = \Pr(Y_i = 1 \mid x_i)$ be the event probability for each $Y_i$, and denote $\pi = (\pi_1, \ldots, \pi_n)'$. Then the logistic regression model is $\mathrm{logit}(\pi_i) = x_i'\beta$, or

$$\pi_i = \frac{\exp(x_i'\beta)}{1 + \exp(x_i'\beta)}$$

where $\beta = (\beta_1, \ldots, \beta_p)'$ is the unknown parameter vector.

The joint probability of the observed $y$ is a product of Bernoulli functions:

$$\Pr(Y = y) = \prod_{i=1}^{n} \frac{\exp(y_i x_i'\beta)}{1 + \exp(x_i'\beta)} = \frac{\exp(y'X\beta)}{\prod_{i=1}^{n} [1 + \exp(x_i'\beta)]}$$
Unconditional likelihood inference is based on maximizing this likelihood function, and several asymptotic statistics (likelihood ratio, score, and Wald) can be used to perform hypothesis tests.

To perform conditional inference, first observe that the sufficient statistics for the $\beta_j$ in the unconditional likelihood function are the corresponding $T_j = \sum_{i=1}^{n} y_i x_{ij}$, $j = 1, \ldots, p$, where $y$ is a realization of $Y$. To create the probability density function (pdf) for $T = (T_1, \ldots, T_p)'$, sum over all binary sequences $y$ that generate an observable $t$:

$$\Pr(T = t) = \frac{C(t) \exp(t'\beta)}{\prod_{i=1}^{n} [1 + \exp(x_i'\beta)]}$$

where $C(t)$ is the number of sequences $y$ that generate $t$. Suppose the parameters $\beta_0$ are nuisance parameters; that is, the current analysis is geared toward the last $p_1$ parameters $\beta_1$. Denote the sufficient statistics for the nuisance parameters as $T_0$, the corresponding observed values as $t_0$, and the corresponding columns of $X$ as $X_0$. Similarly, define $T_1$, $t_1$, and $X_1$ for the parameters of interest. The nuisance parameters can be removed from the analysis by conditioning on their sufficient statistics to create the conditional likelihood

$$f_{\beta_1}(t_1 \mid t_0) = \Pr(T_1 = t_1 \mid T_0 = t_0) = \frac{C(t_0, t_1) \exp(t_1'\beta_1)}{\sum_{u} C(t_0, u) \exp(u'\beta_1)}$$

where $C(t_0, u)$ is the number of vectors $y$ such that $y'X_0 = t_0$ and $y'X_1 = u$.

Conditional asymptotic inference is performed by maximizing the conditional likelihood and producing conditional statistics similar to the unconditional likelihood case.

Conditional exact inference is based on generating the conditional distribution of the sufficient statistics for the parameters of interest. This distribution is called the permutation or exact conditional distribution. The conditional pdf is denoted as $f_{\beta_1}(t_1 \mid t_0)$. The following section describes the generation of this distribution, and details about the tests and inferences are provided in the appendix.

Exact Conditional Distribution

The goal of the exact conditional analysis is to determine how likely the observed response $y$ is with respect to all $2^n$ possible responses. One way to proceed is to generate every $y$ vector for which $y'X_0 = t_0$, and count the number of vectors for which $y'X_1$ is equal to each unique $t_1$.

Suppose you have the following data, and you want to find the permutation distribution of the sufficient statistics for x1 conditional on those for x0.

Observation    y    x0    x1
     1         0     1     1
     2         1     1     1
     3         0     1     2
     4         1     1     0

Here, the observed data are $y = (0\ 1\ 0\ 1)'$, $x_0 = (1\ 1\ 1\ 1)'$, and $x_1 = (1\ 1\ 2\ 0)'$. The observed $t = (t_0, t_1)$ is computed as $(y'x_0, y'x_1) = (2, 1)$, so you are conditioning on $t_0 = 2$. Tabulate the 16 possible $y$ vectors and their resulting $t$ vectors:

       y1   y2   y3   y4     t0   t1
  1     0    0    0    0      0    0
  2     0    0    0    1      1    0
  3     0    0    1    0      1    2
  4     0    0    1    1      2    2
  5     0    1    0    0      1    1
  6     0    1    0    1      2    1
  7     0    1    1    0      2    3
  8     0    1    1    1      3    3
  9     1    0    0    0      1    1
 10     1    0    0    1      2    1
 11     1    0    1    0      2    3
 12     1    0    1    1      3    3
 13     1    1    0    0      2    2
 14     1    1    0    1      3    2
 15     1    1    1    0      3    4
 16     1    1    1    1      4    4

The conditional distribution is derived from this joint distribution by extracting every vector with $t_0 = 2$:

 t0    t1    Frequency    Probability
  2     1        2            2/6
  2     2        2            2/6
  2     3        2            2/6
total            6             1

Generating the conditional distribution from complete enumeration of the joint distribution is conceptually simple; however, this method becomes computationally infeasible very quickly. For example, if you had only 30 observations, you'd have to scan through $2^{30}$ different $y$ vectors, more than a billion! You can reduce the number of vectors to look at if you are conditioning on the intercept by processing $\binom{n}{t_0}$ vectors, but this does not improve the situation much.

The multivariate shift algorithm developed by Hirji, Mehta, and Patel (1987) is a faster method of generating and counting the vectors for larger problems. The algorithm is based on the following observation. Given any $y = (y_1, \ldots, y_n)'$ and a design $X = (x_1, \ldots, x_n)'$, let $y_{(i)} = (y_1, \ldots, y_i)'$ and $X_{(i)} = (x_1, \ldots, x_i)'$ be the first $i$ rows of each matrix. Write the sufficient statistic based on these $i$ rows as $t_{(i)}' = y_{(i)}'X_{(i)}$. A recursion relation results: $t_{(i+1)} = t_{(i)} + y_{i+1} x_{i+1}$.
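The enumeration and the recursion relation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the PROC LOGISTIC implementation: the function names (`enumerate_joint`, `shift_joint`, `condition_on`) are hypothetical, the covariate rows are those of the four-observation example, and the stage-wise version merges identical nodes but applies none of the infeasibility criteria.

```python
from collections import Counter
from itertools import product

# Covariate rows (x0, x1) for the four observations in the example.
X = [(1, 1), (1, 1), (1, 2), (1, 0)]


def enumerate_joint(rows):
    """Brute force: count every binary y by its sufficient statistic t = y'X."""
    counts = Counter()
    width = len(rows[0])
    for y in product((0, 1), repeat=len(rows)):
        t = tuple(sum(yi * x[j] for yi, x in zip(y, rows)) for j in range(width))
        counts[t] += 1
    return counts


def shift_joint(rows):
    """Stage-wise recursion t_(i+1) = t_(i) + y_(i+1) * x_(i+1).

    Identical nodes within a stage are merged and their counts added,
    so each stage stores one entry per distinct value of t.
    """
    stage = Counter({tuple(0 for _ in rows[0]): 1})  # stage 0: the single node 00
    for x in rows:
        nxt = Counter()
        for t, count in stage.items():
            nxt[t] += count                                    # branch with y = 0
            nxt[tuple(a + b for a, b in zip(t, x))] += count   # branch with y = 1
        stage = nxt
    return stage


def condition_on(joint, t0):
    """Keep the counts of t1 for the vectors whose first statistic equals t0."""
    return {t[1]: c for t, c in sorted(joint.items()) if t[0] == t0}


joint = shift_joint(X)
conditional = condition_on(joint, 2)
print(conditional)  # {1: 2, 2: 2, 3: 2}
```

Both functions return the same joint counts for this example; the difference is that the brute-force version visits all 16 vectors, while the stage-wise version only carries the distinct nodes of each stage forward.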
The previous example is used to illustrate how this relation is exploited.

Figure 3 displays a tree diagram where each row (after the first) corresponds to an observation $i$, and each node of the tree is denoted by a pair of digits representing the value of $(t_0, t_1)$. The top node in the tree is initially set to 00, and indicates that $t_0 = 0$ and $t_1 = 0$, or $t_{(0)} = (0, 0)$. Each row of the tree is numbered; these numbers represent the stages of the algorithm. To move down the branches, add $y_i$ times the next value of (x0 x1) to the current value of $(t_0, t_1)$, for $y_i = 0$ and $y_i = 1$. For example, starting at the zeroth stage with 00, take $00 + 0 \times (1\ 1) = 00$ as the value of the left branch of the first stage, and $00 + 1 \times (1\ 1) = 11$ for the right branch.

0:                        00
1:            00                      11
2:      00          11          11          22
3:   00    12    11    23    11    23    22    34
4: 00 10 12 22 11 21 23 33 11 21 23 33 22 32 34 44

Figure 3. Stages of the Multivariate Shift Algorithm

The following table displays the distribution created from the frequency table of the 16 possible $t$ vectors from the final stage of Figure 3.

 t0    t1    Frequency    Probability
  0     0        1           1/16
  1     0        1           1/16
  1     1        2           2/16
  1     2        1           1/16
  2     1        2           2/16
  2     2        2           2/16
  2     3        2           2/16
  3     2        1           1/16
  3     3        2           2/16
  3     4        1           1/16
  4     4        1           1/16
total           16             1

The conditional distribution obtained for the observed $t_0 = 2$ is the same as previously generated.

There are five shortcuts you can observe from the example:

• In the third stage, there is no way to get from $t_0 = 0$ to $t_0 = 2$ in one step by adding 0 or 1 times $x_0 = 1$; similarly, if the value of $t_0$ in the third stage is 3, it cannot be reduced to the necessary value of 2. These illustrate what Hirji, Mehta, and Patel (1987) call infeasibility criteria.

• The infeasibility criterion is more effective when the larger covariate values are processed first. For example, if the value of x0 for the fourth observation was 2 instead of 1, then you could obtain a $t_0 = 2$ from the 00 third-stage node, and hence you would have to process the extra nodes.

• There are two 11 nodes in the second stage of Figure 3, and the branches below those two values are identical. Computation time is significantly reduced if you process an entire stage and combine identical nodes; however, the trade-off is that a list of all valid nodes in a stage must be saved, increasing memory usage. Note that, in order to obtain the correct distribution, each node descended from this combined node must count as two outcomes.

• Since the first two observations have the same covariate values, you can jump from stage 0 to stage 2 by combining the first two observations, incrementing the values in stage 0 along three branches with $k \times (1\ 1)$ for $k = 0, 1, 2$, and modifying the counts by $\binom{2}{k}$. This saves search time at the expense of computing binomial coefficients.

• Once a distribution is computed for a set of effects, a distribution for any subset of these effects can be produced by scanning the larger distribution. In the example, the conditional distribution for $t_1$ was produced from the joint distribution by extracting members having $t_0 = 2$.

PROC LOGISTIC's implementation of the multivariate shift algorithm automatically utilizes these shortcuts to improve performance. The bulk of the computation time and memory is consumed by the creation of the exact joint distribution. After the joint distribution for a set of effects is created, the computational effort required to produce hypothesis tests and parameter estimates for any subset of the effects is (relatively) trivial.

EXACT CAPABILITIES OF PROC LOGISTIC

The exact conditional logistic regression analysis in PROC LOGISTIC provides
• two tests for the null hypothesis that the parameters for the effects specified in the EXACT statement are zero: the exact probability test and the exact conditional scores test. For each test, the Conditional Exact Tests table displays
    - a test statistic
    - an exact p-value, which is the probability of obtaining a more extreme statistic than the observed, assuming the null hypothesis
    - a mid p-value, which adjusts for the discreteness of the distribution

• parameter estimates and odds ratios for each effect in the EXACT statement conditional on the values of all the other parameters in the model. For each estimate, the Exact Parameter Estimates and Exact Odds Ratios tables display
    - the exact conditional maximum likelihood estimate (CMLE), or, in cases where the CMLE does not exist, the median unbiased estimate
    - one- or two-sided confidence limits
    - a one- or two-sided p-value for testing that the parameter estimate is zero or the odds ratio is one

• optionally, output data sets containing the derived distributions and summary statistics

Note that hypothesis tests can be generated for each individual effect in an EXACT statement or for all effects simultaneously. See the appendix for more detailed information about the reported tests and statistics.

SYNTAX

The following statements control the exact analyses in the LOGISTIC procedure. Items within the <> are optional.

PROC LOGISTIC <EXACTONLY>
              <EXACTOPTIONS(options)>;
EXACT <'label'> effects </ options>;

Several EXACT statements may be specified in any program, but they must follow the MODEL statement. The new EXACTOPTIONS option in the PROC LOGISTIC statement affects every exact analysis requested, whereas options in an EXACT statement are local to that statement. For each EXACT statement, you can include an identifying label enclosed in quotes, and specify any effects in the MODEL statement or the keyword INTERCEPT. The analysis conditions on any other effects (possibly including the intercept) not specified in the EXACT statement.

PROC LOGISTIC Options

The EXACTONLY option suppresses the unconditional likelihood analyses that PROC LOGISTIC usually performs, and only the exact analyses are executed. Input data sets can be in single-trial or events/trials form, but the response variable must have at most two levels. Options specified in parentheses after the EXACTOPTIONS option apply to every EXACT statement in the program. The following options are available:

MAXTIME=seconds
STATUSTIME=seconds

The MAXTIME= option specifies the maximum clock time (in seconds) that PROC LOGISTIC can use to calculate the permutation distributions. If the limit is exceeded, the procedure halts all computations and prints a note to the SAS log. The default maximum clock time is seven days.

The STATUSTIME= option specifies a time interval (in seconds) for printing a status line in the SAS log. You can use this status line to track the progress of the computation of the exact conditional distributions. The time interval you specify is approximate; the actual time intervals may vary for larger problems. By default, no status reports are produced.

EXACT Options

Several options can be specified in each EXACT statement. The available options are

ALPHA=value
ESTIMATE<=keyword>
JOINT
JOINTONLY
ONESIDED
OUTDIST=SAS data set

The ALPHA= option specifies the significance level for the confidence limits for the parameters; the (default) value of 0.05 results in 95% confidence limits.

The ESTIMATE option requests parameter estimates, confidence intervals, and tests for each individual parameter (conditional on all other parameters) specified in the EXACT statement. Optional keywords can be specified; the default ESTIMATE=PARM option requests parameter estimates, ESTIMATE=ODDS requests the odds ratios, and ESTIMATE=BOTH requests both parameter estimates and the odds ratios.

The JOINT option requests a test that all the parameters for the EXACT statement are simultaneously equal to zero in addition to the tests of the individual parameters, while the JOINTONLY option suppresses the default individual tests. The test is indicated in the Conditional Exact Tests table by the label Joint.

The ONESIDED option requests one-sided confidence intervals and p-values for the individual parameter estimates and odds ratios. Note that the two-sided p-values are twice the one-sided p-values.
The OUTDIST= data set contains all of the exact conditional distributions requested in its EXACT statement. This data set contains the possible sufficient statistics for the effects specified in the EXACT statement, the counts derived from the multivariate shift algorithm, the probability of occurrence, and the score value for each sufficient statistic. When you request an OUTDIST= data set, the observed sufficient statistics are displayed in the Sufficient Statistics table.

Use with Other Statements and Options

Several existing options can be used in conjunction with the EXACT statement. You can define classification effects and strata using the CLASS statement, you can process the data using BY groups, and you can include a frequency variable with the FREQ statement. The NOINT option in the MODEL statement suppresses the intercept term.

Exact analyses are not performed when you specify a WEIGHT statement, a non-logit link, an offset variable, the NOFIT option, or a model-selection method.

If you receive messages indicating that the Newton-Raphson iterations for the parameter estimates or confidence intervals did not converge, specifying the ABSFCONV=, FCONV=, XCONV=, or MAXITER= options in the MODEL statement may help.

Output

PROC LOGISTIC presents the exact conditional analysis results in several tables:

• The Conditional Exact Tests table displays the score and probability statistics for testing that all parameters for the specified effects are zero. By default, tests for a single-effect model are produced, but tests for multiple-effect models can also be requested. Exact and mid p-values are also generated.

• The Exact Parameter Estimates table displays the individual parameter estimates (conditional on all other parameters in the model), confidence limits, and a p-value for testing that the parameter is zero.

• The Exact Odds Ratios table displays odds ratios for individual parameters, confidence limits, and a p-value for testing that the odds ratio is 1.

• The Sufficient Statistics table displays the sufficient statistic for each parameter in the model. This table is only generated when you also specify the OUTDIST= option to output the distribution to a SAS data set. The information is useful for certain further analyses.

As with all SAS procedure output, you can use ODS (Output Delivery System) to create output data sets of the values included in these tables by specifying a statement such as the following:

ods output SuffStats=suff ExactTests=test
           ExactParmEst=est ExactOddsRatio=odds;

Note that, at this writing, the exact facilities are still under development and the syntax and listing format may change.

EXAMPLES

The following examples illustrate different types of exact analysis. The data in these examples were constructed solely for illustrative purposes. The Sparse Data example illustrates that the MLE for the unconditional likelihood analysis may not exist, rendering the asymptotic inference impossible, while the exact conditional inference is still plausible. The Stratified Analyses example demonstrates how to use exact conditional analysis to adjust for within-strata correlation. The Crossover Clinical Trial example is a popular phase II analysis for the pharmaceutical industry.

Sparse Data

There are several types of data for which unconditional maximum likelihood estimates fail to exist, or for which the theory is not applicable. For data with small cell counts, tests based on the asymptotic normality of the maximum likelihood estimates may not be valid. For other data, the maximum likelihood estimates may not exist and the estimated dispersion matrix may be unbounded. In this example, the data set separate contains variables which perfectly predict the response, yielding a complete separation of data points.

data separate;
   input A B Response count @@;
   datalines;
0 0 1 1 0 1 0 2 1 0 1 8 1 1 1 21
;

The following statements fit the logistic regression model $\mathrm{logit}(\pi) = \alpha + \beta_1 A + \beta_2 B$. The JOINT option tests the joint hypothesis that $\beta_1 = \beta_2 = 0$, and the ESTIMATE option produces the individual parameter estimates of $\beta_1$ and $\beta_2$. The OUTDIST= option creates a data set containing all permutation distributions required for this analysis.

proc logistic data=separate;
   freq count;
   model Response=A B;
   exact A B / joint estimate
               outdist=dist;
run;
proc print data=dist;
run;

Figure 4 shows that the usual asymptotic analysis indicates that complete separation has occurred. You can see that the parameter estimates do not converge if you specify both the ITPRINT and NOCHECK options in the MODEL statement. However, exact tests and estimates for the conditional analysis can still be computed and are displayed in Figure 5.

Model Convergence Status
Complete separation of data points detected.

Figure 4. Convergence Status

In Figure 5, the joint exact test of A and B is significant, but the B parameter appears insignificant. The median unbiased estimate is created instead of the CMLE because the value of the observed sufficient statistic lies at an extreme of the derived distribution, implying that the CMLE does not exist. Even though the asymptotic results are unreliable, the exact analysis allows you to conclude that there is a significant effect due to A.

Sufficient Statistics

Parameter    Value
Intercept        2
A                0
B                2

Conditional Exact Tests

                                       --- p-Value ---
Effect    Test           Statistic     Exact       Mid
Joint     Score            21.1153    0.0020    0.0010
          Probability      0.00202    0.0020    0.0010
A         Score            22.0000    0.0040    0.0020
          Probability      0.00395    0.0040    0.0020
B         Score             2.0000    0.3333    0.1667
          Probability       0.3333    0.3333    0.1667

Exact Parameter Estimates

                            95% Confidence
Parameter    Estimate           Limits           p-Value
A            -3.8398*    -Infinity    -1.0718     0.0079
B             0.6931*      -2.9704   Infinity     0.6667

NOTE: * indicates a median unbiased estimate.

Figure 5. Output from EXACT Analysis

Figure 6 displays the three permutation distributions created with the OUTDIST= option; the joint distribution of A and B conditional on the intercept is contained in observations 1 through 8, the distribution for A conditional on the intercept and B is in observations 9 through 11, and the distribution for B conditional on the intercept and A is in observations 12 and 13. The Sufficient Statistics table in Figure 5 allows you to identify the row that contains the observed values. You can see that it is intercept=2, A=0, B=2, corresponding to the second, ninth, and thirteenth rows in Figure 6. Note that only the joint distribution for the A and B variables was computed from the multivariate shift algorithm; the univariate conditional distributions were extracted from the joint distribution to save CPU time. The OUTDIST= data set has three values in the distribution for the A variable and two for the B variable. If the permutation distribution is degenerate (has only one value), then the procedure does not produce any statistics and does not output the distribution. However, for small distributions, you have to decide whether there is enough information on which to base the estimates; in this simple example, there is probably too little information contained in the conditional distribution for the B variable.

Obs    A    B    Count      Score       Prob
  1    0    1        2    20.2622    0.00403
  2    0    2        1    21.1153    0.00202
  3    1    0        8     8.9654    0.01613
  4    1    1       37     4.4055    0.07460
  5    1    2       42     4.9644    0.08468
  6    2    0       28     5.5822    0.05645
  7    2    1      168     0.7281    0.33871
  8    2    2      210     0.9929    0.42339
  9    0    .        1    22.0000    0.00395
 10    1    .       42     4.5023    0.16601
 11    2    .      210     0.1995    0.83004
 12    .    1        2     0.5000    0.66667
 13    .    2        1     2.0000    0.33333

Figure 6. OUTDIST= Data Set

Stratified Analyses

If your data are collected from different hospitals or different families, you can perform a stratified analysis to control for the within-group correlation. The strata are treated as nuisance parameters and a conditional likelihood removes them from the analysis. Your model contains a different intercept term for each stratum:

$$\mathrm{logit}(\pi_{hi}) = \alpha_h + x_{hi}'\beta$$

where $h$ indexes the strata, $\alpha_h$ are the strata intercepts, and $i$ indexes the subjects within the strata.

With PROC LOGISTIC, you can specify a stratification variable by including it in the CLASS statement. For example, a stratification variable that has three levels can be parameterized as

Stratum    Level 1    Level 2
   1          1          0
   2          0          1
   3          0          0
where the usual intercept term represents the last strata level, and the other strata levels are a combination of the intercept and the appropriate level term. This is defined in the CLASS statement with the PARAM=REF option. Alternatively, you can parameterize the stratum variable as

Stratum    Level 1    Level 2    Level 3
   1          1          0          0
   2          0          1          0
   3          0          0          1

This is defined in the CLASS statement with the PARAM=GLM option. Since strata and intercepts are conditioned out of this analysis, either form is reasonable.

The stratified data set includes a response variable Y, two explanatory variables X1 and X2, and a stratification variable. The Z variable will be used in a later analysis.

data stratified;
   input Stratum Y X1 X2 count @@;
   Z = 2 - Y;
   datalines;
1 0 1 1 1   2 0 1 2 3   3 0 1 0 2
1 0 2 1 1   2 0 2 2 3   3 0 2 1 1
1 1 1 0 1   2 1 2 0 1   3 1 1 0 1
1 1 2 0 1   2 1 3 1 2   3 1 2 2 2
1 1 3 0 2               3 1 3 2 1
;

In the following statements, the stratification variable, which is defined in the CLASS statement, is included in the MODEL statement but left out of the EXACT statement, implying that it is a nuisance effect to be conditioned on for the analysis of the X1 and X2 effects of interest.

proc logistic descending exactonly;
   freq count;
   class Stratum / param=ref;
   model Y=Stratum X1 X2;
   exact X1 X2 / jointonly estimate;
run;

In Figure 7, the joint exact test for the X1 and X2 parameters rejects the null hypothesis. However, the X2 parameter appears insignificant.

Conditional Exact Tests

                                       --- p-Value ---
Effect    Test           Statistic     Exact       Mid
Joint     Score             7.9291    0.0165    0.0162
          Probability     0.000612    0.0077    0.0074

Exact Parameter Estimates

                           95% Confidence
Parameter    Estimate          Limits          p-Value
X1             1.9979     0.3140     5.2012     0.0126
X2            -1.0097    -2.9152     0.4142     0.1931

Figure 7. Exact Results

This exact analysis should be compared to an asymptotic conditional likelihood analysis, which is available with the PHREG procedure. First, define a variable Z to be 1 if the response is an event and 2 if the response is a nonevent. This variable is used as the time variable as well as the censoring indicator (with 2 as the censored value) in the MODEL statement of PROC PHREG. Also specify the TIES=DISCRETE option to request the discrete logistic model, and the STRATA statement to specify the strata to be conditioned on.

proc phreg;
   freq count;
   strata Stratum;
   model Z*Z(2)=X1 X2 / ties=discrete;
run;

The output of PROC PHREG is shown in Figure 8.

Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio        9.6425     2        0.0081
Score                   7.9291     2        0.0190
Wald                    4.6510     2        0.0977

Analysis of Maximum Likelihood Estimates

                  Parameter    Standard                                Hazard
Variable    DF     Estimate       Error    Chi-Square    Pr > ChiSq     Ratio
X1           1      2.32474     1.11585        4.3404        0.0372    10.224
X2           1     -1.11430     0.72917        2.3353        0.1265     0.328

Figure 8. PROC PHREG Results

Comparing Figure 7 with Figure 8, you can see that the value of the conditional score statistic for testing the overall null hypothesis $H_0\colon \beta_1 = \beta_2 = 0$ is 7.9291 for both the asymptotic conditional analysis in PROC PHREG and the exact analysis in PROC LOGISTIC. However, PROC PHREG computes a p-value of 0.0190 by comparing the value of the conditional score statistic to a chi-squared distribution with 2 degrees of freedom (since there are two parameters), while PROC LOGISTIC derives a p-value of 0.0165 from the exact conditional distribution. Inference on individual parameters is often not the same between the exact conditional analysis and the asymptotic conditional likelihood results.

Crossover Clinical Trial

One common use of conditional logistic regression is in a crossover clinical trial. In this example, the subjects are given a sequence of drugs, and their response to each drug is recorded. Each subject is considered to be a separate stratum. The goal is to determine if the drugs have the same effect, adjusting for period and carryover effects. In this example, researchers give 15 different subjects three different drugs (A, B, P=placebo) in three consecutive periods (P1, P2, P3), and their response in each period is 1 for improvement and 0 for no improvement. The carryover effect is a classification variable indicating which drug was given in the preceding period.
8
data Crossover (drop=P1 P2 P3);
   input Subject P1$ P2$ P3$ Improve @@;
   /* Carry is character: '0' means no drug was given in a preceding period */
   Period=1; Drug=P1; Carry='0'; output;
   input Improve @@;
   Period=2; Drug=P2; Carry=P1; output;
   input Improve @@;
   Period=3; Drug=P3; Carry=P2; output;
   datalines;
 1 A B P 0 0 0    8 B P A 0 0 1
 2 A B P 1 1 0    9 B P A 1 0 1
 3 A B P 0 1 1   10 B P A 0 1 0
 4 A P B 1 0 1   11 P A B 0 1 0
 5 A P B 1 0 0   12 P B A 1 0 1
 6 B A P 0 0 0   13 P B A 0 0 1
 7 B A P 1 1 0   14 P B A 0 1 0
15 P B A 0 1 1
;

The model to be fit is

$$\mathrm{logit}(\pi_{ij}) = \alpha_i + \beta_1 I(\text{Drug A}) + \beta_2 I(\text{Drug B}) + \beta_3 I(j=1) + \beta_4 I(j=2)$$

where $i$ indexes the subject, the $\alpha_i$ are the subject intercepts, $j$ indexes the period, and the $I(\cdot)$ are indicator variables taking the value 1 when the condition is true. Note that this model ignores carryover effects.

proc logistic data=Crossover descending exactonly;
   class Subject Drug Period / param=ref;
   model Improve=Subject Drug Period;
   exact 'one' Drug Period / jointonly;
   exact 'two' Drug / jointonly;
   exact 'three' Period / jointonly;
run;

Even though three EXACT statements are invoked in this example, PROC LOGISTIC only computes the permutation distribution for the joint test of the drug and period parameters; the other two distributions are derived from the joint distribution.

According to the exact conditional score test of significance of all the drug and period parameters, you cannot reject the null hypothesis. However, the exact conditional score p-value for the test of no drug effects, $H_0\colon \beta_1 = \beta_2 = 0$, differs from the p-value for the test of no period effects, $H_0\colon \beta_3 = \beta_4 = 0$, and the latter suggests that the period term should be dropped from this model.

APPENDIX
Hypothesis Tests
Using the same notation as in the METHODOLOGY section, consider testing the null hypothesis $H_0\colon \beta_1 = 0$ against the alternative $H_A\colon \beta_1 \neq 0$, conditional on $T_0 = t_0$. Under the null hypothesis, the test statistic for the exact probability test is just $f_{\beta_1 = 0}(t_1 \mid t_0)$, while the corresponding p-value is the probability of getting a less likely (more extreme) statistic,

$$p(t_1 \mid t_0) = \sum_{u \in \Omega_p} f_0(u \mid t_0)$$

where $\Omega_p = \{u\colon$ there exist $y$ with $y'X_1 = u$, $y'X_0 = t_0$, and $f_0(u \mid t_0) \le f_0(t_1 \mid t_0)\}$.

For the exact conditional scores test, the conditional mean $\mu_1$ and variance matrix $\Sigma_1$ of the $T_1$ (conditional on $T_0 = t_0$) are calculated, and the score statistic for the observed value,

$$s = (t_1 - \mu_1)' \Sigma_1^{-1} (t_1 - \mu_1)$$

is compared to the score for each member of the distribution

$$S(T_1) = (T_1 - \mu_1)' \Sigma_1^{-1} (T_1 - \mu_1)$$

The resulting p-value is

$$p(t_1 \mid t_0) = \Pr(S \ge s) = \sum_{u \in \Omega_s} f_0(u \mid t_0)$$

where $\Omega_s = \{u\colon$ there exist $y$ with $y'X_1 = u$, $y'X_0 = t_0$, and $S(u) \ge s\}$.

The mid-p statistic, defined as

$$p(t_1 \mid t_0) - \tfrac{1}{2} f_0(t_1 \mid t_0)$$

was proposed by Lancaster (1961) to compensate for the discreteness of a distribution. Refer to Agresti (1992) for more information.

Inference for a Single Parameter
Exact parameter estimates are derived for a single parameter $\beta_i$ by regarding all the other parameters, collected in $\beta_0$, as nuisance parameters. The appropriate sufficient statistics are $T_i$ and $T_0$, with their observed values denoted by the lowercase $t$. Hence, the conditional pdf used to create the parameter estimate for $\beta_i$ is

$$f_{\beta_i}(t_i \mid t_0) = \frac{C(t_0, t_i) \exp(t_i \beta_i)}{\sum_{u \in \Omega} C(t_0, u) \exp(u \beta_i)}$$

for $\Omega = \{u\colon$ there exist $y$ with $T_i = u$ and $T_0 = t_0\}$.

The maximum exact conditional likelihood estimate is the quantity $\hat{\beta}_i$ which maximizes the conditional pdf.
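As a rough illustration of the conditional pdf and the exact probability test, the following Python sketch enumerates the permutation distribution for the dose data set from the introduction (doses 0 to 5, three subjects per dose, deaths 0, 0, 0, 0, 1, 2). It is a brute-force enumeration written for this illustration, not the algorithm used by PROC LOGISTIC; the function names are hypothetical, and the maximizer is found by a simple bisection on the score equation, which is an assumption made for brevity.

```python
from collections import Counter
from itertools import product
from math import comb, exp

# Dose data from the introduction; all names below are illustrative.
DOSES = [0, 1, 2, 3, 4, 5]
N_PER_DOSE = 3
DEATHS = [0, 0, 0, 0, 1, 2]


def conditional_counts(doses, n_per_dose, total_events):
    """Count the response vectors with the given event total, grouped by
    the dose-weighted event sum u (the sufficient statistic for the slope)."""
    counts = Counter()
    for deaths in product(range(n_per_dose + 1), repeat=len(doses)):
        if sum(deaths) != total_events:
            continue
        u = sum(x * d for x, d in zip(doses, deaths))
        ways = 1
        for d in deaths:
            ways *= comb(n_per_dose, d)  # ways to choose who dies at this dose
        counts[u] += ways
    return counts


def conditional_pdf(counts, beta):
    """Conditional pdf of the slope statistic, given the event total."""
    weights = {u: c * exp(u * beta) for u, c in counts.items()}
    total = sum(weights.values())
    return {u: w / total for u, w in weights.items()}


def conditional_mean(counts, beta):
    return sum(u * p for u, p in conditional_pdf(counts, beta).items())


def conditional_mle(counts, t_obs, lo=-10.0, hi=10.0, tol=1e-10):
    """Bisection on the score equation t_obs - E_beta[T] = 0.
    The bracket [lo, hi] is an assumption that suits this small example."""
    if t_obs in (min(counts), max(counts)):
        raise ValueError("observed statistic is at an extreme; no finite maximizer")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if conditional_mean(counts, mid) < t_obs:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


t0 = sum(DEATHS)                                  # conditioning statistic (3)
t1 = sum(x * d for x, d in zip(DOSES, DEATHS))    # observed slope statistic (14)
counts = conditional_counts(DOSES, N_PER_DOSE, t0)

# Exact probability test: add up the null probabilities of all outcomes
# that are no more likely than the observed one.
f0 = conditional_pdf(counts, 0.0)
p_exact = sum(p for p in f0.values() if p <= f0[t1] * (1 + 1e-12))
mid_p = p_exact - 0.5 * f0[t1]

beta_hat = conditional_mle(counts, t1)
print(f"p = {p_exact:.4f}, mid-p = {mid_p:.4f}, estimate = {beta_hat:.4f}")
```

For these data the distribution has 816 equally likely response vectors under the null hypothesis, and the exact probability test p-value works out to 20/816, or about 0.0245.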
A Newton-Raphson algorithm is used to search for the maximizing value. However, if the observed $t_i$ attains either its minimum or maximum value in the permutation distribution (that is, either $t_i = \min\{u\colon u \in \Omega\}$ or $t_i = \max\{u\colon u \in \Omega\}$), then the conditional pdf is monotone in $\beta_i$ (decreasing at the minimum, increasing at the maximum) and cannot be maximized. In this case, a median unbiased estimate (Hirji, Tsiatis, and Mehta 1989; Hirji and Tang 1998) $\hat{\beta}_i$ is produced that satisfies $f_{\hat{\beta}_i}(t_i \mid t_0) = 0.5$, and a Newton-Raphson-type algorithm is used to perform the search.

Likelihood ratio tests based on the conditional pdf are used to test the null $H_0\colon \beta_i = 0$ against various alternatives. For testing against the alternative $H_A\colon \beta_i > 0$, the critical region for the UMP test consists of the upper tail of values for $T_i$ in the permutation distribution. Thus, the one-sided significance level $p_+(t_i; 0)$ is the probability of a more extreme (greater) value:

$$p_+(t_i; 0) = \sum_{u \ge t_i} f_0(u \mid t_0)$$

The one-sided significance level $p_-(t_i; 0)$ against $H_A\colon \beta_i < 0$ is

$$p_-(t_i; 0) = \sum_{u \le t_i} f_0(u \mid t_0)$$

The minimum of these one-sided levels is reported when the ONESIDED option is specified. The two-sided significance level $p(t_i; 0)$ against $H_A\colon \beta_i \neq 0$ is calculated as

$$p(t_i; 0) = 2 \min[p_-(t_i; 0), p_+(t_i; 0)]$$

An upper $100(1 - 2\epsilon)$% confidence limit for $\hat{\beta}_i$ corresponding to the observed $t_i$ is the solution $\beta_U(t_i)$ of $\epsilon = p_-(t_i; \beta_U(t_i))$, while the lower confidence limit is the solution $\beta_L(t_i)$ of $\epsilon = p_+(t_i; \beta_L(t_i))$, where $p_\pm(t_i; \beta)$ denotes the same tail sums computed under $f_\beta$. A Newton-Raphson procedure is used to search for the solutions.

REFERENCES
Agresti, Alan (1990), Categorical Data Analysis, New York: John Wiley & Sons, Inc.
Agresti, A. (1992), "A Survey of Exact Inference for Contingency Tables," Statistical Science, 7, 131-177.
Cox, D.R. (1970), Analysis of Binary Data, New York: Chapman and Hall.
Cox, D.R. and Snell, E.J. (1989), Analysis of Binary Data, Second Edition, New York: Chapman and Hall.
Hirji, Karim F., Mehta, Cyrus R., and Patel, Nitin R. (1987), "Computing Distributions for Exact Logistic Regression," JASA, 82, 1110-1117.
Hirji, Karim F. and Tang, Man-Lai (1998), "A Comparison of Tests for Trend," Communications in Statistics - Theory and Methods, 27, 943-963.
Hirji, Karim F., Tsiatis, Anastasios A., and Mehta, Cyrus R. (1989), "Median Unbiased Estimation for Binary Data," American Statistician, 43, 7-11.
Lancaster, H. O. (1961), "Significance Tests in Discrete Distributions," JASA, 56, 223-234.
Mehta, Cyrus R. and Patel, Nitin R. (1995), "Exact Logistic Regression: Theory and Examples," Statistics in Medicine, 14, 2143-2160.
Stokes, Maura E., Davis, Charles S., and Koch, Gary G. (1995), Categorical Data Analysis Using the SAS System, Cary, NC: SAS Institute Inc.

CONTACT INFORMATION
Your comments and questions are valued and encouraged. Contact the author at
Robert E. Derr, SAS Institute Inc., SAS Campus Drive, R5245, Cary, NC 27513.
Phone (919) 677-8000 ext 6137.
FAX (919) 677-4444.
E-mail Bob.Derr@sas.com
ACKNOWLEDGMENTS
I am grateful to Virginia Clark, Greg Goodwin, Ying So, Maura Stokes, and Randy Tobias of the Applications Division at SAS Institute for their valuable assistance in the preparation of this manuscript.

SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are registered trademarks or trademarks of their respective companies.
Version 3.0

