Journal of Statistical Software

JSS Journal of Statistical Software
November 2014, Volume 61, Issue 11. http://www.jstatsoft.org/
Fitting Accelerated Failure Time Models in Routine

Survival Analysis with R Package aftgee
Sy Han Chiou Sangwook Kang Jun Yan

University of Minnesota, Duluth Yonsei University University of Connecticut
Abstract
Accelerated failure time (AFT) models are alternatives to relative risk models which
are used extensively to examine the covariate effects on event times in censored data
regression. Nevertheless, AFT models have been much less utilized in practice due to
lack of reliable computing methods and software. This paper describes an R package aft-
gee that implements recently developed inference procedures for AFT models with both
the rank-based approach and the least squares approach. For the rank-based approach,
the package allows various weight choices and uses an induced smoothing procedure that
leads to much more efficient computation than the linear programming method. With the
rank-based estimator as an initial value, the generalized estimating equation approach is
used as an extension of the least squares approach to the multivariate case. Additional
sampling weights are incorporated to handle missing data needed as in case-cohort stud-
ies or general sampling schemes. A simulated dataset and two real life examples from
biomedical research are employed to illustrate the usage of the package.
Keywords: case-cohort, efficiency, Gehan weight, generalized estimating equation, Gρ class,

induced smoothing, least squares, log-rank, Prentice-Wilcoxon weight, rank-based, weighted
estimating equation.
1. Introduction
The linear regression model is the most commonly used regression model in data analysis
for uncensored data. When survival data are right-censored, two of the most frequently used
regression models are the relative risk model (Cox 1972) and the accelerate failure time (AFT)
model (e.g., Kalbfleisch and Prentice 2002, Chapter 4). The AFT model is appealing because
it is analogous to the classical linear regression approach, directly linking the expected failure
time to covariates. The AFT model with an unspecified error distribution is known as the
semiparametric AFT model, which has been studied extensively and is an alternative to the
2 aftgee: Fitting AFT Models with R
relative risk model with an unspecified baseline hazard function. Two methods for fitting
such models have been popular. One is the rank-based approach motivated by inverting
the weighted log-rank test (Prentice 1978). Its asymptotic properties have been rigorously
studied by Tsiatis (1990) and Ying (1993). The other method is an extension of the least
squares principle, such as the Buckley-James (BJ) estimator (Buckley and James 1979). The
theoretical properties of the BJ estimator were investigated in Ritov (1990) and Lai and Ying
(1991). Due to lack of efficient and reliable computing algorithms, both approaches have
not been widely used in practice until recently (Jin, Lin, Wei, and Ying 2003; Jin, Lin, and
Ying 2006b,c). Our R package aftgee (Chiou, Kang, and Yan 2014c) aims to provide an easy
access to AFT models with both methods based on the recent methodological developments.
Package aftgee is available from the Comprehensive R Archive Network (CRAN) at http:
//CRAN.R-project.org/package=aftgee.
Several packages for AFT models are available for the R environment (R Core Team 2014).
For parametric AFT models, where the error distribution is parametrically specified, one can
use survreg in package survival (Therneau 2014), psm in package rms (Harrel 2014) or aftreg
in package eha (Brostöm 2014). Misspecified error distributions in parametric AFT modeling
may lead to bias in estimation and false conclusion under the presents of censoring. For
semiparametric AFT models with unspecified error distribution, one can use bj in package
rms (Harrel 2014) or lss in package lss (Jin and Huang 2007). Function bj provides the
BJ estimator but it has several limitations: it computes the variance estimator based on
non-censored observations only which, although this has been reported to behave well in
simulation studies, lacks theoretical justification (Wei 1992); its convergence is slow and not
guaranteed; and it is only implemented for univariate failure time data. Package lss provides a
rank-based estimator with Gehan’s weight obtained from a linear programming approach (Jin
et al. 2003) and a least squares estimator with an iterative algorithm starting from the rank-
based estimator (Jin et al. 2006b). The variance estimators for both methods are bootstrap
based with validity theoretically justified. Nevertheless, there are several features lss fall
short. Its rank-based estimator is limited to Gehan’s weight which may not be the optimal
weight (Tsiatis 1990). The linear programming approach used for the rank-based estimator
is computationally very intensive, which also affects the least squares estimator through the
initial estimator. The bootstrap based variance estimation is very time consuming. Although
easily fixable, the package does not support user-specified initial values for the least squares
estimator. For clustered failure times, it operates with working independence and disregards
the within-cluster dependence, which may lead to efficiency loss especially when the within-
cluster dependence is strong (Chiou, Kang, Kim, and Yan 2014a).
Our package aftgee overcomes the aforementioned limitations in existing implementations and
provides a set of comprehensive tools for semiparametric AFT models in practical survival
analysis. For the rank-based estimator with Gehan’s weight, we implemented the induced
smoothing approach which is much faster than the linear programming approach without
loss in accuracy (Brown and Wang 2005, 2007). The induced smoothing approach has been
extended to work with any general weight (in addition to Gehan’s weight; Chiou, Kang, and
Yan 2013). Our efficient sandwich variance estimators provide much faster alternatives to the
full bootstrap variance estimation (Chiou, Kang, and Yan 2014b). With the fast rank-based
estimators as initial estimators, we implemented an iterative least squares procedure method
that extends generalized estimating equations (GEE) to clustered censored data (Chiou et al.
2014a). The resulting estimator is robust to misspecification of the working covariance matrix,
Journal of Statistical Software 3
and the efficiency is higher when the working covariance structure is closer to the truth.
Furthermore, these methodologies are generalized to incorporate additional sampling weights
for handing missing data and various sampling schemes (Chiou, Kang, and Yan 2014d).
Because of these features, the aftgee package is appealing to analysts who would like to fit
AFT models in their routine analysis of survival data.
The rest of the article is organized as follows. In the next section, we introduce the notations
and model formulation for the univariate AFT model. The multivariate extension is presented
in Section 3. Incorporation of sampling weight with application to case-cohort data is extended
in Section 4. Detailed usages are described in Section 5. A simulated dataset, an univariate
example and a multivariate example are used for illustration in Section 6. Conclusion and
some remarks are summarized in Section 7.
2. Univariate AFT model

For i = 1, . . . , n, let Ti , Ci and Xi be the log-transformed failure time, censoring time and the
p × 1 covariate vector for the ith subject. It is assumed that Ti is conditionally independent
of Ci given Xi . An univariate semiparametric AFT model has the form
Ti = Xi> β + i , i = 1, . . . , n,
where β is an unknown p × 1 vector of regression parameters, i ’s are independent and iden-

tically distributed random variables with an unspecified distribution. It is also assumed that
i ’s are independent of Xi . In the presence of right censoring, the observed data are indepen-
dent copies of (Yi , ∆i , Xi ), i = 1, . . . , n, where Yi = min(Ti , Ci ), ∆i = I(Ti < Ci ), and I(·) is
the indicator function.
2.1. Rank-based estimator

The regression parameters can be estimated by solving the rank-based weighted estimating
equation
n
" Pn #
j=1 Xj I[ej (β) ≥ ei (β)]
X
Un,ϕ (β) = ϕi (β)∆i Xi − Pn = 0, (1)
i=1 j=1 I[ej (β) ≥ ei (β)]
where ei (β) = Yi − Xi> β and ϕi (β) is a possibly data-dependent nonnegative weight func-
tion with values between 0 and 1. Let F̂ei (β) (t) be the estimated cumulative distribution
function
Pn based on the censored residual ei (β)’s. Some common choices of ϕi (β) are 1,
−1 ρ
n i=1 I[ej (β) ≥ ei (β)], 1 − F̂ei (β) (t) and [1 − F̂ei (β) (t)] , ρ ≥ 0, corresponding to log-
rank (Prentice 1978), Gehan (Gehan 1965), Prentice-Wilcoxon (Prentice 1978) and the more
general Gρ class (Harrington and Fleming 1982), respectively. The Kaplan-Meier estimator is
typically used to obtain F̂ei (β) (t). The solution of Equation 1, β̂n,ϕ , is consistent to the true
parameter, β0 , and is asymptotically normal (Tsiatis 1990; Ying 1993). Noting that Equa-
tion 1 with Gehan’s weight is the gradient of an objective function, Jin et al. (2003) used a
linear programming approach to obtain the estimator, which is computationally demanding,
especially for larger datasets and for obtaining variance estimators through bootstrap. In our
implementation, we used the Barzilai-Borwein spectral method implemented in package BB
(Varadhan and Gilbert 2009) to solve Equation 1 directly.
A computationally more efficient approach is the induced smoothing procedure of Brown

and Wang (2005, 2007). The idea is to replace the nonsmooth estimating equations with a
smoothed version, whose solutions are asymptotically equivalent to those of the former. Define
an independent p × 1 standard normal random vector Z and a p × p matrix Γn such that
Γ2n = Σn where Σn is a symmetric positive definite matrix. The induced smoothing procedure
replaces Un,ϕ (β) with EZ [Un,ϕ (β + n−1/2 Γn Z)], where the expectation is taken with respect
to Z. A choice of Σn is the identity matrix (Brown and Wang 2007). Some other forms of
Σn might be considered but the differences are minimal in practice (Chiou et al. 2014b).
With Gehan’s weight, the denominator of the ratio in Equation 1 gets canceled, and the
resulting smooth estimating equation is
n Xn
X ej (β) − ei (β)
Ũn,G (β) = ∆i (Xi − Xj )Φ = 0, (2)
rij
i=1 j=1
2 = (X − X )> Σ (X − X ) and Φ(·) is the standard normal cumulative distribution

where rij i j n i j
function. The estimating equation in Equation 2 is monotone and continuously differentiable
with respect to β, hence, its root can be found with standard numerical methods such as the
Barzilai-Borwein spectral method. The solution to Equation 2, denoted by β̃n,G , is consistent
to β0 and has the same asymptotic distribution as that obtained from the nonsmooth version
(Brown and Wang 2005, 2007). Its variance can be approximated from a resampling procedure
similar to that in Jin et al. (2003). Even with the smoothed equations, the resampling
procedure can still be very time consuming. Alternative variance estimation procedures, such
as those proposed by Chiou et al. (2014b), are recommended.
Deriving the smoothed estimating equations with general weights is challenging because
EZ [Un,ϕ (β + n−1/2 Γn Z)] involves the expectation of the ratio of two random quantities. In
addition, ϕi (β) also depends on β. Holding ϕi (β) evaluated at some initial estimator b, we
propose an approximation which replaces the expectation of the ratio with the ratio of the
expectations of the two terms:
n
" Pn #
j=1 Xj Φ[κij (β)]
X
Ũn,ϕ (b, β) = ∆i ϕi (b) Xi − Pn = 0, (3)
i=1 j=1 Φ[κij (β)]
where κij (β) = [ej (β) − ei (β)]/rij . The asymptotic equivalence between Equation 3 and the
smooth version of Equation 1 for the log-rank weight is established in Chiou et al. (2013).
For general weights, the regression parameters can be estimated from an iterative induced
smoothing procedure with the following steps:
(0)
1. Obtain an initial estimate β̃n,ϕ = bn of β and initialize with m = 1.
(m) (m−1) (m)
2. Update β̂n,ϕ by solving Ũn,ϕ (β̂n,ϕ , β̂n,ϕ ) = 0.
(m−1) (m)
3. Increase m by one and repeat Step 2 until |β̃n,ϕ,q − β̃n,ϕ,q | < t for all q = 1, . . . , p,
(m) (m)
where β̃n,ϕ,q is the qth component of β̃n,ϕ and t is a prefixed tolerance.
A simple choice of the initial estimator is the easy-to-compute Gehan’s estimator, β̃n,G .
Since estimating Equation 3 is not necessarily monotone in β, it might cause numerical prob-
lems in solving the estimating equations. Inspired by a discussion in Jin et al. (2003), a
reweighted induced smoothing strategy based on monotone estimating functions is considered

(Chiou et al. 2013):
n X
X n
Un,φ (β) = ∆i φi (β)(Xi − Xj )I[ej (β) ≥ ei (β)] = 0, (4)
i=1 j=1
where φi (β) = ϕi (β)/ nj=1 I[ej (β) ≥ ei (β)]. Fixing the weight φi (b) evaluated at b and
P
applying induced smoothing on Equation 4 lead to
n X
n
X ej (β) ≥ ei (β)
Ũn,φ (b, β) = ∆i φi (b)(Xi − Xj )Φ = 0. (5)
rij
i=1 j=1
This is the same as Equation 2 except for the weight φi (b) which is free from β. For an
initial estimator b of β, an estimator β̃n,φ can be obtained from the iterative procedure with
Ũn,ϕ (b, β) replaced by Ũn,φ (b, β). Using the arguments in Jin et al. (2003), the consistency and
asymptotic normality of the resulting estimators can be established (Chiou et al. 2013). The
equations within each iteration can be solved with package BB. The convergence is usually fast
with the initial Gehan’s estimator. Variance estimation can be done with the full resampling
method (Jin et al. 2003) or a fast sandwich variance estimator (Chiou et al. 2014b, 2013).
2.2. Least squares approach

The other method for fitting AFT model is the least squares approach. With survival data
from right censoring, Buckley and James (1979) replaced each response Ti with the condi-
tional expectation Ŷi (β) = Eβ (Ti |Yi , ∆i , Xi ), where the expectation is evaluated at regression
coefficients β. In particular,
"R ∞ #
ei (b) tdF̂ei (β) (t)
Ŷi (b) = ∆i Yi + (1 − ∆i ) + Xi> b .
1 − F̂ei (β) [ei (b)]
The theoretical properties of the BJ estimator have been studied by Ritov (1990) and Lai and
Ying (1991). The method, however, is rarely used in practice due to numerical challenges.
Jin, Lin, Wei, and Ying (2006a) proposed a more practical solution that generalizes the BJ
estimator. Given an initial estimator bn of β, the least squares estimator is the solution of
the following estimating equation
n
X
Un,ls (β, b) = (Xi − X̄)> (Ŷi (b) − Xi β) = 0, (6)
i=1
where X̄ = ni=1 Xi /n. The solution to Un,ls (β, β) = 0 is the BJ estimator. The advantage for
P
fixing the initial value bn is to avoid numerical complexity caused by solving Equation 6 which
is neither continuous nor monotone in β. Jin et al. (2006a) devised an iterative procedure
(m) (m−1) (0)
β̂n,ls = Ln (β̂n,ls ) for m > 1 with β̂n,ls = bn where
" n #−1 " n #
X X
Ln (b) = (Xi − X̄)> (Xi − X̄) (Xi − X̄)> Ŷi (b) − Ȳ (b) ,
i=1 i=1
and Ȳ (b) = ni=1 Ŷi (b)/n. If the initial estimator bn is consistent and asymptotically normal,
P
(m)
then β̂n,ls is also consistent and asymptotically normal for every m (Jin et al. 2006b). A good
candidate for the initial estimator is the induced smoothing Gehan estimator. The variance
of the resulting estimator can be approximated by a resampling procedure (Jin et al. 2006b).
3. Multivariate AFT model

Suppose now we have a random sample of n independent clusters with Ki margin in the ith
cluster. For i = 1, . . . , n and k = 1, . . . , Ki , let Tik , Cik and Xik be the log-transformed failure
time, censoring time and the p × 1 covariate vector for margin k in cluster i. Further define
the censored failure time and censoring indicator to be Yik = min(Tik , Cik ) and ∆ik = I(Tik <
Cik ). The multivariate AFT model takes the following form
>
Tik = Xik β + ik , i = 1, . . . , n, k = 1, . . . , Ki , (7)
where β is an unknown p × 1 vector of regression parameters and the error terms, i =

{i1 , . . . , iKi } are independent and identically distributed random variables with an unspeci-
fied distribution throughout clusters. In the presence of censoring, the observed data consists
of copies of {Yik , ∆ik , Xik }, for i = 1, . . . , n and k = 1, . . . , Ki where Yik = min(Tik , Cik ) and
∆ik = I(Tik < Cik ).
3.1. Rank-based estimator

The regression parameters in Equation 7 can be estimated from the following rank-based
weighted estimating equation
Ki
n X
" Pn PKj #
j=1 l=1 Xjl I[ejl (β) ≥ eik (β)]
X
Un,ϕ (β) = ϕik (β)∆ik Xik − Pn PKj = 0, (8)
i=1 k=1 j=1 l=1 I[ejl (β) ≥ eik (β)]
where eik (β) = Yik − Xik> β and ϕ (β) is a possibly data-dependent nonnegative weight func-
ik
tion. If Ki = 1 for all i = 1, . . . n, Equation 8 will reduce to Equation 1. This estimating
equation also yields a consistent estimator for β0 (Jin et al. 2006a). Applying the aforemen-
tioned induced smoothing technique, the smoothed version of Equation 8 with Gehan’s weight
is
n XKi X n X Kj
X ejl (β) − eik (β)
Ũn,G (β) = ∆ik (Xik − Xjl )Φ = 0, (9)
rikjl
i=1 k=1 j=1 l=1
2
where rikjl = (Xik −Xjl )> Σn (Xik −Xjl ). The consistency and asymptotic properties continue
to hold (Johnson and Strawderman 2009). The multivariate version of Equations 3 and 5 are
Ki
n X
" Pn PKj #
j=1 l=1 Xj Φ[κikjl (β)]
X
Ũn,ϕ (β) = ∆ik ϕik (b) Xik − Pn PKj = 0, (10)
i=1 k=1 j=1 l=1 Φ[κikjl (β)]
and
Ki X
n X Kj
n X
X ejl (β) ≥ eik (β)
Ũn,φ (β) = ∆ik φik (β)(Xik − Xjl )Φ = 0, (11)
rikjl
i=1 k=1 j=1 l=1
PKj
where κikjl (β) = [ejl (β) − eik (β)]/rikjl and φik (β) = ϕik (β)/ nj=1 l=1
P
I(ejl (β) ≥ eik (β)).
The same iterative procedure as for the univariate case can be used and the asymptotic
properties of the resulting estimator, β̃n,φ , continue to hold.
3.2. GEE approach

Depending on the set up of the design matrix Xi , the multivariate AFT model accommodates
margin-specific regression coefficients, identical regression coefficients across margins, and
their mix. For instance, a covariate could be appropriate for one margin but unsuitable
for the other; a covariate could have a different effect in the regression model of different
margins. The error term i ’s are assumed to be independent and identically distributed across
clusters, but within a cluster, the components of i1 , . . . , iKi do not need to follow a common
distribution and may be correlated. We generalize the GEE approach to multivariate AFT
modeling which accounts for multivariate dependence through working correlation structures
to improve efficiency (Chiou et al. 2014a).
Define Ωi−1 α(b) to be an Ki × Ki nonsingular working weight matrix that may involve

additional working parameters α that depends on an initial value b of β0 . For i = 1, . . . , n

and k = 1, . . . , Ki , let Ŷi (b), Yi , Ti , Ci and ∆i be Ki × 1 vectors formed by stacking Ŷik (b),
Yik , Cik and ∆ik , where
" R∞ #
eik (b) udF̂eik (β) >
Ŷik (b) = ∆ik Yik + (1 − ∆ik ) + Xik b .
1 − F̂eik (β) {eik (b)}
A generalization of Equation 6 can be expressed as

n
X
(Xi − X̄)> Ω−1

Un,GEE (β, b, α) = i α(b) (Ŷi (b) − Xi β) = 0, (12)
i=1
Pn
where X̄ = i=1 Xi /n. Given α and b, the solution to Equation 12 has a closed form
Ln (b, α) =
" n #−1 " n #
X X
(Xi − X̄)> Ω−1 (Xi − X̄)> Ω−1

i α(b) (Xi − X̄) i α(b) Ŷi (b) − Ȳ(b) ,
i=1 i=1
Pn
where Ȳ(b) = i Ŷi (b)/n.
The GEE estimator, denoted by β̂n,GEE , can be obtained from an iterative procedure:
(0)
1. Obtain an initial estimate β̂n,GEE = bn of β and initialize with m = 1.
(m−1) (m−1)
2. Obtain an estimate α̂n of α given β̂n,GEE , α̂n (β̂n,GEE ).
(m) (m−1) (m−1)

3. Update with β̂n,GEE = Ln β̂n,GEE , α̂n (β̂n,GEE ) .
4. Increase m by one and repeat 2 and 3 until convergence.

The iteration proceeds with the aid of function geese in package geepack (Højsgaard, Halekoh,
and Yan 2014; Halekoh, Højsgaard, and Yan 2006). The estimator reduces to the least squares
estimator of Jin et al. (2006a) when the working weight matrices Ωi ’s are the identity matrices.
We refer to Chiou et al. (2014a) for more details. The working parameter estimate α̂n does
not affect the consistency of the GEE estimator, but may affect its efficiency. Higher efficiency
can be achieved if Ωi is closer to the covariance matrix of Ŷi (b) and even an imperfect working
weight still improves the efficiency (Chiou et al. 2014a). The variance of the estimator can
again be estimated by resampling procedures.
4. Incorporating sampling weight

So far, we have assumed that we have full access to the whole data (Yik , ∆ik , Xik ), i = 1, . . . , n,
k = 1, . . . , Ki . In many cases, however, the full cohort data may not be available. For example,
a case-cohort design (Prentice 1978) is known to be cost-effective when the proportion of the
event of interest is rare or covariates are expensive to measure. Under this design, covariates
are measured for all the subjects who experienced the event of interest by the end of the
observation period but only for a subset of subjects who did not. This design is a special
case of the more general stratified case-cohort design where a subcohort is selected via a
stratified random sampling from S mutually exclusive strata in the original full cohort. For
both designs, covariates are measured only for those who were selected into the sample. Thus,
the observed data from each design are not complete and statistical methods which do not
account for this missingness in covariates could result in biased estimates. One typical method
employed to adjust for biases is to weight a complete observation by the inverse of the inclusion
probability.
Suppose a simple random sampling was used at the cluster level. Let ψis be the strata
indicator (ψis = 1 if the ith cluster is in the sth stratum and ψis = 0 otherwise) and ξis be
the sampling indicator (ξi = 1 if the ith PScluster is sampled and ξi = 0 otherwise). The case-
cohort weight, hi , for cluster i is hi = s=1 ξi ψis /pn,s , where pn,s is the inclusion probability
for the sth stratum for s = 1, . . . , S. The weight-adjusted version of Equations 10 and 11 are,
respectively,
n X Ki
" Pn PKj #
X j=1 l=1 h jl Xj Φ(κ ikjl (β))
Ũn,ϕ (β) = hik ∆ik ϕik (b) Xik − Pn PK = 0, (13)
j
i=1 k=1 j=1 l=1 hjl Φ(κikjl (β))
and
Ki X
n X Kj
n X
X ejl (β) ≥ eik (β)
Ũn,φ (β) = hi hj ∆i φi (β)(Xik − Xjl )Φ = 0. (14)
rikjl
i=1 k=1 j=1 l=1
Note that if we sample all subjects within each strata, then Equations 13 and 14 reduce to
Equations 10 and 11, respectively. The variance estimation can be obtained via resampling
procedures or fast sandwich variance estimators similar to the unweighted versions (Chiou
et al. 2014b,d).
5. Package implementation
The two major functions in package aftgee are aftsrr for the rank-based approach and aftgee
for the least squares or GEE approach. The synopsis of aftsrr is:
aftsrr(formula, data, subset, id = NULL, contrasts = NULL,

strata = NULL, weights = NULL, rankWeights = "gehan", method = "sm",
variance = "ISMB", B = 100, SigmaInit = NULL, control = aftgee.control())
The required arguments are formula and data. Argument formula specifies the model to be
fit with the variables coming with data. The formula is the same as the argument of function
survreg in package survival, with response created from Surv. The ‘Surv’ object consists
of two columns, where the first column is the survival time or censored time and the second
column is the censoring indicator, indicating right censored data. Since ranks are invariant to
location shift, the intercept cannot be estimated and the estimation will ignore the intercept
term whether it is specified or not. Clusters are defined by vector id. The weights argument
is a vector containing sampling weights (hi ) as described in Section 4. When data arise
from a stratified design, a vector of integers that specifies the stratification is indicated in
strata. The length of the arguments id, weights and strata needs to be the same as the
number of observations. The rank weight, controlled by argument rkWeight, includes the
aforementioned log-rank weight ("logrank"), Gehan’s weight ("gehan"), Prentice-Wilcoxon
weight ("PW") and general Gρ class weight ("GP"). Argument method determines the type of
weighted estimating equations to be used. When method = "nonsm", regression parameters
are estimated by directly solving the nonsmooth estimating Equations 1 or 8. When method =
"sm" and rkWeight = "gehan", the induced smoothing estimating Equations 2 or 9 are used.
For the non-Gehan’s weights, method = "sm" and method = "monosm" apply the iterative
procedure with the smooth estimating Equations 3 and 5, respectively. The initial values for
the variance estimator, or the Σn in the smoothing progress, are determined by sigmainit.
The identity matrix is used for sigmainit, if it is left unspecified.
Given a point estimate, variance estimates can be obtained from several approaches which
are specified by argument variance. A straightforward but computationally inefficient vari-
ance estimator is the multiplier bootstrap approach ("MB"). A more efficient method is to
consider sandwich variance estimators (Chiou et al. 2014b). Suppose the variance of the es-
timator has a sandwich form, Σ = A−1 V (A−1 )> where V is the asymptotic variance of the
estimating function and A is the slope matrix. Chiou et al. (2014b) proposed to estimate V
by either a closed-form formulation (CF) or through bootstrap the estimating equations (MB).
The bootstrapping estimate of V is much less demanding than the full multiplier bootstrap,
because it only involves evaluations of estimating equations instead of solving them. On the
other hand, to estimate the slope matrix A, Chiou et al. (2014b) proposed three methods
based on the induced smoothing approach (IS), smoothed Huang’s approach (sH) motivated
by Huang (2002) or Zeng and Lin’s approach (ZL) by Zeng and Lin (2008). Combinations
between estimating V and A yield six sandwich estimators, "ISCF", "ISCF", "ZLCF", "ZLMB",
"sHCF", "sHMB" for variance. When a bootstrap is needed, the bootstrap size is controlled
by B with default value 100.
The convergence for the procedure is controlled by relative tolerance. The iteration stops and
the output is given when the tolerance is met or iteration reaches the pre-specified maximum
iteration number. The default relative tolerance is set at 0.001 and the default maximum
number of iterations is set to 50.
aftgee.control(maxiter = 50, reltol = 0.001, trace = FALSE)
The maximum number of iterations is controlled by maxiter and relative convergence toler-
ance is controlled by reltol. A logical value, trace, is used to determine whether to print
the output for each iteration.
The least squares estimator can be obtained by calling aftgee with the following arguments
aftgee(formula, data, subset, id = NULL, contrasts = NULL,

weights = NULL, margin = NULL, corstr = "independence",
binit = "srrgehan", B = 100, control = aftgee.control())
Most of the arguments and the convergence criterion of aftsrr are shared by aftgee. With
aftgee, the intercept, if included, is estimated by the mean of the estimated cumulative
distribution function based on the censored residual computed from the slope estimator.
The margin argument is a vector with the same length as data. It is used to specify the
marginal distribution within clusters. Identical marginal distributions are assumed with un-
specified margin. A character string, corstr, is used to specify the working correlation
structure, as offered by package geepack. Four working correlation structures are indepen-
dence ("independence"), exchangeable ("exchangeable"), autoregressive model of order one
("ar1") and unstructured ("unstructured"). The default is "independence". The initial
value is specified by binit with default "srrgehan" giving the induced smooth rank-based
estimator with Gehan’s weight. Alternatively, although not recommended, the simple linear
regression with censored observations ignored ("lm"), can also be used for faster results.
In the uncensored case, aftgee with independent working correlation structure will return
an ordinary least squares estimate. In the multivariate case, efficiency can be improved in
aftgee when the working correlation structure is close to the true correlation even in the
absent of censoring. A more detailed multivariate illustration is presented in a kidney cather
data in Section 6.3.
6. Illustrations
6.1. Simulated data

Let T be the log-transformed failure time generated from the univariate AFT model
T = 2 + X1 + X2 + ,
where X1 is Bernoulli with rate 0.5 and X2 is a standard normal variable. The error term,
, follows an exponential distribution with mean 3. The censoring time was generated from
Uniform(0, τ ) with τ adjusted to yield approximately 50% censoring rate. A dataset with 500
subjects was generated with the following code:
R> datgen <- function(n = 500, tau = 327) {

+ x1 <- rbinom(n, 1, 0.5)
+ x2 <- rnorm(n)
+ e <- rweibull(n, 1, 3)
+ T <- exp(2 + x1 + x2 + e)
+ cstime <- runif(n, 0, tau)
+ delta <- (T < cstime) * 1
+ Y <- pmin(T, cstime)
+ out <- data.frame(T = T, Y = Y, delta = delta, x1 = x1, x2 = x2)
+ }
R> set.seed(1)
R> mydata <- datgen()
On a 3.3 GHz linux machine, we start with the comparison between two versions of Gehan’s
estimators: one is the nonsmooth version estimated from Equation 1 fitted with lss and the
other is the smooth version estimated from Equation 2 fitted with aftsrr. For lss, variances
are estimated with the multiplier bootstrap approach with bootstrap sample size 100. For
aftsrr, in addition to the fully bootstrapping variance estimator from Equation 2 ("MB"), the
sandwich variance estimator using induced smoothing approach ("ISMB") is also considered.
R> library("aftgee")
R> library("survival")
R> library("lss")
R> system.time(rk.lss <- lss(Surv(log(Y), delta) ~ x1 + x2, data = mydata,
+ gehanonly = TRUE, mcsize = 100))
user system elapsed

412.08 1.27 414.88
R> system.time(rk.srrMB <- aftsrr(Surv(Y, delta) ~ x1 + x2,

+ data = mydata, variance = "MB"))
user system elapsed

111.040 0.311 111.877
R> system.time(rk.srrISMB <- aftsrr(Surv(Y, delta) ~ x1 + x2,

+ data = mydata, variance = "ISMB"))
user system elapsed

3.276 0.015 3.319
R> rbind(rk.lss = c(rk.lss$betag), srrMB = coef(rk.srrMB),

+ srrISMB = coef(rk.srrISMB), deparse.level = 2)
x1 x2
rk.lss 0.9412 0.9496
srrMB 0.9399 0.9499
srrISMB 0.9399 0.9499
R> rbind(rk.lss = rk.lss$gehansd, srrMB = sqrt(diag(vcov(rk.srrMB)$MB)),

+ srrISMB = sqrt(diag(vcov(rk.srrISMB)$ISMB)), deparse.level = 2)
x1 x2
rk.lss 0.1552 0.07141
srrMB 0.1333 0.07133
srrISMB 0.1385 0.06940
The output indicates that aftsrr clearly outperforms lss in timing. The timing result also
suggests that the efficient sandwich estimator is substantially faster.
We next fit the simulated data with the least squares approach. For the parametric approach,
survreg is used with dist = "lognormal". For the semiparametric approach, the BJ esti-
mator (bj from package rms), the lss estimator and the aftgee estimator are considered.
R> library("rms")
R> system.time(ls.bj <- bj(Surv(Y, delta) ~ x1 + x2, data = mydata))
user system elapsed

0.089 0.000 0.090
R> system.time(ls.lss <- lss(Surv(log(Y), delta) ~ x1 + x2,

+ data = mydata, mcsize = 100))
Warning: Solution may be nonunique
Converged. Criteria Satisfied: 0.001

user system elapsed
401.506 1.065 403.746
R> system.time(ls.gee <- aftgee(Surv(Y, delta) ~ x1 + x2, data = mydata))
user system elapsed

17.428 0.042 17.504
R> system.time(ls.sur <- survreg(Surv(Y, delta) ~ x1 + x2, data = mydata,

+ dist = "lognormal"))
user system elapsed

0.011 0.000 0.011
R> rbind(bj = coef(ls.bj), lss = c(NA, ls.lss$lse), gee = coef(ls.gee),

+ sur = coef(ls.sur), deparse.level = 2)
Intercept x1 x2
bj 4.509 0.9833 0.9332
lss NA 0.9838 0.9338
gee 4.510 0.9838 0.9338
sur 4.313 0.9126 0.8652
bj lss aftgee survreg

β̂ β0 Bias MSE Bias MSE Bias MSE Bias MSE
β̂0 5.000 −0.480 0.261 −0.473 0.252 −0.701 0.506
β̂1 1.000 −0.005 0.035 −0.008 0.029 −0.005 0.033 −0.080 0.031
β̂2 1.000 −0.007 0.008 −0.001 0.008 −0.001 0.007 −0.082 0.014
Table 1: Comparison of the bj, lss, aftgee and survreg estimators. Bias and MSE represent
the bias and mean squared error of the estimator, respectively. The true regression coefficient
is β0 . Each cell is the average of 1000 replicates.
R> rbind(bj = sqrt(diag(vcov(ls.bj))), lss = c(NA, ls.lss$sd),

+ gee = sqrt(diag(vcov(ls.gee))), sur = sqrt(diag(vcov(ls.sur)))[1:3],
+ deparse.level = 2)
Intercept x1 x2
bj 0.07846 0.1261 0.06448
lss NA 0.1932 0.09152
gee 0.10588 0.1758 0.09061
sur 0.11078 0.1672 0.08321
Estimation for both the lss estimator and the aftgee estimator are based on a rank-based
initial value that is invariant to the intercept. Once the slope estimator is obtained, the lss
estimator left out the intercept whereas the bj and aftgee approach estimated the intercept
by the mean of the estimated cumulative distribution function based on the censored residual
computed from the slope estimator. The semiparametric methods from bj, lss and aftgee
provide fairly close point estimates. In terms of timing, the lss estimator took the longest
with more than six minutes. For further investigation, the estimation performance is assessed
via bias and mean squared error with a full scale simulation. Table 1 summarizes the results
for 1000 replicates.
The performance of bj, lss and aftgee are similar in terms of the biases and mean squared
errors. As expected, when the error distribution is misspecified in the parametric model, the
survreg approach produced a biased estimate.
6.2. National Wilms’ tumor study

We next illustrate the usage of the arguments weights and rkWeight in aftsrr with cohort
studies conducted by the national Wilms’ tumor study group (NWTSG; D’Angio et al. 1989;
Green et al. 1998). The dataset is available in the survival package as nwtco. The interest
of the study is to assess the relationship between the likability of central lab histology mea-
surement (histol) and days to tumor relapse (edrel). In addition to the likability of central
lab histology measurement (1 = unfavorable, 0 = favorable), we also include patient’s age
(age) in years as covariate. There are two study groups (study), NWTSG-3 and NWTSG-4,
denoting the third and the fourth Wilms’ tumor studies. Patients are further categorized into
four stages (stage) with stage 4 being the latest and most severest. The dataset consists
of 4028 patients, among which, 571 patients experienced tumor relapse (rel = 1) and 3457
patients did not (rel = 0).
To take advantage of the full cohort data, we fit the full-cohort data with aftsrr using two
types of sandwich variance estimators ("ISMB" and "ISCF").
R> data("nwtco", package = "survival")

R> nwtco$age <- nwtco$age/12
R> head(nwtco, 5)
seqno instit histol stage study rel edrel age in.subcohort

1 1 2 2 1 3 0 6075 2.083 FALSE
2 2 1 1 2 3 0 4121 4.167 FALSE
3 3 2 2 1 3 0 6069 0.750 FALSE
4 4 2 1 4 3 0 6200 2.333 TRUE
5 5 2 2 2 3 0 1244 4.583 FALSE
R> set.seed(1)
R> system.time(fit.IS <- aftsrr(Surv(edrel, rel) ~ histol + age,
+ data = nwtco, variance = c("ISCF", "ISMB")))
user system elapsed

75.628 1.209 76.917
The summary gives the following information:
R> summary(fit.IS)
Call:
aftsrr(formula = Surv(edrel, rel) ~ histol + age, data = nwtco,
variance = c("ISCF", "ISMB"))
Variance Estimator: ISCF

Estimate StdErr z.value p.value
histol -3.221 0.144 -22.40 <2e-16 ***
age -0.231 0.026 -9.03 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Variance Estimator: ISMB

histol -3.221 0.152 -21.13 <2e-16 ***
age -0.231 0.024 -9.49 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
All point estimators and variance estimators are close to each other. The coefficients of
central histological lab diagnosis is found to be significantly different from zero. In addition,
the coefficient for the central lab histological diagnosis is negative. This suggests patients who
do not favor the central lab histological diagnosis tend to have shorter time to tumor relapse.
With the same dataset, we next demonstrate incorporating weights via a case-cohort design.
Define cases and controls as those who experience the event of interest by the end of the
study period and who do not, respectively. In nwtco, there are 571 cases who experienced
the relapse of tumor and 3457 controls who did not experience the relapse of tumor. The
case-cohort sample is the union of all the cases and the sub-cohort sample selected via a
simple random sampling. The case-cohort sample of the data had 1154 subjects, including all
571 cases and 583 controls. This gave sampling weights 1 and 5.93 for the cases and controls,
respectively. The following codes give a summary of the case-cohort weight, hi .
R> table(nwtco$in.subcohort, nwtco$rel)
0 1
FALSE 2874 486
TRUE 583 85
R> nwtco$in.casecohort <- (nwtco$in.subcohort | nwtco$rel == 1)

R> nwtco$hi <- 0
R> nwtco$hi <- ifelse(nwtco$in.casecohort & nwtco$rel == 1, 1, nwtco$hi)
R> nwtco$hi <- ifelse(nwtco$in.casecohort & nwtco$rel == 0, 5.93, nwtco$hi)
R> table(nwtco$hi)
0 1 5.93
2874 571 583
For the case-cohort design, we also demonstrate the usage of different rank weights in a rank-
based approach; we considered the Gehan’s, log-rank and PW weights. For the log-rank and
PW weights, the monotone function approach was used. Jin and Huang (2007)’s lss was
not considered in this analysis because it does not have the capability of handling general
rank weights and sampling weights. Standard errors are estimated with the efficient sandwich
variance estimator, ZLMB. Commands for these estimators are presented below followed by a
summary.
R> system.time(fit.gh <- aftsrr(Surv(edrel, rel) ~ histol + age,

+ weights = hi, data = nwtco, variance = "ZLMB", subset = in.casecohort))
user system elapsed

29.311 0.083 29.542
R> system.time(fit.lk <- aftsrr(Surv(edrel, rel) ~ histol + age,

+ weights = hi, data = nwtco, variance = "ZLMB", rankWeights = "logrank",
+ subset = in.casecohort))
user system elapsed

35.490 0.097 35.697
R> system.time(fit.pw <- aftsrr(Surv(edrel, rel) ~ histol + age,

+ weights = hi, data = nwtco, variance = "ZLMB", rankWeights = "PW",
+ method = "monosm", subset = in.casecohort))
user system elapsed

84.716 0.299 85.453
R> summary(fit.gh)
Call:
subset = in.casecohort, weights = hi, variance = "ZLMB")
Variance Estimator: ZLMB

histol -3.133 0.164 -19.06 <2e-16 ***
age -0.204 0.038 -5.44 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R> summary(fit.lk)
Call:
subset = in.casecohort, weights = hi, rankWeights = "logrank",
variance = "ZLMB")

histol -3.891 0.191 -20.33 <2e-16 ***
age -0.208 0.057 -3.62 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R> summary(fit.pw)
Call:
subset = in.casecohort, weights = hi, rankWeights = "PW",
method = "monosm", variance = "ZLMB")

histol -3.793 0.105 -36.10 <2e-16 ***
age -0.209 0.026 -8.14 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Although the differences in standard errors among the three weights are noticeable, they all
lead to the same conclusion. The p values suggest that the coefficient of central histological
diagnosis is significantly different from zero and had a significant effect on the time to relapse.
Compared to the full cohort analysis, all point estimates are reasonably close. This result is
also found in the full cohort analysis.
6.3. Kidney catheter data

A bivariate failure time example from a kidney catheter study is used to illustrate the least
squares approach feature of aftgee. The kidney catheter data are available in the survival
package as kidney (McGilchrist and Aisbett 1991). The interest of the study is to examine
the time to infection from the point of catheter insertion for patients using portable dialysis
equipment. The catheter is removed when the infection occurs and reinserted after some
pre-determined time. If catheters are removed for reasons other than infection, then the time
to infection is treated as censored. The data contain 38 patients, each having exactly two
insertions, where two observations on time to infection was recorded. The two covariates
considered are patient’s age (age in years) and gender (sex = 0 if male, sex = 1 if female).
We first fit bivariate AFT models with identical error margins and identical regression coeffi-
cients for the two margins. Since it is reasonable to expect some correlation between the two
recurrence times for a given patient, we can model this in the least squares approach with
some dependent working covariance structure. We will first fit the least squares approach with
a working independent covariance structure and then with an exchangeable working structure.
For both least squares approaches, we use the induced smoothing rank-based estimator with
Gehan’s weight as the initial estimator. The standard errors are estimated by the multiplier
resampling method with bootstrap size 100.
R> data("kidney", package = "survival")

R> set.seed(123)
R> fit.ind <- aftgee(Surv(time, status) ~ age + sex, id = id, data = kidney)
R> fit.ex <- aftgee(Surv(time, status) ~ age + sex, id = id, data = kidney,
+ corstr = "ex")
R> summary(fit.ind)
Call:
aftgee(formula = Surv(time, status) ~ age + sex, data = kidney,
id = id)
AFTGEE Estimator
(Intercept) 2.071 0.609 3.40 0.001 ***
age -0.005 0.008 -0.64 0.523
sex 1.374 0.346 3.96 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R> summary(fit.ex)
Call:
aftgee(formula = Surv(time, status) ~ age + sex, data = kidney,
id = id, corstr = "ex")
AFTGEE Estimator
(Intercept) 2.070 0.688 3.01 0.003 **
age -0.005 0.010 -0.54 0.589
sex 1.374 0.369 3.72 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The coefficient of sex is found to be significantly different from zero for both models. This
suggests that female patients tend to have longer recurrence times to infection. The efficiency
gain is expected on average but, unfortunately, this dataset does not show an efficiency gain.
In addition to the common marginal error distribution and common coefficient assumption,
we also consider the case where the marginal error distributions and regression coefficients
are different. In this case, we need to specify margin and construct the corresponding block
diagonal design matrix. After the block diagonal design matrix is constructed, least squares
estimators with both independent covariance working structure and exchangeable working
structure are fitted. For each model, we use the smooth Gehan estimator as the initial value.
R> kidney$margin <- as.factor(rep(1:2, 38))

R> fit2.ind <- aftgee(Surv(time, status) ~ age:margin + sex:margin +
+ margin - 1, id = id, margin = margin, data = kidney)
R> fit2.ex <- aftgee(Surv(time, status) ~ age:margin + sex:margin +
+ margin - 1, id = id, margin = margin, data = kidney, corstr = "ex")
R> summary(fit2.ind)
Call:
aftgee(formula = Surv(time, status) ~ age:margin + sex:margin +
margin - 1, data = kidney, id = id, margin = margin)
AFTGEE Estimator
margin1 1.676 0.804 2.08 0.037 *
margin2 2.542 0.881 2.89 0.004 **
age:margin1 -0.013 0.011 -1.18 0.238
age:margin2 0.005 0.013 0.41 0.679
margin1:sex 1.744 0.439 3.97 <2e-16 ***
margin2:sex 0.895 0.451 1.99 0.047 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
R> summary(fit2.ex)
Call:
aftgee(formula = Surv(time, status) ~ age:margin + sex:margin +
margin - 1, data = kidney, id = id, margin = margin, corstr = "ex")
AFTGEE Estimator
margin1 1.672 0.897 1.86 0.062 .
margin2 2.544 0.901 2.82 0.005 **
age:margin1 -0.013 0.012 -1.14 0.253
age:margin2 0.005 0.012 0.47 0.638
margin1:sex 1.744 0.467 3.73 <2e-16 ***
margin2:sex 0.887 0.473 1.88 0.060 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This model allows hypothesis testing of equal coefficiencts for each covariate across the two
margins from Wald-type tests with covariance matrix estimates. However, the covariates of
age and sex are found to be not significantly different across the two margins, with p values
of 0.87 and 0.06, respectively, under the exchangeable structure. The coefficients of sex for
both margins and the two working structures are found to be significantly different from zero.
These results coincide with those from the common margin model. The efficiency gain from
the exchangeable structure in the margin-specific case is also absent. This is probably because
there is not much strength to borrow with distinctive marginal fits.
7. Conclusion
Package aftgee provides an easy access to fitting semiparametric AFT models for possibly
clustered failure times with both rank-based approaches and the least squares approach.
For rank-based approaches, we implemented the induced smoothing method with Gehan’s
weight and extended it to allow arbitrary rank weight. The method is much faster than
those based on linear programming. Computationally efficient sandwich variance estimators
are provided for all the estimators, and additional sampling weight can be incorporated for
various sampling schemes. Our least squares approach uses rank-based estimators as initial
estimators in an iterative estimation procedure. For clustered data, we exploited within-
cluster dependence through working correlation structure in a GEE framework which enhances
efficiency when within-cluster dependence is strong. The implementation is fast and reliable,
making it possible for AFT models to be much more widely applied in routine survival analysis.
Our package can be expanded in several directions. The current version allows weights for
handling missing data in the rank-based approach, similar weights can also be made available
to our GEE approach. For the rank-based approach with clustered data, Wang and Fu
(2011) considered estimating equations that can be decomposed into between- and within-
cluster estimating equations for better efficiency. An implementation of this method would
be desirable. To account for measurement errors in covariates, package simexaft (He, Xiong,
and Yi 2012) implemented a simulation-extrapolation approach for AFT models. Such an
approach can be extended to the semiparametric AFT model. Furthermore, our methods can
also be extended to accommodate survival data other than with right censoring.
Acknowledgments
This research was partially supported by NSF DMS grant 120922.
References
Brostöm G (2014). eha: Event History Analysis. R package version 2.4-1, URL http:
//CRAN.R-project.org/package=eha.
Brown BM, Wang YG (2005). “Standard Errors and Covariance Matrices for Smoothed Rank
Estimators.” Biometrika, 92(1), 149–158.
Brown BM, Wang YG (2007). “Induced Smoothing for Rank Regression with Censored Sur-
vival Times.” Statistics in Medicine, 26(4), 828–836.
Buckley J, James I (1979). “Linear Regression with Censored Data.” Biometrika, 66(3),
429–436.
Chiou SH, Kang S, Kim J, Yan J (2014a). “Marginal Semiparametric Multivariate Accelerated
Failure Time Model with Generalized Estimating Equations.” Lifetime Data Analysis,
20(4), 599–618.
Chiou SH, Kang S, Yan J (2013). “Rank-Based Estimating Equations with General Weight for
Accelerated Failure Time Models: An Induced Smoothing Approach.” Technical Report 39,
Department of Statistics, University of Connecticut.
Chiou SH, Kang S, Yan J (2014b). “Fast Accelerated Failure Time Modeling for Case-Cohort
Data.” Statistics and Computing, 24(4), 559–568.
Chiou SH, Kang S, Yan J (2014c). aftgee: Accelerated Failure Time Model with General-
ized Estimating Equations. R package version 1.0-0, URL http://CRAN.R-project.org/
package=aftgee.
Chiou SH, Kang S, Yan J (2014d). “Semiparametric Accelerated Failure Time Modeling
for Clustered Failure Times from Stratified Sampling.” Technical report. doi:10.1080/
01621459.2014.917978. Forthcoming.
Cox DR (1972). “Regression Models and Life-Tables.” Journal of the Royal Statistical Society
B, 34(2), 187–220.
D’Angio GJ, Breslow N, Beckwith JB, Evans A, Baum E, Delorimier A, Fernbach D,

Hrabovsky E, Jones B, Kelalis P, Othersen HB, Tefft M, Thomas PRM (1989). “Treat-
ment of Wilms’ Tumor. Results of the Third National Wilms’ Tumor Study.” Cancer,
64(2), 349–360.
Gehan EA (1965). “A Generalized Wilcoxon Test for Comparing Arbitrarily Singly-Censored

Samples.” Biometrika, 52(1/2), 203–223.
Green DM, Breslow NE, Beckwith JB, Finklestein JZ, Grundy PE, Thomas PR, Kim T,
Shochat SJ, Haase GM, Ritchey ML, Kelalis PP, D’Angio GJ (1998). “Comparison be-
tween Single-Dose and Divided-Dose Administration of Dactinomycin and Doxorubicin for
Patients with Wilms’ Tumor: A Report from the National Wilms’ Tumor Study Group.”
Journal of Clinical Oncology, 16(1), 237–245.
Halekoh U, Højsgaard S, Yan J (2006). “The R Package geepack for Generalized Estimating
Equations.” Journal of Statistical Software, 15(2), 1–11. URL http://www.jstatsoft.
org/v15/i02/.
Harrel Jr FE (2014). rms: Regression Modeling Strategies. R package version 4.2-1, URL
http://CRAN.R-project.org/package=rms.
Harrington DP, Fleming TR (1982). “A Class of Rank Test Procedures for Censored Survival
Data.” Biometrika, 69(3), 133–143.
He W, Xiong J, Yi GY (2012). “SIMEX R Package for Accelerated Failure Time Models with
Covariate Measurement Error.” Journal of Statistical Software, Code Snippets, 46(1), 1–14.
URL http://www.jstatsoft.org/v46/c01/.
Højsgaard S, Halekoh U, Yan J (2014). geepack: Generalized Estimating Equation Package.

R package version 1.2-0, URL http://CRAN.R-project.org/package=geepack.
Huang Y (2002). “Calibration Regression of Censored Lifetime Medical Cost.” Journal of the
American Statistical Association, 97(457), 318–327.
Jin Z, Huang L (2007). “lss: An S-PLUS/R Program for the Accelerated Failure Time Model to
Right Censored Data Based on Least-Squares Principle.” Computer Methods and Programs
in Biomedicine, 86(1), 45–50.
Jin Z, Lin DY, Wei LJ, Ying Z (2003). “Rank-Based Inference for the Accelerated Failure
Time Model.” Biometrika, 90(2), 341–353.
Jin Z, Lin DY, Wei LJ, Ying Z (2006a). “Rank Regression Analysis of Multivariate Falure
Time Data Based on Marginal Linear Models.” Scandinavian Journal of Statistics, 33(1),
1–23.
Jin Z, Lin DY, Ying Z (2006b). “On Least-Squares Regression with Censored Data.”
Biometrika, 93(1), 147–161.
Jin Z, Lin DY, Ying Z (2006c). “Rank Regression Analysis of Multivariate Failure Time Data
Based on Marginal Linear Models.” Scandinavian Journal of Statistics, 33(1), 1–23.
Johnson LM, Strawderman RL (2009). “Induced Smoothing for the Semiparametric Accel-
erated Failure Time Model: Asymptotics and Extensions to Clustered Data.” Biometrika,
96(3), 577–590.
Kalbfleisch JD, Prentice RL (2002). The Statistical Analysis of Failure Time Data. John
Wiley & Sons.
Lai TL, Ying Z (1991). “Large Sample Theory of a Modified Buckley-James Estimator for
Regression Analysis with Censored Data.” The Annals of Statistics, 19(3), 1370–1402.
McGilchrist CA, Aisbett CW (1991). “Regression with Frailty in Survival Analysis.” Biomet-
rics, 47(2), 461–466.
Prentice RL (1978). “Linear Rank Tests with Right Censored Data.” Biometrika, 65(1),
167–180.
R Core Team (2014). R: A Language and Environment for Statistical Computing. R Founda-
tion for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
Ritov Y (1990). “Estimation in a Linear Regression Model with Censored Data.” The Annals
of Statistics, 18(1), 303–328.
Therneau T (2014). survival: A Package for Survival Analysis in S. R package version

2.37-7, URL http://CRAN.R-project.org/package=survival.
Tsiatis AA (1990). “Estimating Regression Parameters Using Linear Rank Tests for Censored
Data.” The Annals of Statistics, 18(1), 354–372.
Varadhan R, Gilbert P (2009). “BB: An R Package for Solving a Large System of Nonlinear
Equations and for Optimizing a High-Dimensional Nonlinear Objective Function.” Journal
of Statistical Software, 32(4), 1–26. URL http://www.jstatsoft.org/v32/i04/.
Wang YG, Fu L (2011). “Rank Regression for Accelerated Failure Time Model with Clustered
and Censored Data.” Computational Statistics & Data Analysis, 55(7), 2334–2343.
Wei LJ (1992). “The Accelerated Failure Time Model: A Useful Alternative to the Cox
Regression Model in Survival Analysis.” Statistics in Medicine, 11(14–15), 1871–1879.
Ying Z (1993). “A Large Sample Study of Rank Estimation for Censored Regression Data.”
The Annals of Statistics, 21(1), 76–99.
Zeng D, Lin DY (2008). “Efficient Resampling Methods for Nonsmooth Estimating Functions.”
Biostatistics, 9(2), 355–363.
Affiliation:
Sy Han Chiou
Department of Mathematics and Statistics
University of Minnesota, Duluth
1117 University Drive,
Duluth, MN 55812-3000, United States of America
Telephone: 218/726-7032
Fax: 218/726-8399
E-mail: schiou@d.umn.edu
Sangwook Kang
Department of Applied Statistics
Yonsei University
50 Yonsei Road
Seodaemun-Gu, Seoul 120-749, Korea
E-mail: kanggi1@yonsei.ac.kr
Jun Yan
Department of Statistics
University of Connecticut
215 Glenbrook Road U-4120
Storrs, CT 06279-4120, United States of America
Telephone: 860/486-3414
Fax: 860/486-4113
E-mail: jun.yan@uconn.edu
Journal of Statistical Software http://www.jstatsoft.org/

published by the American Statistical Association http://www.amstat.org/
Volume 61, Issue 11 Submitted: 2012-11-15
November 2014 Accepted: 2014-06-13

Journal of Statistical Software

Uploaded by

Copyright:

Available Formats

Journal of Statistical Software

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Journal of Statistical Software

Uploaded by

Copyright:

Available Formats

JSS Journal of Statistical Software

November 2014, Volume 61, Issue 11. http://www.jstatsoft.org/

Fitting Accelerated Failure Time Models in Routine

Sy Han Chiou Sangwook Kang Jun Yan

Keywords: case-cohort, efficiency, Gehan weight, generalized estimating equation, Gρ class,

2. Univariate AFT model

where β is an unknown p × 1 vector of regression parameters, i ’s are independent and iden-

2.1. Rank-based estimator

A computationally more efficient approach is the induced smoothing procedure of Brown

2 = (X − X )> Σ (X − X ) and Φ(·) is the standard normal cumulative distribution

reweighted induced smoothing strategy based on monotone estimating functions is considered

2.2. Least squares approach

3. Multivariate AFT model

where β is an unknown p × 1 vector of regression parameters and the error terms, i =

3.1. Rank-based estimator

3.2. GEE approach

additional working parameters α that depends on an initial value b of β0 . For i = 1, . . . , n

A generalization of Equation 6 can be expressed as

(m) (m−1) (m−1)

4. Increase m by one and repeat 2 and 3 until convergence.

4. Incorporating sampling weight

aftsrr(formula, data, subset, id = NULL, contrasts = NULL,

number of iterations is set to 50.

aftgee.control(maxiter = 50, reltol = 0.001, trace = FALSE)

aftgee(formula, data, subset, id = NULL, contrasts = NULL,

6.1. Simulated data

R> datgen <- function(n = 500, tau = 327) {

user system elapsed

R> system.time(rk.srrMB <- aftsrr(Surv(Y, delta) ~ x1 + x2,

user system elapsed

R> system.time(rk.srrISMB <- aftsrr(Surv(Y, delta) ~ x1 + x2,

user system elapsed

R> rbind(rk.lss = c(rk.lss$betag), srrMB = coef(rk.srrMB),

R> rbind(rk.lss = rk.lss$gehansd, srrMB = sqrt(diag(vcov(rk.srrMB)$MB)),

user system elapsed

R> system.time(ls.lss <- lss(Surv(log(Y), delta) ~ x1 + x2,

Warning: Solution may be nonunique

Converged. Criteria Satisfied: 0.001

R> system.time(ls.gee <- aftgee(Surv(Y, delta) ~ x1 + x2, data = mydata))

user system elapsed

R> system.time(ls.sur <- survreg(Surv(Y, delta) ~ x1 + x2, data = mydata,

user system elapsed

R> rbind(bj = coef(ls.bj), lss = c(NA, ls.lss$lse), gee = coef(ls.gee),

bj lss aftgee survreg

R> rbind(bj = sqrt(diag(vcov(ls.bj))), lss = c(NA, ls.lss$sd),

6.2. National Wilms’ tumor study

R> data("nwtco", package = "survival")

seqno instit histol stage study rel edrel age in.subcohort

user system elapsed

The summary gives the following information:

Variance Estimator: ISCF

Variance Estimator: ISMB

R> table(nwtco$in.subcohort, nwtco$rel)

R> nwtco$in.casecohort <- (nwtco$in.subcohort | nwtco$rel == 1)

R> system.time(fit.gh <- aftsrr(Surv(edrel, rel) ~ histol + age,

user system elapsed

R> system.time(fit.lk <- aftsrr(Surv(edrel, rel) ~ histol + age,

user system elapsed

R> system.time(fit.pw <- aftsrr(Surv(edrel, rel) ~ histol + age,

where β is an unknown p × 1 vector of regression parameters, i ’s are independent and iden-

where β is an unknown p × 1 vector of regression parameters and the error terms, i =