0% found this document useful (0 votes)

2 views

Assignment

Uploaded by

rhrakib216

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

2 views

Assignment

Uploaded by

rhrakib216

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 20

ASSIGNMENT

HUMAUN FARID SOHAG

2025-02-17

#Problem-3.1
The data below are remission times, in weeks, for a group of 30 patients with leukemia who received similar
treatment. Asterisks denote censoring times.
1, 1, 2, 4, 4, 6, 6, 6, 7, 8, 9, 9, 10, 12, 13, 14, 18, 19, 24, 26, 29, 31*, 42, 45*, 50*, 57, 60, 71*, 85*, 91.

1. Obtain and plot the Kaplan-Meier estimate Ŝ(t) of the survivor function for remission time.
2. Obtain approximate .95 confidence intervals for the median remission time and for the probability that
remission lasts over 26 weeks.

3. Plot log(log Ŝ(t)) and log Â (t) on the same graph, where Ĥ(t) is the Nelson-Aalen estimate. Is there
much difference?

# Load necessary libraries

#install.packages("survival")
#install.packages("survminer")
library(survival)
library(survminer)
library(ggplot2)
library(MASS)

(1) Obtain and plot the Kaplan-Meier estimate Ŝ(t) of the survivor function for
remission time.

• Define remission times

remission_times <- c(1, 1, 2, 4, 4, 6, 6, 6, 7, 8, 9, 9, 10, 12, 13, 14,

18, 19, 24, 26, 29, 31, 42, 45, 50, 57, 60, 71, 85, 91)
remission_times

## [1] 1 1 2 4 4 6 6 6 7 8 9 9 10 12 13 14 18 19 24 26 29 31 42 45 50
## [26] 57 60 71 85 91

• Define censoring status (1 = event, 0 = censored)

1
censoring <- rep(1, length(remission_times))
censoring

## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

censoring[c(22, 24, 25, 28, 29)] <- 0 # Censored at 31, 45, 50, 71, and 85 weeks
censoring

## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 0 1 1 0 0 1

• Create a survival object

surv_obj <- Surv(time = remission_times, event = censoring)

surv_obj

## [1] 1 1 2 4 4 6 6 6 7 8 9 9 10 12 13 14 18 19 24
## [20] 26 29 31+ 42 45+ 50+ 57 60 71+ 85+ 91

• Kaplan-Meier estimate

km_fit <- survfit(surv_obj ~ 1)

km_fit

## Call: survfit(formula = surv_obj ~ 1)

##
## n events median 0.95LCL 0.95UCL
## [1,] 30 25 13.5 9 42

summary(km_fit)

## Call: survfit(formula = surv_obj ~ 1)

##
## time n.risk n.event survival std.err lower 95% CI upper 95% CI
## 1 30 2 0.933 0.0455 0.8482 1.000
## 2 28 1 0.900 0.0548 0.7988 1.000
## 4 27 2 0.833 0.0680 0.7101 0.978
## 6 25 3 0.733 0.0807 0.5910 0.910
## 7 22 1 0.700 0.0837 0.5538 0.885
## 8 21 1 0.667 0.0861 0.5176 0.859
## 9 20 2 0.600 0.0894 0.4480 0.804
## 10 18 1 0.567 0.0905 0.4144 0.775
## 12 17 1 0.533 0.0911 0.3816 0.745
## 13 16 1 0.500 0.0913 0.3496 0.715
## 14 15 1 0.467 0.0911 0.3183 0.684
## 18 14 1 0.433 0.0905 0.2878 0.652
## 19 13 1 0.400 0.0894 0.2581 0.620
## 24 12 1 0.367 0.0880 0.2291 0.587
## 26 11 1 0.333 0.0861 0.2010 0.553
## 29 10 1 0.300 0.0837 0.1737 0.518
## 42 8 1 0.263 0.0812 0.1432 0.481
## 57 5 1 0.210 0.0801 0.0994 0.444
## 60 4 1 0.158 0.0754 0.0617 0.402
## 91 1 1 0.000 NaN NA NA

2
• Plot the Kaplan-Meier estimate

ggsurvplot(km_fit, data = data.frame(remission_times, censoring),

conf.int = TRUE,
title = "Kaplan-Meier Estimate of Survivor Function",
xlab = "Time (weeks)",
ylab = "Survival Probability",
risk.table = TRUE)

Kaplan−Meier Estimate of Survivor Function

Strata + All

1.00
Survival Probability

0.75

0.50

0.25
+ + +
+ +
0.00
0 25 50 75 100
Time (weeks)
Number at risk
Strata

All 30 11 6 2 0
0 25 50 75 100
Time (weeks)

(2) Obtain approximate .95 confidence intervals for the median remission time
and # for the probability that remission lasts over 26 weeks.

• Confidence interval for median remission time

ci_median <- summary(km_fit)$table[c("median", "0.95LCL", "0.95UCL")]

ci_median

## median 0.95LCL 0.95UCL

## 13.5 9.0 42.0

• Probability that remission lasts over 26 weeks

3
prob_26 <- summary(km_fit, times = 26)$surv
prob_26

## [1] 0.3333333

ci_26 <- summary(km_fit, times = 26)$lower

ci_26

## [1] 0.2009553

• Print results

cat("Approximate 95% CI for median remission time:\n")

## Approximate 95% CI for median remission time:

print(ci_median)

## median 0.95LCL 0.95UCL

## 13.5 9.0 42.0

cat("\nEstimated probability of remission lasting over 26 weeks: ", prob_26, "\n")

##
## Estimated probability of remission lasting over 26 weeks: 0.3333333

cat("95% CI for P(T > 26 weeks): ", ci_26, "\n")

## 95% CI for P(T > 26 weeks): 0.2009553

(c) Plot log(log Ŝ(t)) and log Â (t) on the same graph, where Ĥ(t) is the Nelson-
Aalen estimate. Is there much difference?

• Nelson-Aalen estimate

na_fit <- survfit(coxph(surv_obj ~ 1), type = "aalen")

na_fit

## Call: survfit(formula = coxph(surv_obj ~ 1), type = "aalen")

##
## n events median 0.95LCL 0.95UCL
## [1,] 30 25 14 9 42

• Extract cumulative hazard estimates

4
na_df <- data.frame(time = na_fit$time, H_t = cumsum(na_fit$n.event / na_fit$n.risk))
na_df

## time H_t
## 1 1 0.06666667
## 2 2 0.10238095
## 3 4 0.17645503
## 4 6 0.29645503
## 5 7 0.34190957
## 6 8 0.38952862
## 7 9 0.48952862
## 8 10 0.54508418
## 9 12 0.60390770
## 10 13 0.66640770
## 11 14 0.73307437
## 12 18 0.80450294
## 13 19 0.88142602
## 14 24 0.96475935
## 15 26 1.05566844
## 16 29 1.15566844
## 17 31 1.15566844
## 18 42 1.28066844
## 19 45 1.28066844
## 20 50 1.28066844
## 21 57 1.48066844
## 22 60 1.73066844
## 23 71 1.73066844
## 24 85 1.73066844
## 25 91 2.73066844

• Plot log(-log(S(t))) and log(H(t))

km_df <- data.frame(time = km_fit$time, S_t = km_fit$surv)

km_df

5
## 17 31 0.3000000
## 18 42 0.2625000
## 19 45 0.2625000
## 20 50 0.2625000
## 21 57 0.2100000
## 22 60 0.1575000
## 23 71 0.1575000
## 24 85 0.1575000
## 25 91 0.0000000

km_df <- km_df[km_df$S_t > 0, ] # Remove zero survival values to avoid log(0)
km_df

## time S_t
## 1 1 0.9333333
## 2 2 0.9000000
## 3 4 0.8333333
## 4 6 0.7333333
## 5 7 0.7000000
## 6 8 0.6666667
## 7 9 0.6000000
## 8 10 0.5666667
## 9 12 0.5333333
## 10 13 0.5000000
## 11 14 0.4666667
## 12 18 0.4333333
## 13 19 0.4000000
## 14 24 0.3666667
## 15 26 0.3333333
## 16 29 0.3000000
## 17 31 0.3000000
## 18 42 0.2625000
## 19 45 0.2625000
## 20 50 0.2625000
## 21 57 0.2100000
## 22 60 0.1575000
## 23 71 0.1575000
## 24 85 0.1575000

ggplot() +
geom_line(data = km_df, aes(x = log(time), y = log(-log(S_t))), color = "blue", size = 1) +
geom_line(data = na_df, aes(x = log(time), y = log(H_t)), color = "red", size = 1) +
labs(title = "Comparison of Log-Transformed Survival and Hazard Functions",
x = "log(Time)", y = "log(-log(S(t))) / log(H(t))") +
theme_minimal()

## Warning: Using ‘size‘ aesthetic for lines was deprecated in ggplot2 3.4.0.
## i Please use ‘linewidth‘ instead.
## This warning is displayed once every 8 hours.
## Call ‘lifecycle::last_lifecycle_warnings()‘ to see where this warning was
## generated.

6
Comparison of Log−Transformed Survival and Hazard Functions
1

0
log(−log(S(t))) / log(H(t))

−1

−2

0 1 2 3 4
log(Time)

--------------------------------------------------------------------------------------------------------------------

#Problem-3.2
The data below show survival times (in months) of patients with Hodgkin’s dis- ease who were treated
with nitrogen mustards (Bartolucci and Dickey, 1977). Group A patients received little or no prior therapy,
whereas Group B patients received heavy prior therapy. Starred observations are censoring times.
Group A 1.25, 1.41, 4.98, 5.25, 5.38, 6.92, 8.89, 10.98, 11.18, 13.11, 13.21, 16.33, 19.77, 21.08, 21.84*, 22.07,
31.38*, 32,62*, 37.18*, 42.92. Group B 1,05, 2.92, 3.61, 4,20, 4.49, 6.72, 7.31, 9.08, 9.11, 14.49*, 16.85,
18.82*, 26.59*, 30.26*, 41.34*.

1. Obtain and compare Kaplan-Meier estimates for the two groups. Does there appear to be a difference
in the 1-year survival probability for the two types of patients? Give confidence limits for S(1) and for
the median survival time 1.50 for each group.
2. Use plots of the Nelson-Aalen estimate (t) to examine and compare the two life distributions.
3. Do any parametric models whereby one tions suggest themselves? might compare the two distribu-

7
(1) Obtain and compare Kaplan-Meier estimates for the two groups. Does there
appear to be a difference in the 1-year survival probability for the two types of
patients? Give confidence limits for S(1) and for the median survival time 1.50
for each group.

• Survival times for Group A and B

group_A <- c(1.25, 1.41, 4.98, 5.25, 5.38, 6.92, 8.89, 10.98, 11.18, 13.11,
13.21, 16.33, 19.77, 21.08, 21.84, 22.07, 31.38, 32.62, 37.18, 42.92)
censor_A <- rep(1, length(group_A))
censor_A

## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

length(censor_A)

## [1] 20

censor_A[c(15,17,18,19)]<-0 # 1 = event, 0 = censored

censor_A

## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 0 0 1

group_B <- c(1.05, 2.92, 3.61, 4.20, 4.49, 6.72, 7.31, 9.08, 9.11, 14.49,16.85,18.82,26.59,30.26,41.34)
length(group_B)

## [1] 15

censor_B <- rep(1, length(group_B)) # Last value is censored

censor_B[c(10,12,13,14,15)]<-0
censor_B

## [1] 1 1 1 1 1 1 1 1 1 0 1 0 0 0 0

• Create survival objects

surv_A <- Surv(group_A, censor_A);surv_A

## [1] 1.25 1.41 4.98 5.25 5.38 6.92 8.89 10.98 11.18 13.11
## [11] 13.21 16.33 19.77 21.08 21.84+ 22.07 31.38+ 32.62+ 37.18+ 42.92

surv_B <- Surv(group_B, censor_B);surv_B

## [1] 1.05 2.92 3.61 4.20 4.49 6.72 7.31 9.08 9.11 14.49+
## [11] 16.85 18.82+ 26.59+ 30.26+ 41.34+

• Fit Kaplan-Meier estimates

8
km_A <- survfit(surv_A ~ 1);km_A

## Call: survfit(formula = surv_A ~ 1)

##
## n events median 0.95LCL 0.95UCL
## [1,] 20 16 13.2 8.89 NA

summary(km_A)

## Call: survfit(formula = surv_A ~ 1)

##
## time n.risk n.event survival std.err lower 95% CI upper 95% CI
## 1.25 20 1 0.95 0.0487 0.859 1.000
## 1.41 19 1 0.90 0.0671 0.778 1.000
## 4.98 18 1 0.85 0.0798 0.707 1.000
## 5.25 17 1 0.80 0.0894 0.643 0.996
## 5.38 16 1 0.75 0.0968 0.582 0.966
## 6.92 15 1 0.70 0.1025 0.525 0.933
## 8.89 14 1 0.65 0.1067 0.471 0.897
## 10.98 13 1 0.60 0.1095 0.420 0.858
## 11.18 12 1 0.55 0.1112 0.370 0.818
## 13.11 11 1 0.50 0.1118 0.323 0.775
## 13.21 10 1 0.45 0.1112 0.277 0.731
## 16.33 9 1 0.40 0.1095 0.234 0.684
## 19.77 8 1 0.35 0.1067 0.193 0.636
## 21.08 7 1 0.30 0.1025 0.154 0.586
## 22.07 5 1 0.24 0.0980 0.108 0.534
## 42.92 1 1 0.00 NaN NA NA

km_B <- survfit(surv_B ~ 1);km_B

## Call: survfit(formula = surv_B ~ 1)

##
## n events median 0.95LCL 0.95UCL
## [1,] 15 10 9.08 4.49 NA

summary(km_B)

## Call: survfit(formula = surv_B ~ 1)

##
## time n.risk n.event survival std.err lower 95% CI upper 95% CI
## 1.05 15 1 0.933 0.0644 0.815 1.000
## 2.92 14 1 0.867 0.0878 0.711 1.000
## 3.61 13 1 0.800 0.1033 0.621 1.000
## 4.20 12 1 0.733 0.1142 0.540 0.995
## 4.49 11 1 0.667 0.1217 0.466 0.953
## 6.72 10 1 0.600 0.1265 0.397 0.907
## 7.31 9 1 0.533 0.1288 0.332 0.856
## 9.08 8 1 0.467 0.1288 0.272 0.802
## 9.11 7 1 0.400 0.1265 0.215 0.743
## 16.85 5 1 0.320 0.1239 0.150 0.684

9
• Plot Kaplan-Meier curves

ggsurvplot_combine(
list("Group A" = km_A, "Group B" = km_B),
conf.int = TRUE, risk.table = TRUE,
xlab = "Time (Months)", ylab = "Survival Probability",
title = "Kaplan-Meier Estimates for Group A and B",
legend.title = "Groups"
)

Kaplan−Meier Estimates for Group A and B

Groups + Group A::All + Group B::All

1.00
Survival Probability

0.75

0.50
+
+ + + + +
0.25 ++ +

0.00
0 10 20 30 40
Time (Months)
Number at risk
Groups

Group A::All 20 13 7 4 1
Group B::All 15 6 3 2 1
0 10 20 30 40
Time (Months)

• Compute 95% confidence intervals for S(1) and median survival time

summary(km_A, times = 1.50) # Confidence limits for Group A at 1.50 months

## Call: survfit(formula = surv_A ~ 1)

##
## time n.risk n.event survival std.err lower 95% CI upper 95% CI
## 1.5 18 2 0.9 0.0671 0.778 1

summary(km_B, times = 1.50) # Confidence limits for Group B at 1.50 months

## Call: survfit(formula = surv_B ~ 1)

##
## time n.risk n.event survival std.err lower 95% CI upper 95% CI
## 1.5 14 1 0.933 0.0644 0.815 1

10
(2) Use plots of the Nelson-Aalen estimate (t) to examine and compare the two
life distributions.
• Nelson-Aalen estimates for cumulative hazard

na_A <- survfit(surv_A ~ 1, type = "fleming-harrington");na_A

## Call: survfit(formula = surv_A ~ 1, type = "fleming-harrington")

##
## n events median 0.95LCL 0.95UCL
## [1,] 20 16 13.2 8.89 NA

na_B <- survfit(surv_B ~ 1, type = "fleming-harrington");na_B

## Call: survfit(formula = surv_B ~ 1, type = "fleming-harrington")

##
## n events median 0.95LCL 0.95UCL
## [1,] 15 10 9.08 4.49 NA

• Plot Nelson-Aalen estimates

plot(na_A$time, -log(km_A$surv), type = "s", col = "blue", lwd = 2,

xlab = "Time (Months)", ylab = "Cumulative Hazard",
main = "Nelson-Aalen Estimates")
lines(na_B$time, -log(km_B$surv), type = "s", col = "red", lwd = 2)
legend("topleft", legend = c("Group A", "Group B"), col = c("blue", "red"), lwd = 2)

Nelson−Aalen Estimates

Group A
Group B
1.2
Cumulative Hazard

0.8
0.4
0.0

0 10 20 30 40

Time (Months)

11
--------------------------------------------------------------------------------------------------------------------

#Problem-3.5
Pike (1966) gave results of a laboratory experiment in which 19 female rats were painted with the carcinogen
DMBA. The number of days T until the appearance of a carcinoma was of interest, and the data gave the
following times (asterisks denote censoring limes):
143, 164, 188, 188, 190, 192, 206, 209, 213, 216, 220, 227, 230, 234, 246, 265,304, 216, 244*

1. It was thought that carcinomas could not appear before some threshold time > 0, so a Weibull model
for T’ Ty was considered. Give twa Weibull probability plots, using (1) the raw data (1-values), and
(2) the values 1-100. Is there any strong indication that T-100 is closer to Weibull-distributed than is
T?
2. Obtain a nonparametric .95 confidence interval for the median time to carcinoma, 1.50. Comment
on the advantages and disadvantages of this estimate over one based on a Weibull model.

(1) It was thought that carcinomas could not appear before some threshold time
> 0, so a Weibull model for T’ Ty was considered. Give twa Weibull probability
plots, using (1) the raw data (1-values), and (2) the values 1-100. Is there any
strong indication that T-100 is closer to Weibull-distributed than is T?
• Data: Observed times (T) and censoring indicator

times <- c(143, 164, 188, 188, 190, 192, 206, 209, 213, 216,
220, 227, 230, 234, 246, 265, 304, 216, 244)
times

## [1] 143 164 188 188 190 192 206 209 213 216 220 227 230 234 246 265 304 216 244

censored <- rep(1,length(times)) # Last two are censored

censored

## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

length(censored)

## [1] 19

censored[c(18,19)]<-0
length(censored)

## [1] 19

censored

## [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0

• Transform data for Weibull plot (T’ = T - 100)

12
times_transformed <- times - 100
times_transformed

## [1] 43 64 88 88 90 92 106 109 113 116 120 127 130 134 146 165 204 116 144

• Weibull probability plot for raw data

weibull_plot_raw <- qqplot(qweibull(ppoints(times), shape=1), times,

main = "Weibull Probability Plot (Raw Data)",
xlab = "Theoretical Quantiles", ylab = "Observed Data")
abline(lm(times ~ qweibull(ppoints(times), shape=1)), col = "red") # Fit line

Weibull Probability Plot (Raw Data)

300
Observed Data

250
200
150

0 1 2 3

Theoretical Quantiles

weibull_plot_raw

## $x
## [1] 0.02666825 0.08223810 0.14107860 0.20359896 0.27029033 0.34174929
## [7] 0.41871033 0.50209194 0.59306372 0.69314718 0.80437282 0.92953596
## [13] 1.07263680 1.23969089 1.44036158 1.69167601 2.02814825 2.53897387
## [19] 3.63758616
##
## $y
## [1] 143 164 188 188 190 192 206 209 213 216 216 220 227 230 234 244 246 265 304

• Weibull probability plot for transformed data (T’ = T - 100)

13
weibull_plot_transformed <- qqplot(qweibull(ppoints(times_transformed), shape=1),
times_transformed,
main = "Weibull Probability Plot (T' = T - 100)",
xlab = "Theoretical Quantiles", ylab = "Observed Data")
abline(lm(times_transformed ~ qweibull(ppoints(times_transformed), shape=1)), col = "blue")

Weibull Probability Plot (T' = T − 100)

200
Observed Data

150
100
50

0 1 2 3

Theoretical Quantiles

weibull_plot_transformed

## $x
## [1] 0.02666825 0.08223810 0.14107860 0.20359896 0.27029033 0.34174929
## [7] 0.41871033 0.50209194 0.59306372 0.69314718 0.80437282 0.92953596
## [13] 1.07263680 1.23969089 1.44036158 1.69167601 2.02814825 2.53897387
## [19] 3.63758616
##
## $y
## [1] 43 64 88 88 90 92 106 109 113 116 116 120 127 130 134 144 146 165 204

length(censored)

## [1] 19

14
(2) Obtain a nonparametric .95 confidence interval for the median time to car-
cinoma, 1.50. Comment on the advantages and disadvantages of this estimate
over one based on a Weibull model.

• Kaplan-Meier estimation for median survival

surv_obj <- Surv(times, censored)

surv_obj

## [1] 143 164 188 188 190 192 206 209 213 216 220 227 230 234 246
## [16] 265 304 216+ 244+

km_fit <- survfit(surv_obj ~ 1)

km_fit

## Call: survfit(formula = surv_obj ~ 1)

##
## n events median 0.95LCL 0.95UCL
## [1,] 19 17 216 206 265

summary(km_fit)

## Call: survfit(formula = surv_obj ~ 1)

##
## time n.risk n.event survival std.err lower 95% CI upper 95% CI
## 143 19 1 0.9474 0.0512 0.8521 1.000
## 164 18 1 0.8947 0.0704 0.7669 1.000
## 188 17 2 0.7895 0.0935 0.6259 0.996
## 190 15 1 0.7368 0.1010 0.5632 0.964
## 192 14 1 0.6842 0.1066 0.5041 0.929
## 206 13 1 0.6316 0.1107 0.4480 0.890
## 209 12 1 0.5789 0.1133 0.3946 0.850
## 213 11 1 0.5263 0.1145 0.3435 0.806
## 216 10 1 0.4737 0.1145 0.2949 0.761
## 220 8 1 0.4145 0.1145 0.2412 0.712
## 227 7 1 0.3553 0.1124 0.1911 0.661
## 230 6 1 0.2961 0.1082 0.1447 0.606
## 234 5 1 0.2368 0.1015 0.1023 0.548
## 246 3 1 0.1579 0.0934 0.0495 0.504
## 265 2 1 0.0789 0.0728 0.0130 0.481
## 304 1 1 0.0000 NaN NA NA

• 95% confidence interval for median time

km_ci <- quantile(km_fit, probs = c(0.5), conf.int = TRUE)

km_ci

## $quantile
## 50
## 216

15
##
## $lower
## 50
## 206
##
## $upper
## 50
## 265

• Kaplan-Meier plot

ggsurvplot(km_fit,
data = data.frame(times = times, censored = censored), # Add this line with the data
conf.int = TRUE,
xlab = "Time (Days)",
ylab = "Survival Probability",
title = "Kaplan-Meier Estimate of Carcinoma Time")

Kaplan−Meier Estimate of Carcinoma Time

Strata + All

1.00
Survival Probability

0.75

0.50 +

0.25 +

0.00
0 100 200 300
Time (Days)
--------------------------------------------------------------------------------------------------------------------

#Problem-3.14
The following data are survival times for 121 breast cancer patients treated over the period 1929-1938, quoted
in Boag (1949). Times are in months, and asterisks denote censoring times.

16
0.3, 0.3*, 4, 5, 5.6, 6.2 6.3, 6.6 6.8, 7.4*, 7.5, 8.4, 8.4, 10.3, 11, 11.8, 12.2, 12.3, 13.5, 14.4, 14.4, 14.8, 15.5*,
15.7, 16.2, 16.3, 16.5, 16.8, 17.2, 17.3, 17.5, 17.9, 19.8 20.4, 20.9, 21, 21, 21.1, 23, 23.4*, 23.6, 24, 24, 27.9,
28.2, 29.1, 30, 31, 31, 32, 35, 35, 37*, 37*, 37*, 38, 38*, 38*, 39*, 39*, 40, 40*, 40*, 41, 41, 41*, 42, 43**,
43*, 43*, 44, 45*, 45*, 46*, 46, 47*, 48, 49*, 51, 51, 51*, 52, 54, 55*, 56, 57*, 58**, 59**, 60, 60”, 60*, 61*,
62*, 65*, 65*, 67, 67, 68*, 69, 78, 80, 83*, 88*, 89, 90, 93, 96**, 103, 105*, 109**, 109*, 111*, 115*, 117,
125, 126, 127, 129*, 129* 139*, 154*,

1. Calculate the Kaplan-Meier estimate of the survivor function. Estimate 1- and 5-year survival proba-
bilities and give a standard error for these estimates.
2. Group the data into a life table with 1-year intervals. Compare the 1- and 5-year survival probability
estimates with those obtained in part (a),

(1) Calculate the Kaplan-Meier estimate of the survivor function. Estimate 1-

and 5-year survival probabilities and give a standard error for these estimates.

• Define the survival data (times and censoring status)

times <- c(0.3, 0.3, 4.0, 5.0, 5.6, 6.2, 6.3, 6.6, 6.8, 7.4, 7.5, 8.4, 8.4, 10.3, 11.0, 11.8, 12.2, 12.3
length(times)

## [1] 121

censored <- rep(1,length(times))

length(censored)

## [1] 121

censored[c(2,3,10,23,40,53,54,55,57,58,59,60,62,63,66,68,69,70,72,73,74,75,76,78,81,84,86,87,88,90,91,92

• Create a survival object

surv_obj <- Surv(times, censored)

• Kaplan-Meier estimate

km_fit <- survfit(surv_obj ~ 1)

km_fit

## Call: survfit(formula = surv_obj ~ 1)

##
## n events median 0.95LCL 0.95UCL
## [1,] 121 65 52 40 90

summary(km_fit)

17
## Call: survfit(formula = surv_obj ~ 1)
##
## time n.risk n.event survival std.err lower 95% CI upper 95% CI
## 0.3 121 1 0.992 0.00823 0.976 1.000
## 5.0 118 1 0.983 0.01169 0.961 1.000
## 5.6 117 1 0.975 0.01429 0.947 1.000
## 6.2 116 1 0.967 0.01646 0.935 0.999
## 6.3 115 1 0.958 0.01834 0.923 0.995
## 6.6 114 1 0.950 0.02001 0.911 0.990
## 6.8 113 1 0.941 0.02152 0.900 0.984
## 7.5 111 1 0.933 0.02294 0.889 0.979
## 8.4 110 2 0.916 0.02547 0.867 0.967
## 10.3 108 1 0.907 0.02660 0.857 0.961
## 11.0 107 1 0.899 0.02767 0.846 0.955
## 11.8 106 1 0.890 0.02868 0.836 0.948
## 12.2 105 1 0.882 0.02964 0.826 0.942
## 12.3 104 1 0.873 0.03054 0.816 0.935
## 13.5 103 1 0.865 0.03140 0.806 0.929
## 14.4 102 2 0.848 0.03300 0.786 0.915
## 14.8 100 1 0.840 0.03374 0.776 0.908
## 15.7 98 1 0.831 0.03446 0.766 0.901
## 16.2 97 1 0.822 0.03516 0.756 0.894
## 16.3 96 1 0.814 0.03582 0.747 0.887
## 16.5 95 1 0.805 0.03645 0.737 0.880
## 16.8 94 1 0.797 0.03706 0.727 0.873
## 17.2 93 1 0.788 0.03764 0.718 0.865
## 17.3 92 1 0.780 0.03819 0.708 0.858
## 17.5 91 1 0.771 0.03872 0.699 0.851
## 17.9 90 1 0.762 0.03923 0.689 0.843
## 19.8 89 1 0.754 0.03971 0.680 0.836
## 20.4 88 1 0.745 0.04017 0.671 0.828
## 20.9 87 1 0.737 0.04061 0.661 0.821
## 21.0 86 2 0.720 0.04144 0.643 0.806
## 21.1 84 1 0.711 0.04182 0.634 0.798
## 23.0 83 1 0.702 0.04218 0.624 0.790
## 23.6 81 1 0.694 0.04254 0.615 0.782
## 24.0 80 2 0.676 0.04321 0.597 0.767
## 27.9 78 1 0.668 0.04352 0.588 0.759
## 28.2 77 1 0.659 0.04381 0.579 0.751
## 29.1 76 1 0.650 0.04408 0.570 0.743
## 30.0 75 1 0.642 0.04434 0.560 0.735
## 31.0 74 2 0.624 0.04481 0.542 0.719
## 32.0 72 1 0.616 0.04502 0.534 0.711
## 35.0 71 2 0.598 0.04539 0.516 0.694
## 38.0 66 1 0.589 0.04560 0.506 0.686
## 40.0 61 1 0.580 0.04586 0.496 0.677
## 41.0 58 2 0.560 0.04641 0.476 0.658
## 42.0 55 1 0.550 0.04667 0.465 0.649
## 44.0 51 1 0.539 0.04698 0.454 0.639
## 48.0 45 1 0.527 0.04744 0.442 0.628
## 51.0 43 2 0.502 0.04829 0.416 0.606
## 52.0 40 1 0.490 0.04869 0.403 0.595
## 54.0 39 1 0.477 0.04903 0.390 0.584
## 56.0 37 1 0.464 0.04937 0.377 0.572

18
## 60.0 33 1 0.450 0.04984 0.362 0.559
## 78.0 22 1 0.430 0.05161 0.340 0.544
## 80.0 21 1 0.409 0.05305 0.317 0.528
## 89.0 18 1 0.387 0.05476 0.293 0.510
## 90.0 17 1 0.364 0.05606 0.269 0.492
## 126.0 6 1 0.303 0.07243 0.190 0.484

• Plot the Kaplan-Meier curve

ggsurvplot(km_fit,
data = data.frame(times = times, censored = censored), conf.int = TRUE,
xlab = "Time (Months)", ylab = "Survival Probability",
title = "Kaplan-Meier Estimate of Survival",
risk.table = TRUE)

Kaplan−Meier Estimate of Survival

Strata + All

1.00 +++
Survival Probability

+
0.75 +
++++++
0.50 +++++
+++++++++++
+ + ++ ++ ++ ++ +
++ + +
0.25

0.00
0 50 100 150
Time (Months)
Number at risk
Strata

All 121 43 14 1
0 50 100 150
Time (Months)

• Calculate the 1-year (12 months) and 5-year (60 months) survival probabilities

km_fit$surv

## [1] 0.9917355 0.9917355 0.9833310 0.9749265 0.9665219 0.9581174 0.9497128

## [8] 0.9413083 0.9413083 0.9328281 0.9158675 0.9073873 0.8989070 0.8904268
## [15] 0.8819465 0.8734663 0.8649860 0.8480255 0.8395452 0.8395452 0.8309785
## [22] 0.8224117 0.8138449 0.8052781 0.7967113 0.7881445 0.7795777 0.7710109
## [29] 0.7624442 0.7538774 0.7453106 0.7367438 0.7196102 0.7110434 0.7024766

19
## [36] 0.7024766 0.6938041 0.6764590 0.6677864 0.6591139 0.6504413 0.6417688
## [43] 0.6244237 0.6157511 0.5984060 0.5984060 0.5893393 0.5893393 0.5796780
## [50] 0.5596891 0.5495129 0.5495129 0.5387381 0.5387381 0.5387381 0.5387381
## [57] 0.5267662 0.5267662 0.5022654 0.4897088 0.4771522 0.4771522 0.4642562
## [64] 0.4642562 0.4642562 0.4642562 0.4501878 0.4501878 0.4501878 0.4501878
## [71] 0.4501878 0.4501878 0.4501878 0.4297247 0.4092616 0.4092616 0.4092616
## [78] 0.3865249 0.3637881 0.3637881 0.3637881 0.3637881 0.3637881 0.3637881
## [85] 0.3637881 0.3637881 0.3637881 0.3637881 0.3031568 0.3031568 0.3031568
## [92] 0.3031568 0.3031568

km_fit$time

## [1] 0.3 4.0 5.0 5.6 6.2 6.3 6.6 6.8 7.4 7.5 8.4 10.3
## [13] 11.0 11.8 12.2 12.3 13.5 14.4 14.8 15.5 15.7 16.2 16.3 16.5
## [25] 16.8 17.2 17.3 17.5 17.9 19.8 20.4 20.9 21.0 21.1 23.0 23.4
## [37] 23.6 24.0 27.9 28.2 29.1 30.0 31.0 32.0 35.0 37.0 38.0 39.0
## [49] 40.0 41.0 42.0 43.0 44.0 45.0 46.0 47.0 48.0 49.0 51.0 52.0
## [61] 54.0 55.0 56.0 57.0 58.0 59.0 60.0 61.0 62.0 65.0 67.0 68.0
## [73] 69.0 78.0 80.0 83.0 88.0 89.0 90.0 93.0 96.0 103.0 105.0 109.0
## [85] 111.0 115.0 117.0 125.0 126.0 127.0 129.0 139.0 154.0

km_fit$surv[km_fit$time == 12] # 1-year survival

## numeric(0)

km_fit$surv[km_fit$time == 60] # 5-year survival

## [1] 0.4501878

(2) Group the data into a life table with 1-year intervals. Compare the 1- and
5-year survival probability estimates with those obtained in part (a),

• Using the survfit object to calculate the survival at specific time points
• This can be used to generate life table estimates

summary(km_fit, times = seq(1, 5, by = 1)) # For 1-year intervals

## Call: survfit(formula = surv_obj ~ 1)

##
## time n.risk n.event survival std.err lower 95% CI upper 95% CI
## 1 119 1 0.992 0.00823 0.976 1
## 2 119 0 0.992 0.00823 0.976 1
## 3 119 0 0.992 0.00823 0.976 1
## 4 119 0 0.992 0.00823 0.976 1
## 5 118 1 0.983 0.01169 0.961 1

Hospital Designing and Planning
81% (27)
Hospital Designing and Planning
55 pages
How To Perform SVAR in R: Mohamed Nachid Boussiala
0% (1)
How To Perform SVAR in R: Mohamed Nachid Boussiala
11 pages
Minutely
No ratings yet
Minutely
1 page
The Physiology of Training
94% (16)
The Physiology of Training
250 pages
Chapter 6. Product Limit Stimator Peter Smith: Required Libraries
No ratings yet
Chapter 6. Product Limit Stimator Peter Smith: Required Libraries
9 pages
SMG Estimation of Survival Function
No ratings yet
SMG Estimation of Survival Function
8 pages
Estimation of Survival Function
No ratings yet
Estimation of Survival Function
9 pages
Shailesh020902@gmail - Com 1
No ratings yet
Shailesh020902@gmail - Com 1
1 page
ReliabilityHw1 R26104047
No ratings yet
ReliabilityHw1 R26104047
4 pages
Demo Notebook
No ratings yet
Demo Notebook
13 pages
Homework 5
No ratings yet
Homework 5
104 pages
Problems: Log (Q) VS T
No ratings yet
Problems: Log (Q) VS T
8 pages
DSP_Lec8
No ratings yet
DSP_Lec8
12 pages
Caderno 2 - Exercícios 5 A 11
No ratings yet
Caderno 2 - Exercícios 5 A 11
16 pages
Simulation in Operation Research: Presented By
No ratings yet
Simulation in Operation Research: Presented By
42 pages
HW5
No ratings yet
HW5
9 pages
Image Classifications
No ratings yet
Image Classifications
4 pages
TP.ipynb - Colab
No ratings yet
TP.ipynb - Colab
6 pages
Analysis Course HW3
No ratings yet
Analysis Course HW3
12 pages
NNHOUSEPRICEDATA - Ipynb - Colab
No ratings yet
NNHOUSEPRICEDATA - Ipynb - Colab
3 pages
soruma-SECOND-ASSEsiment l reg
No ratings yet
soruma-SECOND-ASSEsiment l reg
33 pages
C2M4 - Assignment: 1 Cox Proportional Hazards and Random Survival Forests
No ratings yet
C2M4 - Assignment: 1 Cox Proportional Hazards and Random Survival Forests
18 pages
Simple Statistics Functions in R
No ratings yet
Simple Statistics Functions in R
41 pages
Handout 02
No ratings yet
Handout 02
12 pages
Jamboree
No ratings yet
Jamboree
56 pages
python-Copy1
No ratings yet
python-Copy1
5 pages
E21CSEU0770 Lab4
No ratings yet
E21CSEU0770 Lab4
4 pages
Soruma SECOND ASSEsiment Final l Reg
No ratings yet
Soruma SECOND ASSEsiment Final l Reg
34 pages
Teste Do Dia 15 01 2024
No ratings yet
Teste Do Dia 15 01 2024
3 pages
Tablas Estadisticas 10A
No ratings yet
Tablas Estadisticas 10A
1 page
seaborn
No ratings yet
seaborn
1 page
Logistics Regression
100% (1)
Logistics Regression
5 pages
R Code Default Data PDF
No ratings yet
R Code Default Data PDF
10 pages
1728086737277
No ratings yet
1728086737277
26 pages
Chapter 9 Cox Proportional Hazards: Library (Survival)
No ratings yet
Chapter 9 Cox Proportional Hazards: Library (Survival)
10 pages
Statistical Process Control: Managing For Quality Dr. Ron Lembke
No ratings yet
Statistical Process Control: Managing For Quality Dr. Ron Lembke
39 pages
Yaikob Second Assesiment Final
No ratings yet
Yaikob Second Assesiment Final
33 pages
HASRITH ML LAB 10 ASSIGNMENT - Jupyter Notebook (1)
No ratings yet
HASRITH ML LAB 10 ASSIGNMENT - Jupyter Notebook (1)
8 pages
Analysis Course HW2
No ratings yet
Analysis Course HW2
13 pages
Hypothesis Testing PDF
No ratings yet
Hypothesis Testing PDF
9 pages
HW2_solution_Fall2024
No ratings yet
HW2_solution_Fall2024
9 pages
Gestión de Carteras Mapa de Calor
No ratings yet
Gestión de Carteras Mapa de Calor
1 page
SH Assignment
No ratings yet
SH Assignment
6 pages
Cronbachs Alpha Code
No ratings yet
Cronbachs Alpha Code
6 pages
Heart FailureDataset ML Algorithms
No ratings yet
Heart FailureDataset ML Algorithms
10 pages
SH ASSIGNMENT 1 21CE01032
No ratings yet
SH ASSIGNMENT 1 21CE01032
6 pages
Assignment 11
100% (1)
Assignment 11
7 pages
Ujian Ke 4 Pengendalian Proses Tanggal 20 Mei 2017 Nama:Muhammad Arief Nugraha NRP: 142014066 Kelas: A Kode: 4
No ratings yet
Ujian Ke 4 Pengendalian Proses Tanggal 20 Mei 2017 Nama:Muhammad Arief Nugraha NRP: 142014066 Kelas: A Kode: 4
63 pages
Heart Failure Prediction
100% (1)
Heart Failure Prediction
41 pages
ML program 7 ,8,9 and10
No ratings yet
ML program 7 ,8,9 and10
12 pages
Analisis Dinamico Eje X
No ratings yet
Analisis Dinamico Eje X
24 pages
Pool
No ratings yet
Pool
13 pages
Lista 4 - Mineração de Dados
No ratings yet
Lista 4 - Mineração de Dados
15 pages
Lab 4
No ratings yet
Lab 4
21 pages
Practice - DL - Ipynb - Colaboratory
No ratings yet
Practice - DL - Ipynb - Colaboratory
14 pages
STOCK - MARKET - PROJECT - Jupyter Notebook
No ratings yet
STOCK - MARKET - PROJECT - Jupyter Notebook
24 pages
Practical - 5 - 52
No ratings yet
Practical - 5 - 52
4 pages
Econ209 f2024 Lab 4 Truong Gia Han
No ratings yet
Econ209 f2024 Lab 4 Truong Gia Han
11 pages
I Avaliação Parcial - 25.0 PTS - Gabarito
No ratings yet
I Avaliação Parcial - 25.0 PTS - Gabarito
9 pages
Simple Linear regression-LAB4.ipynb - Colaboratory
No ratings yet
Simple Linear regression-LAB4.ipynb - Colaboratory
6 pages
Practical 4
No ratings yet
Practical 4
3 pages
Summer Bridge Math, Grades 1 - 2
From Everand
Summer Bridge Math, Grades 1 - 2
Summer Bridge Activities
No ratings yet
DVR Crosslock Distal Radius Plating System Surgical Technique
No ratings yet
DVR Crosslock Distal Radius Plating System Surgical Technique
28 pages
Anti Arrhythmic Drugs
No ratings yet
Anti Arrhythmic Drugs
67 pages
Lecture Handout
No ratings yet
Lecture Handout
4 pages
General Instructions For Preparation of The Curriculum Vitae Harvard Medical School/Harvard School of Dental Medicine
No ratings yet
General Instructions For Preparation of The Curriculum Vitae Harvard Medical School/Harvard School of Dental Medicine
20 pages
Large Fragment Locking Compression Plate (LCP) : Technique Guide
No ratings yet
Large Fragment Locking Compression Plate (LCP) : Technique Guide
23 pages
Occupation Based Assessment
No ratings yet
Occupation Based Assessment
7 pages
Efficacy of Makaradhwaja On Madhumehatype2 Diabetes A Review Throughayurvedic Studies
No ratings yet
Efficacy of Makaradhwaja On Madhumehatype2 Diabetes A Review Throughayurvedic Studies
10 pages
Full Checklist
No ratings yet
Full Checklist
3 pages
Jordan 1
No ratings yet
Jordan 1
5 pages
Diabetes Case Study and Worksheets For mnt1
No ratings yet
Diabetes Case Study and Worksheets For mnt1
5 pages
Pentacam Four Maps Refractive
No ratings yet
Pentacam Four Maps Refractive
4 pages
Pulmonology Drug Names
No ratings yet
Pulmonology Drug Names
1 page
Effects of Alcohol Report
No ratings yet
Effects of Alcohol Report
5 pages
Eclipta Alba Bhringraj Benefits and Usages
No ratings yet
Eclipta Alba Bhringraj Benefits and Usages
3 pages
The Dry Eye
No ratings yet
The Dry Eye
8 pages
CellTiter 96 Non-Radioactive Cell Proliferation Assay
No ratings yet
CellTiter 96 Non-Radioactive Cell Proliferation Assay
10 pages
Philips HeartStart FR3 AED
No ratings yet
Philips HeartStart FR3 AED
6 pages
ECRI 2023 Top 10 Hazards Executive Brief
No ratings yet
ECRI 2023 Top 10 Hazards Executive Brief
18 pages
Psych c61 Midterm2 Review
No ratings yet
Psych c61 Midterm2 Review
26 pages
014 PDF
No ratings yet
014 PDF
21 pages
Hematology - Misc: TEST(s) Normal UNIT(s)
No ratings yet
Hematology - Misc: TEST(s) Normal UNIT(s)
1 page
ComingClean InternetVersionSec.
No ratings yet
ComingClean InternetVersionSec.
168 pages
South Central Railway
No ratings yet
South Central Railway
4 pages
Concept Map
No ratings yet
Concept Map
1 page
Pathological Delivery
No ratings yet
Pathological Delivery
44 pages
Effects of Moringa (Moringa Oleifera) Leave On Electrolyte in Rats Fed Crude Oil Contaminated Diet
No ratings yet
Effects of Moringa (Moringa Oleifera) Leave On Electrolyte in Rats Fed Crude Oil Contaminated Diet
10 pages
CASO 5 Úlcera Péptica Tratamiento Médico y Quirúrgico
No ratings yet
CASO 5 Úlcera Péptica Tratamiento Médico y Quirúrgico
9 pages
BT Discourse Cohesion SV
No ratings yet
BT Discourse Cohesion SV
3 pages