PSsurvival: Propensity Score Weighting for Survival Analysis

1. Introduction

PSsurvival implements propensity score weighting methods for causal inference with time-to-event outcomes. The package provides three main functions:

surveff(): Estimates counterfactual survival functions and survival differences
marCoxph(): Estimates marginal hazard ratios via weighted Cox regression
weightedKM(): Estimates weighted Kaplan-Meier and cumulative risk curves

All functions support binary and multiple treatment groups, multiple weighting schemes (IPW, overlap weights, ATT), and propensity score trimming.

library(PSsurvival)

2. Data for Vignette

The package includes two simulated datasets:

data(simdata_bin)   # Binary treatment (A vs B)
data(simdata_multi) # Four treatment groups (A, B, C, D)

# Binary treatment data
head(simdata_bin)
#>        X1      X2      X3 B1 B2 Z    time event
#> 1  0.9820 -0.0517  1.0119  0  1 A 10.0035     1
#> 2  0.4687 -0.4994 -1.2844  0  0 A  0.4150     0
#> 3 -0.1080 -0.9025 -0.5866  1  0 B  7.6058     1
#> 4 -0.2129 -0.3720 -2.3708  0  0 B 10.0073     0
#> 5  1.1581 -0.1884 -0.3767  1  1 B 17.3095     1
#> 6  1.2924 -0.7366 -2.0261  0  1 A 14.5914     1
table(simdata_bin$Z)
#> 
#>   A   B 
#> 408 592

# Multiple treatment data
table(simdata_multi$Z)
#> 
#>   A   B   C   D 
#> 244 188 208 360

Both datasets contain:

X1, X2, X3: Continuous covariates
B1, B2: Binary covariates
Z: Treatment group
time: Follow-up time (range 0-20)
event: Event indicator (1 = event, 0 = censored)

3. `surveff()`: Counterfactual Survival Probabilities and Differences

Key Arguments

Data and model specification:

data: Data frame containing treatment, outcome, and covariates
ps_formula: Propensity score model (treatment ~ covariates)
censoring_formula: Censoring model (Surv(time, event) ~ covariates)
eval_times: Time points for evaluation (default: all unique event times)
contrast_matrix: Matrix for pairwise comparisons (multiple groups only; column names must match treatment levels)

Weighting and trimming:

weight_method: Weighting method (“IPW”, “OW”, or “ATT”)
att_group: Target group for ATT (required if weight_method = "ATT")
trim: Logical, whether to apply symmetric trimming (default: FALSE)
delta: Threshold for symmetric trimming (default: NULL for automatic selection)

Variance estimation:

censoring_method: Method for censoring adjustment (“weibull” or “cox”)
variance_method: Variance estimation (“analytical” or “bootstrap”)
boot_level: Bootstrap sampling level (“full” or “strata”)
B: Number of bootstrap iterations
parallel, mc.cores, seed: Parallel computing and reproducibility

Basic Usage: Binary Treatment

result_bin <- surveff(
  data = simdata_bin,
  ps_formula = Z ~ X1 + X2 + X3 + B1 + B2,
  censoring_formula = survival::Surv(time, event) ~ X1 + B1,
  weight_method = "OW",
  censoring_method = "weibull"
)

print(result_bin, max.len = 3)
#> 
#> === Survival Effect Estimation ===
#> 
#> Treatment groups: A, B 
#> Number of groups: 2 
#> Estimand: overlap 
#> Censoring method: weibull 
#> Variance method: analytical 
#> Evaluation times: 666 time points
#> 
#> Survival function estimates:
#>           A      B
#> [1,] 0.9979 1.0000
#> [2,] 0.9960 1.0000
#> [3,] 0.9960 0.9973
#>   ... (663 more rows not shown)
#> 
#> Treatment effect estimates:
#>      B vs A
#> [1,] 0.0021
#> [2,] 0.0040
#> [3,] 0.0013
#>   ... (663 more rows not shown)

The ps_formula specifies the propensity score model (treatment ~ covariates). The censoring_formula specifies the censoring models (Weibull or Cox) within each treatment group for estimating the censoring scores.

Multiple Treatment Groups

For multiple groups, survival differences require a contrast_matrix specifying which comparisons to make. Each row represents one contrast with exactly two non-zero elements: 1 and -1. Column names must match treatment levels exactly.

# Define pairwise comparisons
contrast_mat <- matrix(
  c(1, -1, 0, 0,   # A vs B
    1, 0, -1, 0,   # A vs C
    1, 0, 0, -1),  # A vs D
  nrow = 3, byrow = TRUE
)
colnames(contrast_mat) <- c("A", "B", "C", "D")  # Must match treatment levels
rownames(contrast_mat) <- c("A vs B", "A vs C", "A vs D")

result_multi <- surveff(
  data = simdata_multi,
  ps_formula = Z ~ X1 + X2 + X3 + B1 + B2,
  censoring_formula = survival::Surv(time, event) ~ X1 + B1,
  weight_method = "IPW",
  contrast_matrix = contrast_mat,
  censoring_method = "cox",
  variance_method = "bootstrap",
  B = 50,
  seed = 123
)

print(result_multi, max.len = 3)
#> 
#> === Survival Effect Estimation ===
#> 
#> Treatment groups: A, B, C, D 
#> Number of groups: 4 
#> Estimand: ATE 
#> Censoring method: cox 
#> Variance method: bootstrap 
#> Evaluation times: 667 time points
#> 
#> Survival function estimates:
#>           A      B C D
#> [1,] 1.0000 0.9945 1 1
#> [2,] 0.9961 0.9945 1 1
#> [3,] 0.9917 0.9945 1 1
#>   ... (664 more rows not shown)
#> 
#> Treatment effect estimates:
#>       A vs B  A vs C  A vs D
#> [1,]  0.0055  0.0000  0.0000
#> [2,]  0.0016 -0.0039 -0.0039
#> [3,] -0.0028 -0.0083 -0.0083
#>   ... (664 more rows not shown)

Without contrast_matrix, only group-specific survival functions are estimated (no differences in survivals).

# with symmetric trimming
result_multi_trimmed <- surveff(
  data = simdata_multi,
  ps_formula = Z ~ X1 + X2 + X3 + B1 + B2,
  censoring_formula = survival::Surv(time, event) ~ X1 + B1,
  weight_method = "IPW",
  trim = TRUE,
  delta = 0.15,
  contrast_matrix = contrast_mat,
  censoring_method = "cox",
  variance_method = "bootstrap",
  B = 50,
  seed = 123
)
#> Excluding 251 trimmed observations (weights == 0) from calculations.

print(result_multi_trimmed, max.len = 3)
#> 
#> === Survival Effect Estimation ===
#> 
#> Treatment groups: A, B, C, D 
#> Number of groups: 4 
#> Estimand: ATE 
#> Censoring method: cox 
#> Variance method: bootstrap 
#> Evaluation times: 491 time points
#> 
#> Survival function estimates:
#>           A      B C D
#> [1,] 1.0000 0.9928 1 1
#> [2,] 0.9949 0.9928 1 1
#> [3,] 0.9896 0.9928 1 1
#>   ... (488 more rows not shown)
#> 
#> Treatment effect estimates:
#>       A vs B  A vs C  A vs D
#> [1,]  0.0072  0.0000  0.0000
#> [2,]  0.0020 -0.0051 -0.0051
#> [3,] -0.0032 -0.0104 -0.0104
#>   ... (488 more rows not shown)

Weighting Methods

Three weighting methods are supported:

IPW: Inverse probability (of treatment) weighting, targets average treatment effect (ATE) for the entire population
ATT: Propensity score weighting tilted to a specific treatment group (requires att_group)
OW: Overlap weighting, targets the overlap population at clinical equipoise where treatment groups have most similar covariates

# IPW (targets ATE)
surveff(..., weight_method = "IPW")

# ATT targeting group B
surveff(..., weight_method = "ATT", att_group = "B")

# Overlap weights
surveff(..., weight_method = "OW")

Propensity Score Trimming

Symmetric trimming excludes observations with extreme propensity scores. Not available with overlap weights (which already downweight extremes). The Crump extension is used for multiple treatments.

# Symmetric trimming with default delta (automatic selection)
surveff(..., weight_method = "IPW", trim = TRUE)

# Symmetric trimming with custom delta
surveff(..., weight_method = "IPW", trim = TRUE, delta = 0.1)

Censoring Methods

Two methods are available for estimating censoring distributions:

"weibull" (default): Parametric Weibull AFT model. Efficient when censoring follows Weibull distribution. Allows analytical variance for binary treatment.
"cox": Semi-parametric Cox proportional hazards model. More flexible but requires bootstrap variance.

# Weibull censoring model
surveff(..., censoring_method = "weibull")

# Cox censoring model (requires bootstrap)
surveff(..., censoring_method = "cox", variance_method = "bootstrap", B = 200)

No censoring adjustment: To skip censoring adjustment entirely (e.g., when there is no censoring), use censoring_formula = Surv(time, event) ~ 0. In this case, all censoring scores are set to 1.

Variance Estimation

Two variance estimation methods are available:

"analytical": M-estimation theory-based variance. Only available for binary treatment with Weibull censoring. Fast and efficient when applicable.
"bootstrap": Resamples the entire estimation pipeline. Required for Cox censoring or multiple treatment groups. More computationally intensive but broadly applicable.

# Analytical variance (binary + weibull only)
surveff(..., censoring_method = "weibull", variance_method = "analytical")

# Bootstrap variance
surveff(..., variance_method = "bootstrap", B = 200)

Bootstrap sampling level (boot_level): Controls how bootstrap samples are drawn.

"full" (default): Resamples from entire dataset. Standard choice for observational studies.
"strata": Resamples within treatment groups, preserving group sizes. Useful when treatment assignment follows a stratified or fixed-ratio design.

# Stratified bootstrap
surveff(..., variance_method = "bootstrap", boot_level = "strata", B = 200)

Summary Output

The summary() method provides two output styles:

Printed output (style = "prints", default): Displays formatted tables with confidence intervals. Customize with:

conf_level: Confidence level (default 0.95)
max.len: Maximum rows to display
round.digits: Number of decimal places

Programmatic output (style = "returns"): Returns a list of matrices for further processing:

survival_summary: List of survival estimates by group
difference_summary: List of treatment effect estimates

summary(result_bin, conf_level = 0.95, max.len = 5)
#> 
#> === Survival Effect Estimation Summary ===
#> 
#> Treatment groups: A, B 
#> Number of groups: 2 
#> Sample size: 1000 complete cases
#> Estimand: overlap 
#> Censoring method: weibull 
#> Variance method: analytical 
#> Evaluation times: 666 time points
#> Confidence level: 0.95 
#> 
#> --- Survival Function Estimates ---
#> 
#> Group A :
#>    Time Estimate     SE 95%CI.lower 95%CI.upper
#>  0.0492   0.9979 0.0021      0.9937           1
#>  0.0655   0.9960 0.0028      0.9905           1
#>  0.1875   0.9960 0.0028      0.9905           1
#>  0.3118   0.9960 0.0028      0.9905           1
#>  0.3199   0.9960 0.0028      0.9905           1
#>   ... (661 more rows not shown)
#> 
#> Group B :
#>    Time Estimate     SE 95%CI.lower 95%CI.upper
#>  0.0492   1.0000 0.0000      1.0000           1
#>  0.0655   1.0000 0.0000      1.0000           1
#>  0.1875   0.9973 0.0027      0.9920           1
#>  0.3118   0.9953 0.0033      0.9888           1
#>  0.3199   0.9938 0.0037      0.9866           1
#>   ... (661 more rows not shown)
#> 
#> --- Treatment Effect Estimates ---
#> 
#> B vs A :
#>    Time Estimate     SE 95%CI.lower 95%CI.upper
#>  0.0492   0.0021 0.0021     -0.0020      0.0063
#>  0.0655   0.0040 0.0028     -0.0015      0.0095
#>  0.1875   0.0013 0.0039     -0.0064      0.0089
#>  0.3118  -0.0007 0.0044     -0.0092      0.0079
#>  0.3199  -0.0022 0.0046     -0.0113      0.0068
#>   ... (661 more rows not shown)

For programmatic access:

summ <- summary(result_bin, style = "returns")
names(summ)
#> [1] "survival_summary"   "difference_summary"
head(summ$survival_summary$A)
#>     Time  Estimate          SE 95%CI.lower 95%CI.upper
#> 1 0.0492 0.9978679 0.002126095   0.9937008           1
#> 2 0.0655 0.9960009 0.002822286   0.9904693           1
#> 3 0.1875 0.9960009 0.002822286   0.9904693           1
#> 4 0.3118 0.9960009 0.002822286   0.9904693           1
#> 5 0.3199 0.9960009 0.002822286   0.9904693           1
#> 6 0.3958 0.9934268 0.003811892   0.9859556           1
head(summ$difference_summary$`B vs A`)
#>     Time      Estimate          SE  95%CI.lower 95%CI.upper
#> 1 0.0492  0.0021321189 0.002126095 -0.002034951 0.006299189
#> 2 0.0655  0.0039991123 0.002822286 -0.001532467 0.009530692
#> 3 0.1875  0.0012872620 0.003897586 -0.006351866 0.008926390
#> 4 0.3118 -0.0006676973 0.004358730 -0.009210651 0.007875257
#> 5 0.3199 -0.0022366401 0.004632447 -0.011316070 0.006842790
#> 6 0.3958  0.0003374480 0.005295230 -0.010041012 0.010715908

Plotting Results

The plot() method produces survival curves (type = "surv") or treatment effect curves (type = "survdiff").

# Survival curves for all groups
plot(result_multi, type = "surv")

# Treatment effect curves
plot(result_multi, type = "survdiff")

Selecting specific strata: Use strata_to_plot to display a subset of groups or contrasts.

# Only groups A and C
plot(result_multi, type = "surv",
     strata_to_plot = c("A", "C"))

# Only specific contrasts (names must match contrast_matrix rownames exactly)
plot(result_multi, type = "survdiff",
     strata_to_plot = c("A vs B", "A vs D"))

Customization options:

strata_to_plot: Vector of strata names to display
strata_colors: Custom colors for each stratum (length must match strata_to_plot)
max_time: Maximum time on x-axis
include_CI: Show confidence interval ribbons (TRUE/FALSE)
conf_level: Confidence level for intervals
curve_width: Line width for curves
CI_alpha: Transparency of CI ribbons (0-1)
legend_position: “right” or “bottom”
legend_title, plot_title: Custom titles

plot(result_multi,
     type = "surv",
     strata_to_plot = c("A", "B"),
     strata_colors = c("steelblue", "coral"),
     max_time = 15,
     include_CI = TRUE,
     legend_position = "bottom",
     plot_title = "Survival by Treatment Group")

4. `marCoxph()`: Marginal Hazard Ratios by Cox-PH model

marCoxph() fits a weighted Cox model to estimate marginal hazard ratios.

Key Arguments

Data and model specification:

data: Data frame containing treatment, outcome, and covariates
ps_formula: Propensity score model (treatment ~ covariates)
time_var: Name of time-to-event variable (character string)
event_var: Name of event indicator variable (character string, 1=event, 0=censored)
reference_level: Reference treatment level in Cox model (MANDATORY)

Weighting and trimming:

weight_method: Weighting method (“IPW”, “OW”, or “ATT”)
att_group: Target group for ATT (required if weight_method = "ATT")
trim: Logical, whether to apply symmetric trimming (default: FALSE)
delta: Threshold for symmetric trimming (default: NULL for automatic selection)

Variance estimation:

variance_method: “bootstrap” (default) or “robust”
boot_level: Bootstrap sampling level (“full” or “strata”)
B: Number of bootstrap iterations
parallel, mc.cores, seed: Parallel computing and reproducibility
robust: Use robust variance in Cox model (default TRUE)

Basic Usage: Binary Treatment

hr_result <- marCoxph(
  data = simdata_bin,
  ps_formula = Z ~ X1 + X2 + X3 + B1 + B2,
  time_var = "time",
  event_var = "event",
  reference_level = "A",
  weight_method = "OW"
)

print(hr_result)
#> 
#> === Marginal Cox Proportional Hazards Model ===
#> 
#> Treatment variable: Z 
#> Treatment levels: A, B 
#> Reference level: A 
#> Number of groups: 2 
#> Sample size: 1000 complete cases
#> Estimand: overlap 
#> Variance method: bootstrap-full 
#> Bootstrap iterations: 100 
#> Trimming: None
#> 
#> Sample sizes used in Cox model:
#>  treatment   n events
#>          A 408    250
#>          B 592    416
#> 
#> Log hazard ratio estimates:
#>  treatment  logHR SE_robust SE_bootstrap
#>        Z:B 0.5262    0.0842       0.0844
#> 
#> Note: Use summary() for confidence intervals and exponentiated hazard ratios.

The reference_level argument is mandatory and specifies the baseline group for hazard ratio comparisons.

Multiple Treatment Groups

hr_multi <- marCoxph(
  data = simdata_multi,
  ps_formula = Z ~ X1 + X2 + X3 + B1 + B2,
  time_var = "time",
  event_var = "event",
  reference_level = "A",
  weight_method = "IPW",
  variance_method = "bootstrap",
  B = 50,
  seed = 456
)

print(hr_multi)
#> 
#> === Marginal Cox Proportional Hazards Model ===
#> 
#> Treatment variable: Z 
#> Treatment levels: A, B, C, D 
#> Reference level: A 
#> Number of groups: 4 
#> Sample size: 1000 complete cases
#> Estimand: ATE 
#> Variance method: bootstrap-full 
#> Bootstrap iterations: 50 
#> Trimming: None
#> 
#> Sample sizes used in Cox model:
#>  treatment   n events
#>          A 244    128
#>          B 188    133
#>          C 208    132
#>          D 360    275
#> 
#> Log hazard ratio estimates:
#>  treatment  logHR SE_robust SE_bootstrap
#>        Z:B 0.5371    0.1320       0.1294
#>        Z:C 0.1766    0.1287       0.1303
#>        Z:D 0.8116    0.1130       0.1106
#> 
#> Note: Use summary() for confidence intervals and exponentiated hazard ratios.

hr_multi_trimmed <- marCoxph(
  data = simdata_multi,
  ps_formula = Z ~ X1 + X2 + X3 + B1 + B2,
  time_var = "time",
  event_var = "event",
  reference_level = "A",
  weight_method = "IPW",
  trim = TRUE,
  delta = 0.15,
  variance_method = "bootstrap",
  B = 50,
  seed = 456
)

print(hr_multi_trimmed)
#> 
#> === Marginal Cox Proportional Hazards Model ===
#> 
#> Treatment variable: Z 
#> Treatment levels: A, B, C, D 
#> Reference level: A 
#> Number of groups: 4 
#> Sample size: 1000 complete cases
#> Estimand: ATE 
#> Variance method: bootstrap-full 
#> Bootstrap iterations: 50 
#> Trimming: symmetric (delta = 0.15)
#> 
#> Sample sizes used in Cox model:
#>  treatment   n events
#>          A 190     93
#>          B 142     99
#>          C 156     97
#>          D 261    203
#> 
#> Log hazard ratio estimates:
#>  treatment  logHR SE_robust SE_bootstrap
#>        Z:B 0.5637    0.1489       0.1879
#>        Z:C 0.2315    0.1523       0.2169
#>        Z:D 0.8309    0.1340       0.1788
#> 
#> Note: Use summary() for confidence intervals and exponentiated hazard ratios.

Weighting Methods

Same as surveff(): IPW, ATT, and OW are supported.

# IPW (targets ATE)
marCoxph(..., weight_method = "IPW")

# ATT targeting specific group
marCoxph(..., weight_method = "ATT", att_group = "treated")

# Overlap weights
marCoxph(..., weight_method = "OW")

Trimming Options

Propensity score trimming works identically to surveff():

# Symmetric trimming with default delta
marCoxph(..., weight_method = "IPW", trim = TRUE)

# Symmetric trimming with custom delta
marCoxph(..., weight_method = "IPW", trim = TRUE, delta = 0.1)

Variance Options

Two variance estimation methods:

"bootstrap" (default): Resamples the entire analysis pipeline (propensity scores, weights, and Cox model)
"robust": Uses sandwich variance from coxph() directly (faster, no resampling)

# Bootstrap variance (default)
marCoxph(..., variance_method = "bootstrap", B = 200)

# Robust sandwich variance (no bootstrap)
marCoxph(..., variance_method = "robust")

The boot_level argument ("full" or "strata") also applies here when using bootstrap variance, with the same interpretation as in surveff().

Summary Output

The summary() method provides detailed results with confidence intervals on both log and original scales:

summary(hr_multi, conf_level = 0.95)
#> 
#> === Marginal Cox Model Summary ===
#> 
#> Treatment variable: Z 
#> Treatment levels: A, B, C, D 
#> Reference group: A 
#> Number of groups: 4 
#> Sample size: 1000 complete cases
#> Estimand: ATE 
#> Variance method: bootstrap-full 
#> Bootstrap iterations: 50 
#> Confidence level: 0.95 
#> 
#> --- Sample Sizes in Cox Model ---
#> 
#>  treatment   n events
#>          A 244    128
#>          B 188    133
#>          C 208    132
#>          D 360    275
#> 
#> --- Log Hazard Ratios (log scale) ---
#> 
#>  treatment  logHR     SE 95%CI.lower 95%CI.upper
#>        Z:B 0.5371 0.1294      0.2835      0.7908
#>        Z:C 0.1766 0.1303     -0.0788      0.4320
#>        Z:D 0.8116 0.1106      0.5948      1.0285
#> 
#> --- Hazard Ratios (original scale) ---
#> 
#>  treatment     HR 95%CI.lower 95%CI.upper
#>        Z:B 1.7111      1.3277      2.2051
#>        Z:C 1.1932      0.9242      1.5403
#>        Z:D 2.2516      1.8126      2.7968

Output styles:

style = "prints" (default): Displays formatted tables with log hazard ratios and exponentiated hazard ratios
style = "returns": Returns a list containing:
- logHR, logHR_CI_lower, logHR_CI_upper: Log-scale estimates and CIs
- HR, HR_CI_lower, HR_CI_upper: Original-scale estimates and CIs
- SE: Standard errors for log-scale hazard ratio (from specified variance method)
- n_per_group, events_per_group: Sample sizes and event counts

summ_hr <- summary(hr_multi, style = "returns")
names(summ_hr)
#>  [1] "logHR"            "logHR_CI_lower"   "logHR_CI_upper"   "SE"              
#>  [5] "HR"               "HR_CI_lower"      "HR_CI_upper"      "variance_method" 
#>  [9] "conf_level"       "n_per_group"      "events_per_group"
summ_hr$HR  # Exponentiated hazard ratios
#>      Z:B      Z:C      Z:D 
#> 1.711070 1.193173 2.251570

5. `weightedKM()`: Kaplan-Meier or Cumulative Risk Curves with Propensity Score Weights

weightedKM() estimates the survival probabilities at each observed time points, using weighted Kaplan-Meier (KM) estimator with propensity score weights. It uses the classic weighted Greenwood variance formula. plot.weightedKM() creates weighted KM or Cumulative Risk (CR) curves for visualization.

Key Arguments

Data specification:

data: Data frame
treatment_var: Name of treatment variable (character string)
time_var: Name of time variable (character string)
event_var: Name of event indicator (character string, 1=event, 0=censored)

Weighting:

weight_method: “IPW”, “OW”, “ATT”, or “none” (classical unweighted KM)
ps_formula: Propensity score model (required unless weight_method = "none")
att_group: Target group for ATT (required if weight_method = "ATT")
trim: Logical, whether to apply symmetric trimming (default: FALSE)
delta: Threshold for symmetric trimming

Fit a `weightedKM()` Object

Without Propensity Score Weights

# Classical unweighted KM
km_result <- weightedKM(
  data = simdata_bin,
  treatment_var = "Z",
  time_var = "time",
  event_var = "event",
  weight_method = "none"
)

print(km_result)
#> 
#> ===============================================================================
#> Weighted Kaplan-Meier Survival Estimates
#> ===============================================================================
#> 
#> Weight Method: none 
#> Treatment Levels: A, B 
#> Number of Complete Cases: 1000 
#> Number of Unique Event Times: 666 
#> 
#> -------------------------------------------------------------------------------
#> Treatment Group: A 
#> -------------------------------------------------------------------------------
#>        time survival    se
#>  [1,] 0.049    0.998 0.002
#>  [2,] 0.066    0.995 0.003
#>  [3,] 0.188    0.995 0.003
#>  [4,] 0.312    0.995 0.003
#>  [5,] 0.320    0.995 0.003
#>  [6,] 0.396    0.993 0.004
#>  [7,] 0.500    0.993 0.004
#>  [8,] 0.520    0.993 0.004
#>  [9,] 0.534    0.993 0.004
#> [10,] 0.562    0.990 0.005
#> ... (656 more rows)
#> 
#> -------------------------------------------------------------------------------
#> Treatment Group: B 
#> -------------------------------------------------------------------------------
#>        time survival    se
#>  [1,] 0.049    1.000 0.000
#>  [2,] 0.066    1.000 0.000
#>  [3,] 0.188    0.998 0.002
#>  [4,] 0.312    0.997 0.002
#>  [5,] 0.320    0.995 0.003
#>  [6,] 0.396    0.995 0.003
#>  [7,] 0.500    0.993 0.003
#>  [8,] 0.520    0.991 0.004
#>  [9,] 0.534    0.990 0.004
#> [10,] 0.562    0.990 0.004
#> ... (656 more rows)
#> 
#> ===============================================================================
#> Use summary() for confidence intervals. Use plot() to visualize.
#> ===============================================================================

With Propensity Score Weights

# Overlap-weighted KM
km_ow <- weightedKM(
  data = simdata_bin,
  treatment_var = "Z",
  ps_formula = Z ~ X1 + X2 + X3 + B1 + B2,
  time_var = "time",
  event_var = "event",
  weight_method = "OW"
)

print(km_ow)
#> 
#> ===============================================================================
#> Weighted Kaplan-Meier Survival Estimates
#> ===============================================================================
#> 
#> Weight Method: OW 
#> Treatment Levels: A, B 
#> Number of Complete Cases: 1000 
#> Number of Unique Event Times: 666 
#> 
#> -------------------------------------------------------------------------------
#> Treatment Group: A 
#> -------------------------------------------------------------------------------
#>        time survival    se
#>  [1,] 0.049    0.998 0.003
#>  [2,] 0.066    0.996 0.004
#>  [3,] 0.188    0.996 0.004
#>  [4,] 0.312    0.996 0.004
#>  [5,] 0.320    0.996 0.004
#>  [6,] 0.396    0.993 0.005
#>  [7,] 0.500    0.993 0.005
#>  [8,] 0.520    0.993 0.005
#>  [9,] 0.534    0.993 0.005
#> [10,] 0.562    0.992 0.006
#> ... (656 more rows)
#> 
#> -------------------------------------------------------------------------------
#> Treatment Group: B 
#> -------------------------------------------------------------------------------
#>        time survival    se
#>  [1,] 0.049    1.000 0.000
#>  [2,] 0.066    1.000 0.000
#>  [3,] 0.188    0.997 0.003
#>  [4,] 0.312    0.995 0.005
#>  [5,] 0.320    0.994 0.005
#>  [6,] 0.396    0.994 0.005
#>  [7,] 0.500    0.992 0.006
#>  [8,] 0.520    0.989 0.007
#>  [9,] 0.534    0.988 0.007
#> [10,] 0.562    0.988 0.007
#> ... (656 more rows)
#> 
#> ===============================================================================
#> Use summary() for confidence intervals. Use plot() to visualize.
#> ===============================================================================

# IPW-weighted KM with trimming
km_ipw <- weightedKM(
  data = simdata_multi,
  treatment_var = "Z",
  ps_formula = Z ~ X1 + X2 + X3 + B1 + B2,
  time_var = "time",
  event_var = "event",
  weight_method = "IPW",
  trim = TRUE,
  delta = 0.1
)

print(km_ipw)
#> 
#> ===============================================================================
#> Weighted Kaplan-Meier Survival Estimates
#> ===============================================================================
#> 
#> Weight Method: IPW 
#> Trimming: Symmetric (delta = 0.1 )
#> Treatment Levels: A, B, C, D 
#> Number of Complete Cases: 1000 
#> Number of Unique Event Times: 667 
#> 
#> -------------------------------------------------------------------------------
#> Treatment Group: A 
#> -------------------------------------------------------------------------------
#>        time survival    se
#>  [1,] 0.036    1.000 0.000
#>  [2,] 0.051    0.996 0.002
#>  [3,] 0.070    0.992 0.003
#>  [4,] 0.125    0.992 0.003
#>  [5,] 0.243    0.992 0.003
#>  [6,] 0.284    0.992 0.003
#>  [7,] 0.287    0.992 0.003
#>  [8,] 0.290    0.992 0.003
#>  [9,] 0.301    0.992 0.003
#> [10,] 0.324    0.988 0.003
#> ... (657 more rows)
#> 
#> -------------------------------------------------------------------------------
#> Treatment Group: B 
#> -------------------------------------------------------------------------------
#>        time survival    se
#>  [1,] 0.036    0.994 0.002
#>  [2,] 0.051    0.994 0.002
#>  [3,] 0.070    0.994 0.002
#>  [4,] 0.125    0.994 0.002
#>  [5,] 0.243    0.994 0.002
#>  [6,] 0.284    0.994 0.002
#>  [7,] 0.287    0.988 0.003
#>  [8,] 0.290    0.983 0.004
#>  [9,] 0.301    0.983 0.004
#> [10,] 0.324    0.983 0.004
#> ... (657 more rows)
#> 
#> -------------------------------------------------------------------------------
#> Treatment Group: C 
#> -------------------------------------------------------------------------------
#>        time survival    se
#>  [1,] 0.036    1.000 0.000
#>  [2,] 0.051    1.000 0.000
#>  [3,] 0.070    1.000 0.000
#>  [4,] 0.125    1.000 0.000
#>  [5,] 0.243    1.000 0.000
#>  [6,] 0.284    0.995 0.002
#>  [7,] 0.287    0.995 0.002
#>  [8,] 0.290    0.995 0.002
#>  [9,] 0.301    0.992 0.003
#> [10,] 0.324    0.992 0.003
#> ... (657 more rows)
#> 
#> -------------------------------------------------------------------------------
#> Treatment Group: D 
#> -------------------------------------------------------------------------------
#>        time survival    se
#>  [1,] 0.036    1.000 0.000
#>  [2,] 0.051    1.000 0.000
#>  [3,] 0.070    1.000 0.000
#>  [4,] 0.125    0.997 0.002
#>  [5,] 0.243    0.994 0.002
#>  [6,] 0.284    0.994 0.002
#>  [7,] 0.287    0.994 0.002
#>  [8,] 0.290    0.994 0.002
#>  [9,] 0.301    0.994 0.002
#> [10,] 0.324    0.994 0.002
#> ... (657 more rows)
#> 
#> ===============================================================================
#> Use summary() for confidence intervals. Use plot() to visualize.
#> ===============================================================================

Creating Weighted KM or CR Curves

plot.weightedKM() function visualizes a weightedKM object by survival (aka. Kaplan-Meier) curves or cumulative risk (aka. cumulative incidence) curves. By default, it produces curves with 0.95 confidence band and a risk table for all groups, with each group indicated by a specific color. It allows users to flexibly adjust the confidence band, risk table, colors, and other aesthetic components.

Kaplan-Meier vs. CR

Specified by type: "Kaplan-Meier" for Kaplan-Meier and "CR" for cumulative risk. Cumulative risk is defined as 1-KM.

Confidence intervals

Plot survival curves or cumulative risk curves with or without confidence intervals:

# Survival curves with log CI
plot(km_ow, type = "Kaplan-Meier", conf_type = "log")

# Cumulative risk curves with log-log CI
plot(km_ow, type = "CR", conf_type = "log-log")
#> Warning: Removed 2 rows containing missing values or values outside the scale range
#> (`geom_ribbon()`).

With or Without confidence intervals: Specified by include_CI (default: TRUE).

Confidence interval types: Three methods available via conf_type:

"plain": Normal approximation (may exceed [0,1])
"log": Log transformation (guarantees bounds in (0,1))
"log-log": Complementary log-log transformation (best for extreme probabilities, default)

The confidence intervals and corresponding transformation are calculated and performed on the scale of \(\hat{S}(t)\) for Kaplan-Meier and of \(1-\hat{S}(t)\) for cumulative risk, where \(\hat{S}(t)\) is the estimated survival probability at time \(t\).

Risk Tables

Display weighted number at risk and cumulative events below the curves:

plot(km_ipw,
     risk_table = TRUE,
     risk_table_breaks = c(0, 5, 10, 15),
     risk_table_height = 0.3,
     risk_table_stats = c("n.risk", "n.acc.event"))
#> Warning: Removed 9 rows containing missing values or values outside the scale range
#> (`geom_ribbon()`).

Risk table customization:

risk_table: Logical, add risk table (default FALSE)
risk_table_stats: Statistics to display: “n.risk” (number at risk), “n.acc.event” (cumulative events)
risk_table_height: Relative height of risk table (0-1, default 0.25)
risk_table_breaks: Time points for columns (default: automatic)
risk_table_fontsize: Font size (default 3.5)

Summary Output

summary(km_ow, type = "Kaplan-Meier", conf_type = "log-log", print.rows = 5)
#> 
#> ===============================================================================
#> Summary of Weighted Kaplan-Meier Estimates
#> ===============================================================================
#> 
#> Type: Survival Probability 
#> Confidence Level: 95% 
#> Confidence Type: log-log 
#> Weight Method: OW 
#> 
#> -------------------------------------------------------------------------------
#> Treatment Group: A 
#> -------------------------------------------------------------------------------
#>       time estimate    se CI_lower CI_upper
#> [1,] 0.000    1.000 0.000    1.000    1.000
#> [2,] 0.049    0.998 0.003    1.000    0.965
#> [3,] 0.066    0.996 0.004    0.999    0.969
#> [4,] 0.188    0.996 0.004    0.999    0.969
#> [5,] 0.312    0.996 0.004    0.999    0.969
#> ... (662 more rows)
#> 
#> -------------------------------------------------------------------------------
#> Treatment Group: B 
#> -------------------------------------------------------------------------------
#>       time estimate    se CI_lower CI_upper
#> [1,] 0.000    1.000 0.000    1.000    1.000
#> [2,] 0.049    1.000 0.000       NA       NA
#> [3,] 0.066    1.000 0.000       NA       NA
#> [4,] 0.188    0.997 0.003    1.000    0.967
#> [5,] 0.312    0.995 0.005    0.999    0.969
#> ... (662 more rows)
#> 
#> ===============================================================================
#> Note: Full-precision results returned invisibly. Use print.rows to show more.
#> ===============================================================================

The summary returns a list of matrices (one per treatment group) with full-precision estimates, invisibly. Customize display with print.digits and print.rows.

Summary

Feature	`surveff()`	`marCoxph()`	`weightedKM()`
Output	Survival curves, survival differences	Hazard ratios	Weighted KM curves
Weighting	IPW, ATT, OW	IPW, ATT, OW	IPW, ATT, OW, none
Censoring adjustment	Weibull or Cox	None (standard Cox)	None
Variance	Analytical or bootstrap	Robust or bootstrap	Weighted Greenwood
Trimming	Symmetric	Symmetric	Symmetric
Multiple groups	Yes (with contrast_matrix)	Yes	Yes
Risk tables	No	No	Yes

References

Li, F., Morgan, K. L., & Zaslavsky, A. M. (2018). Balancing covariates via propensity score weighting. Journal of the American Statistical Association, 113(521), 390-400.

Li, F., & Li, F. (2019). Propensity score weighting for causal inference with multiple treatments. The Annals of Applied Statistics, 13(4), 2389-2415.

Cheng, C., Li, F., Thomas, L. E., & Li, F. (2022). Addressing extreme propensity scores in estimating counterfactual survival functions via the overlap weights. American Journal of Epidemiology, 191(6), 1140-1151.

PSsurvival: Propensity Score Weighting for Survival Analysis

1. Introduction

2. Data for Vignette

3. surveff(): Counterfactual Survival Probabilities and Differences

Key Arguments

Basic Usage: Binary Treatment

Multiple Treatment Groups

Weighting Methods

Propensity Score Trimming

Censoring Methods

Variance Estimation

Summary Output

Plotting Results

4. marCoxph(): Marginal Hazard Ratios by Cox-PH model

Key Arguments

Basic Usage: Binary Treatment

Multiple Treatment Groups

Weighting Methods

Trimming Options

Variance Options

Summary Output

5. weightedKM(): Kaplan-Meier or Cumulative Risk Curves with Propensity Score Weights

Key Arguments

Fit a weightedKM() Object

Without Propensity Score Weights

With Propensity Score Weights

Creating Weighted KM or CR Curves

Kaplan-Meier vs. CR

Confidence intervals

Risk Tables

Summary Output

Summary

References

3. `surveff()`: Counterfactual Survival Probabilities and Differences

4. `marCoxph()`: Marginal Hazard Ratios by Cox-PH model

5. `weightedKM()`: Kaplan-Meier or Cumulative Risk Curves with Propensity Score Weights

Fit a `weightedKM()` Object