Before running a power simulation, you need three nuisance parameters estimated from your pre-intervention time-series data:
| Parameter | Symbol | Meaning |
|---|---|---|
| Baseline | \(\beta_0\) | Mean outcome at the start of the pre-intervention period |
| Residual SD | \(\sigma\) | Outcome variability from one time point to the next |
| Autocorrelation | \(\rho\) | AR(1) correlation between consecutive observations |
A fourth quantity, trend_pre, checks whether the outcome
was stable before the intervention. It should be close to zero.
These parameters are not the intervention effect —
that is your clinical hypothesis (level_change), which must
be set separately based on clinical judgement or published evidence.
PITS provides individual functions for each parameter, useful when you want fine-grained control or wish to estimate only one quantity.
data("example_cfr_data")
outcome <- example_cfr_data$outcome
# Baseline: intercept of a linear trend fit at t = 1
b <- estimate_baseline(outcome)
cat("Baseline:", round(b, 3), "\n")
#> Baseline: 14.696
# Sigma: residual SD after detrending
s <- estimate_sigma(outcome)
cat("Sigma: ", round(s, 3), "\n")
#> Sigma: 1.514
# Rho: lag-1 autocorrelation of residuals
r <- estimate_rho(outcome)
cat("Rho: ", round(r, 3), "\n")
#> Rho: 0.371
# Pre-trend: slope per time unit (should be near zero)
t_pre <- estimate_trend(outcome)
cat("Trend: ", round(t_pre, 4), "per month\n")
#> Trend: 0.0019 per monthFor most use cases, estimate_its_params() is the most
convenient entry point. It accepts either a data frame or a plain
numeric vector.
# From a data frame with 'time' and 'outcome' columns
params <- estimate_its_params(example_cfr_data)
#>
#> ============================================================
#> PITS - ITS Parameter Estimation
#> ============================================================
#>
#> Estimation method: GLS-ML
#> Observations (n_pre): 24
#> Baseline: 14.5853
#> Sigma (residual SD): 1.4876 (10.2% of baseline)
#> Rho (AR1): 0.3832
#> Pre-trend (per unit): 0.0039
#>
#> ------------------------------------------------------------
#> Copy these into calculate_power():
#>
#> n_pre = 24
#> baseline = 14.5853
#> sigma = 1.4876
#> rho = 0.3832
#> level_change = ??? # YOUR clinical hypothesis
#> slope_change = 0 # set if testing slope
#>
#> ============================================================# Or from a plain numeric vector (time index auto-generated)
params2 <- estimate_its_params(example_cfr_data$outcome, verbose = FALSE)
identical(params$sigma, params2$sigma)
#> [1] TRUEThe returned list plugs directly into
calculate_power():
result <- calculate_power(
n_pre = params$n_pre,
n_post = 30,
baseline = params$baseline,
level_change = -3, # <-- your clinical hypothesis
sigma = params$sigma,
rho = params$rho
)If your data uses different column names, pass them explicitly:
# Rename for illustration
alt_data <- example_cfr_data
names(alt_data) <- c("month", "cfr_pct")
params_alt <- estimate_its_params(
alt_data,
outcome_col = "cfr_pct",
time_col = "month",
verbose = FALSE
)
cat("Sigma:", round(params_alt$sigma, 3), "\n")
#> Sigma: 1.488Always inspect the pre-intervention data before trusting parameter
estimates. diagnose_params() produces a 2×2 panel covering
the four key checks:
{r diag-full, fig.width = 8, fig.height = 6, fig.cap = "Diagnostic panel: (1) observed series with fitted trend — checks for outliers and non-linearity; (2) residuals over time — checks for heteroscedasticity; (3) Q-Q plot — checks for normality of residuals; (4) residual ACF — quantifies autocorrelation structure."} out <- diagnose_params(example_cfr_data)
What to look for:
If historical data are not available, use
simulate_predata() to generate a synthetic pre-intervention
series and explore how different parameter assumptions affect power.
Then use the literature or expert elicitation to justify your
choices.
# Simulate pre-intervention data with assumed parameters
pre_synthetic <- simulate_predata(
n = 24,
baseline = 15,
sigma = 2.0,
rho = 0.40,
seed = 42
)
head(pre_synthetic)
#> time outcome
#> 1 1 17.51301
#> 2 2 14.97009
#> 3 3 15.65366
#> 4 4 16.42152
#> 5 5 16.30964
#> 6 6 15.32933plot(pre_synthetic$time, pre_synthetic$outcome,
type = "o", pch = 16, col = "steelblue",
xlab = "Time", ylab = "Outcome",
main = "Synthetic pre-intervention data",
las = 1, bty = "l")
abline(h = 15, lty = 2, col = "grey50")
grid(NULL, NULL, lwd = 0.6, col = rgb(0, 0, 0, 0.1))You can then recover the estimated parameters from this synthetic series to verify your simulation is self-consistent:
recovered <- estimate_its_params(pre_synthetic, verbose = FALSE)
cat(sprintf("Target → baseline: 15.00 sigma: 2.000 rho: 0.400\n"))
#> Target → baseline: 15.00 sigma: 2.000 rho: 0.400
cat(sprintf("Recovered → baseline: %.2f sigma: %.3f rho: %.3f\n",
recovered$baseline, recovered$sigma, recovered$rho))
#> Recovered → baseline: 17.52 sigma: 2.246 rho: 0.279Small discrepancies are expected from a single 24-point realisation. Longer pre-periods (n = 48–60) yield more reliable estimates.
estimate_its_params() accepts date columns and converts
them to a sequential integer index automatically:
Missing values in the outcome are removed with a warning before fitting:
Fewer than 12 observations produces a warning. Estimates become increasingly unreliable as the pre-period shrinks, and are essentially meaningless below n = 8.
params_short <- estimate_its_params(
example_cfr_data$outcome[1:9],
verbose = FALSE
)
#> Warning in estimate_its_params(example_cfr_data$outcome[1:9], verbose = FALSE):
#> Only 9 observations available. Estimates will be unreliable; aim for >= 24.If you genuinely have fewer than 12 pre-intervention observations, consider whether the ITS design is appropriate, or whether a controlled ITS (with a comparison series) would be more robust.
As a reference when pre-intervention data are unavailable:
| Parameter | Typical range | Notes |
|---|---|---|
sigma |
10–20% of baseline | Higher for small hospitals or rare outcomes |
rho (monthly) |
0.30–0.50 | Use 0.40 as a conservative default |
rho (weekly) |
0.50–0.80 | Consider aggregating to monthly |
rho (daily) |
0.70–0.90 | Strong case for monthly aggregation |