% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/kija_causal_forest_dynamic_subgroups.R
\name{CausalForestDynamicSubgroups}
\alias{CausalForestDynamicSubgroups}
\title{Calculate CATE in dynamically determined subgroups}
\usage{
CausalForestDynamicSubgroups(forest, n_rankings = 3, n_folds = 5, ...)
}
\arguments{
\item{forest}{An object of class \code{causal_forest}, as returned by
\link[grf]{causal_forest}().}

\item{n_rankings}{Integer scalar giving the number of groups to rank CATEs
into.}

\item{n_folds}{Integer scalar giving the number of folds to split the data
into.}

\item{...}{Additional arguments passed to \link[grf]{causal_forest}() and
\link[grf]{regression_forest}().}
}
\value{
A list with the following elements:
\itemize{
\item forest_subgroups: A tibble with CATE estimates, rankings, and AIPW
scores for each subject.
\item forest_rank_ate: A tibble with the ATE estimate and standard error of
each subgroup.
\item forest_rank_diff_test: A tibble with estimates of the difference in ATE
between subgroups and p-values for a formal test of no difference.
\item heatmap_data: A tibble with data used to draw a heatmap of covariate
distribution in each subgroup.
\item forest_rank_ate_plot: A ggplot of the ATE estimates in each subgroup.
\item heatmap: A ggplot heatmap of the covariate distribution in each
subgroup.
}
}
\description{
Determines subgroups ranked by CATE estimates from a \code{causal_forest}
object, then calculates comparable CATE estimates in each subgroup and tests
for differences between subgroups.
}
\details{
To evaluate heterogeneity in the treatment effect, one can split the data
into groups by estimated CATE (for an alternative, see also
\link[EpiForsk]{RATEOmnibusTest}). To compare estimates, one must use a
model that was not trained on the subjects being compared. To achieve
this, the data are partitioned into \code{n_folds} folds, and predictions
for each fold come from a causal forest trained with that fold left out. If
the data have no existing clustering, a single \link[grf]{causal_forest}()
is trained with the folds as the clustering structure. Predictions for each
fold then exclude all trees that used data from that fold. If the data are
already clustered, folds are sampled within each cluster and combined
across clusters afterwards.
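
As a minimal sketch of the unclustered case (not necessarily the exact
internals of this function), folds can be passed as the \code{clusters}
argument of \link[grf]{causal_forest}() and out-of-bag predictions requested
with \code{predict()}. The simulated data and the names \code{folds},
\code{cf_folds} and \code{tau_hat} are assumptions made for illustration.
\preformatted{
# Illustrative sketch only; all object names here are hypothetical.
n <- 500
n_folds <- 5
X <- matrix(rnorm(n * 3), n, 3)
W <- rbinom(n, 1, 0.5)
Y <- X[, 1] * W + rnorm(n)
# Randomly assign each subject to one of n_folds folds of near-equal size.
folds <- sample(rep(seq_len(n_folds), length.out = n))
# Use the folds as the clustering structure of a single causal forest.
cf_folds <- grf::causal_forest(X, Y, W, clusters = folds)
# Out-of-bag CATE predictions, one per subject.
tau_hat <- predict(cf_folds)$predictions
}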
}
\examples{
\donttest{
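# Simulate covariates, a randomised binary treatment and a binary outcome
n <- 800
p <- 3
X <- matrix(rnorm(n * p), n, p) |> as.data.frame()
W <- rbinom(n, 1, 0.5)
# Treatment only changes the event probability when X[, 1] is positive
event_prob <- 1 / (1 + exp(2 * (pmax(2 * X[, 1], 0) * W - X[, 2])))
Y <- rbinom(n, 1, event_prob)
# Fit a causal forest, then evaluate two CATE-ranked subgroups with four folds
cf <- grf::causal_forest(X, Y, W)
cf_ds <- CausalForestDynamicSubgroups(cf, n_rankings = 2, n_folds = 4)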
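# A short sketch of inspecting the result, assuming the returned list carries
# the element names documented under Value.
# ATE estimate and standard error in each subgroup
cf_ds$forest_rank_ate
# Estimated differences in ATE between subgroups, with p-values
cf_ds$forest_rank_diff_test
# Plot of the ATE estimates in each subgroup
cf_ds$forest_rank_ate_plot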
}

}
\author{
KIJA
}
