% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/baseline_table.R
\name{get_var_types}
\alias{get_var_types}
\title{Get variable types for baseline table}
\usage{
get_var_types(
  data,
  strata = NULL,
  norm_test_by_group = TRUE,
  omit_factor_above = 20,
  num_to_factor = 5,
  save_qqplots = FALSE,
  folder_name = "qqplots"
)
}
\arguments{
\item{data}{A data frame.}

\item{strata}{A character string indicating the column name of the strata variable.}

\item{norm_test_by_group}{A logical value indicating whether to perform normality tests by group.}

\item{omit_factor_above}{An integer indicating the maximum number of levels for a variable to be
considered a factor.}

\item{num_to_factor}{An integer. Numeric variables with at most this many unique values are
treated as factors.}

\item{save_qqplots}{A logical value indicating whether to save QQ plots. Normality tests can be
unreliable for some variables, in which case the QQ plots can be used to check the distribution visually.}

\item{folder_name}{A character string indicating the folder name for saving QQ plots.}
}
\value{
An object of class \code{var_types}, which is a list containing the following elements:
\item{factor_vars}{A character vector of variables that are factors.}
\item{exact_vars}{A character vector of variables that require Fisher's exact test.}
\item{nonnormal_vars}{A character vector of variables that are nonnormal.}
\item{omit_vars}{A character vector of variables that are excluded from the baseline table.}
\item{strata}{A character vector of the strata variable.}
}
\description{
Automatic variable type and method determination for baseline table.
}
\note{
This function performs normality tests on the variables in the data frame to determine
whether they are normally distributed. The tests used are Shapiro-Wilk, Lilliefors, Anderson-Darling,
Jarque-Bera, and Shapiro-Francia. If at least two of these tests indicate that a variable
is nonnormal, it is classified as nonnormal. Because normality tests become overly sensitive
as the sample size grows, the alpha level is set by an empirical formula
that decreases with sample size.

This function also marks factor variables as requiring Fisher's exact test if any cell has an
expected frequency less than or equal to 5. Note that this criterion is less strict than the commonly
used one.
}
\examples{
data(cancer, package = "survival")
get_var_types(cancer, strata = "sex") # set save_qqplots = TRUE to check the QQ plots
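
# A rough sketch of the "at least two tests indicate nonnormality" rule from the note.
# This is an illustration only, not the package's implementation. Assumptions: a fixed
# alpha of 0.05 stands in for the sample-size-dependent alpha (its formula is not given
# here), "age" is an arbitrary example variable, and tests from the optional packages
# nortest and tseries are run only when those packages are installed.
x <- survival::cancer$age
x <- x[!is.na(x)]
p_values <- c(shapiro_wilk = stats::shapiro.test(x)$p.value)
if (requireNamespace("nortest", quietly = TRUE)) {
  p_values <- c(
    p_values,
    lilliefors = unname(nortest::lillie.test(x)$p.value),
    anderson_darling = unname(nortest::ad.test(x)$p.value),
    shapiro_francia = unname(nortest::sf.test(x)$p.value)
  )
}
if (requireNamespace("tseries", quietly = TRUE)) {
  p_values <- c(p_values, jarque_bera = unname(tseries::jarque.bera.test(x)$p.value))
}
# Count how many of the available tests reject normality at the assumed alpha.
looks_nonnormal <- sum(p_values < 0.05) >= 2
looks_nonnormal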

var_types <- get_var_types(cancer, strata = "sex")
# Suppose we want the variable "pat.karno" to be treated as normal.
var_types$nonnormal_vars <- setdiff(var_types$nonnormal_vars, "pat.karno")
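
# A rough sketch of the expected-frequency check from the note, using base R only.
# This is an illustration, not necessarily how get_var_types() implements it; the
# choice of "ph.ecog" as the example variable is an assumption made here.
tab <- table(survival::cancer$sex, survival::cancer$ph.ecog)
# chisq.test() may warn about small expected counts; only the expected table is needed.
expected_counts <- suppressWarnings(stats::chisq.test(tab)$expected)
# Flag the variable for Fisher's exact test if any expected cell count is 5 or less.
needs_exact <- any(expected_counts <= 5)
needs_exact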
}
