The forestploter package provides a flexible way to draw
forest plots. The layout of the plot is determined by the dataset, and
all plot elements are placed in cells, making it easy to edit any
element by specifying its row and column. The graphical parameters of
each element can be further customized within its respective cell. This
vignette will demonstrate how to create a simple forest plot.
The plotting steps demonstrated in this vignette may not be optimal. Other R packages may be better suited for the plots shown here. Please choose the one that best suits your needs. The final plot is shown below:
Forest plots are commonly used in medical research publications, especially in meta-analysis. They can also be used to report the coefficients and confidence intervals (CIs) of regression models.
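As a hedged illustration of the regression use case, the sketch below collects the coefficients and 95% CIs of a linear model into a data frame with `est`, `low`, and `hi` columns, the same shape used by the examples in this vignette. The model, the `mtcars` data, and the name `coef_dt` are assumptions chosen only for illustration.

```r
# Illustrative only: any fitted model with coef() and confint() methods would do
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Matrix with one row per term and the lower/upper bounds as columns
ci <- confint(fit, level = 0.95)

coef_dt <- data.frame(
  Term = names(coef(fit)),
  est  = unname(coef(fit)),
  low  = unname(ci[, 1]),
  hi   = unname(ci[, 2])
)

# Drop the intercept, which is rarely shown in a forest plot
coef_dt <- coef_dt[coef_dt$Term != "(Intercept)", ]
coef_dt
```

The resulting `est`, `low`, and `hi` columns can then play the role of the estimate and CI bounds in a forest plot.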
There are many packages available for drawing forest plots. The most popular one is forestplot. Other packages specialized for meta-analysis include meta, metafor, and rmeta. Some packages, like ggforestplot, use ggplot2 to draw forest plots, though ggforestplot is not yet available on CRAN.
The main difference between forestploter and other packages is that the layout of the forest plot is determined by the dataset provided. Please refer to the other vignette for instructions on changing text or background, adding or inserting text, adding borders to cells, and editing the color of the CI in specific cells.
The first step is to prepare a data.frame that will
serve as the basic layout of the forest plot. The column names of the
data will be drawn as the header, and the content will be displayed in
the forest plot body. One or more blank columns should be provided to
draw the confidence intervals (CIs). The width of the CI is
determined by the width of its corresponding column. To provide more
space for the CI, increase the number of spaces in the blank
column.
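A minimal sketch of how the blank column can be sized is shown here. The helper `blank_col` is hypothetical (it is not part of forestploter) and simply wraps the `paste(rep(" ", n), collapse = " ")` idiom used in the examples, so that a larger `n` yields a longer string and therefore a wider CI column.

```r
# Hypothetical helper (not part of forestploter): build a blank string
# from n spaces joined by single spaces
blank_col <- function(n) paste(rep(" ", n), collapse = " ")

narrow <- blank_col(10)
wide   <- blank_col(30)

# n spaces joined by n - 1 separators give 2 * n - 1 characters
nchar(narrow)
nchar(wide)
```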
First, we need to prepare the data for plotting.
library(grid)
library(forestploter)
# Read provided sample example data
dt <- read.csv(system.file("extdata", "example_data.csv", package = "forestploter"))
# Keep needed columns
dt <- dt[, 1:6]
# Indent the subgroup if there is a number in the placebo column
dt$Subgroup <- ifelse(is.na(dt$Placebo),
                      dt$Subgroup,
                      paste0(" ", dt$Subgroup))
# Replace NA with a blank string; otherwise NA is displayed as the text "NA"
dt$Treatment <- ifelse(is.na(dt$Treatment), "", dt$Treatment)
dt$Placebo <- ifelse(is.na(dt$Placebo), "", dt$Placebo)
# Standard error on the log scale, used later for the size of the boxes
dt$se <- (log(dt$hi) - log(dt$est)) / 1.96
# Add a blank column for the forest plot to display CI
# Adjust the column width with spaces; increase the number of spaces below
# to provide a larger area for drawing the CI
dt$` ` <- paste(rep(" ", 20), collapse = " ")
# Create a confidence interval column to display
dt$`HR (95% CI)` <- ifelse(is.na(dt$se), "",
                           sprintf("%.2f (%.2f to %.2f)",
                                   dt$est, dt$low, dt$hi))
head(dt)
#>       Subgroup Treatment Placebo      est        low       hi        se
#> 1 All Patients       781     780 1.869694 0.13245636 3.606932 0.3352463
#> 2          Sex                         NA         NA       NA        NA
#> 3         Male       535     548 1.449472 0.06834426 2.830600 0.3414741
#> 4       Female       246     232 2.275120 0.50768005 4.042560 0.2932884
#> 5          Age                         NA         NA       NA        NA
#> 6       <65 yr       297     333 1.509242 0.67029394 2.348190 0.2255292
#>                                                   HR (95% CI)
#> 1                                         1.87 (0.13 to 3.61)
#> 2
#> 3                                         1.45 (0.07 to 2.83)
#> 4                                         2.28 (0.51 to 4.04)
#> 5
#> 6                                         1.51 (0.67 to 2.35)
The data prepared above will serve as the basic layout of the forest plot. The example below demonstrates how to draw a simple forest plot, with a footnote added for demonstration.
p <- forest(dt[, c(1:3, 8:9)],
            est = dt$est,
            lower = dt$low,
            upper = dt$hi,
            sizes = dt$se,
            ci_column = 4,
            ref_line = 1,
            arrow_lab = c("Placebo Better", "Treatment Better"),
            xlim = c(0, 4),
            ticks_at = c(0.5, 1, 2, 3),
            footnote = "This is the demo data. Please feel free to change\nanything you want.")
# Print plot
plot(p)
We will now use the same data as above but add a summary point.
Additionally, we will change the graphical parameters for the confidence
interval and other parts of the plot. The theme of the forest plot can
be adjusted with the forest_theme function. Refer to the
manual for more details.
dt_tmp <- rbind(dt[-1, ], dt[1, ])
dt_tmp[nrow(dt_tmp), 1] <- "Overall"
dt_tmp <- dt_tmp[1:11, ]
# Define theme
tm <- forest_theme(base_size = 10,
                   # Confidence interval point shape, line type/color/width
                   ci_pch = 15,
                   ci_col = "#762a83",
                   ci_fill = "black",
                   ci_alpha = 0.8,
                   ci_lty = 1,
                   ci_lwd = 1.5,
                   ci_Theight = 0.2, # Set a T end at the end of CI
                   # Reference line width/type/color
                   refline_gp = gpar(lwd = 1, lty = "dashed", col = "grey20"),
                   # Vertical line width/type/color
                   vertline_lwd = 1,
                   vertline_lty = "dashed",
                   vertline_col = "grey20",
                   # Change summary color for filling and borders
                   summary_fill = "#4575b4",
                   summary_col = "#4575b4",
                   # Footnote font size/face/color
                   footnote_gp = gpar(cex = 0.6, fontface = "italic", col = "blue"))
pt <- forest(dt_tmp[, c(1:3, 8:9)],
             est = dt_tmp$est,
             lower = dt_tmp$low,
             upper = dt_tmp$hi,
             sizes = dt_tmp$se,
             is_summary = c(rep(FALSE, nrow(dt_tmp) - 1), TRUE),
             ci_column = 4,
             ref_line = 1,
             arrow_lab = c("Placebo Better", "Treatment Better"),
             xlim = c(0, 4),
             ticks_at = c(0.5, 1, 2, 3),
             footnote = "This is the demo data. Please feel free to change\nanything you want.",
             theme = tm)
# Print plot
plot(pt)
By default, all cells are left-aligned. However, it is possible to
justify any cell in the forest plot by setting parameters in
forest_theme. For example,
core = list(fg_params = list(hjust = 0, x = 0)) left-aligns
the content, while
rowhead = list(fg_params = list(hjust = 0.5, x = 0.5))
centers the header. To right-align text, set hjust = 1 and
x = 0.9. You can also change the text justification
with edit_plot, as detailed in another
vignette.
The same rule applies to changing the background color. This can be
done by setting
core = list(bg_params = list(fill = c("#edf8e9", "#c7e9c0", "#a1d99b"))).
Modify settings in core to change the graphical parameters
of the plot’s content, and use colhead for the header. To
modify the text, adjust the settings in fg_params (see
textGrob() in the grid package), and for the
background, change bg_params (see gpar() in
the grid package). Parameters should be passed as a list.
More details can be found here.
Provide a single value for uniform justification across all cells or a vector for varied justification. As shown in the second example, text is justified by row using the provided vector, which will be recycled as needed.
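The recycling can be pictured with a small base R sketch. It assumes that the values are recycled cell by cell, filling down each column of the table; this is an assumption that is consistent with the by-row behaviour described above, not a statement about the internal implementation. The names `n_row`, `n_col`, and `hjust_by_cell` are illustrative.

```r
n_row <- 4                      # rows in the table body
n_col <- 5                      # columns in the table body
hjust <- c(1, 0, 0, 0.5)        # one justification value per row

# Assumption: values are recycled over all cells, filling down each column
hjust_by_cell <- matrix(rep_len(hjust, n_row * n_col),
                        nrow = n_row, ncol = n_col)
hjust_by_cell
```

Under this assumption, a vector whose length equals the number of rows gives every cell in a row the same justification, while a vector of another length would shift the pattern from column to column.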
dt <- dt[1:4, ]
# Header center and content right
tm <- forest_theme(core = list(fg_params = list(hjust = 1, x = 0.9),
                               bg_params = list(fill = c("#edf8e9", "#c7e9c0", "#a1d99b"))),
                   colhead = list(fg_params = list(hjust = 0.5, x = 0.5)))
p <- forest(dt[, c(1:3, 8:9)],
            est = dt$est,
            lower = dt$low,
            upper = dt$hi,
            sizes = dt$se,
            ci_column = 4,
            title = "Header center content right",
            theme = tm)
# Print plot
plot(p)
# Mixed justification
tm <- forest_theme(core = list(fg_params = list(hjust = c(1, 0, 0, 0.5),
                                                x = c(0.9, 0.1, 0, 0.5)),
                               bg_params = list(fill = c("#f6eff7", "#d0d1e6", "#a6bddb", "#67a9cf"))),
                   colhead = list(fg_params = list(hjust = c(1, 0, 0, 0, 0.5),
                                                   x = c(0.9, 0.1, 0, 0, 0.5))))
p <- forest(dt[, c(1:3, 8:9)],
            est = dt$est,
            lower = dt$low,
            upper = dt$hi,
            sizes = dt$se,
            ci_column = 4,
            title = "Mixed justification",
            theme = tm)
plot(p)
Similar to text justification, you can parse text in any cell. However, parsing all text will remove blanks from the data, which will also affect the blank columns used for drawing the whiskers.
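The effect of parsing can be checked in base R before building the plot. The sketch below assumes that parsed labels go through something equivalent to `parse(text = ...)`; it shows why a blank string disappears and why ordinary text needs plotmath syntax such as `~` for a space.

```r
# A plotmath-style label parses into a single expression
ok <- parse(text = "Study ~1^a")
length(ok)

# A blank string parses to an empty expression, so the blanks are lost
blank <- parse(text = "     ")
length(blank)

# Ordinary text with a plain space is not valid R syntax and fails to parse
bad <- try(parse(text = "Study 1"), silent = TRUE)
inherits(bad, "try-error")
```

This is why it can be safer to restrict parsing to the cells that actually contain math expressions.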
# See ?plotmath for the supported math expressions.
dt <- data.frame(
  Study = c("Study ~1^a", "Study ~2^b", "NO[2]"),
  low = c(0.2, -0.03, 1.11),
  est = c(0.71, 0.35, 1.79),
  hi = c(1.22, 0.74, 2.47)
)
dt$SMD <- sprintf("%.2f (%.2f, %.2f)", dt$est, dt$low, dt$hi)
dt$` ` <- paste(rep(" ", 20), collapse = " ")
fig_dt <- dt[, c(1, 5:6)]
# Get a matrix of which row and columns to parse
parse_mat <- matrix(FALSE,
                    nrow = nrow(fig_dt),
                    ncol = ncol(fig_dt))
# Here we want to parse the first column only, you can amend this to whatever you want.
parse_mat[, 1] <- TRUE
# Remove the colhead setting if you don't want to parse the column head.
tm <- forest_theme(colhead = list(fg_params = list(parse = TRUE)),
                   core = list(fg_params = list(parse = parse_mat)))
p <- forest(fig_dt,
            est = dt$est,
            lower = dt$low,
            upper = dt$hi,
            ci_column = 3,
            theme = tm)
# Add customized footnote.
# Due to the limitation of the textGrob, passing a parsed text with linebreak
# has some issues. We use a different approach here.
txt <- "<sup>a</sup> This is study A<br><sup>b</sup> This is study B"
add_grob(p,
         row = 4,
         col = 1:2,
         order = "background",
         gb_fn = gridtext::richtext_grob,
         text = txt,
         gp = gpar(fontsize = 8),
         hjust = 0, vjust = 1, halign = 0, valign = 1,
         x = unit(0, "npc"), y = unit(1, "npc"))
You may want to have multiple CI columns, with each representing a
different outcome. To achieve this, provide a vector of column positions
where the CIs will be drawn. If the number of CI columns matches the
number of est values, one CI will be drawn in each
specified column. If there are fewer CI columns than est
values, the extra est values will be treated as a group and
drawn sequentially in the available CI columns. In this case, the group
number is determined by dividing the number of est values
by the number of ci_column, and multiple CIs will be drawn
in a single cell. As shown in the example below, the CIs are drawn in
columns 4 and 7, with the first and second elements of est,
lower, and upper corresponding to columns 4
and 7, respectively.
In an example with multiple groups, two or more CIs can be displayed
in one cell. The solution is to provide all values sequentially to
est, lower, and upper. This means
that the first n elements in est,
lower, and upper are treated as the same
group, and the same applies to the next n elements, where
n is determined by the number of ci_column. As
demonstrated in the example below, est_gp1 and
est_gp2 are drawn in columns 4 and 7 as group
1, while est_gp3 and est_gp4 are
drawn in the same columns as group 2.
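The assignment rule described above can be sketched in base R. The code below is only an illustration of the stated rule, not forestploter's internal code; `ci_column = c(4, 7)` is an illustrative value and the names `mapping`, `n_col`, and `n_group` are hypothetical.

```r
ci_column <- c(4, 7)
est_names <- c("est_gp1", "est_gp2", "est_gp3", "est_gp4")

n_col   <- length(ci_column)            # number of CI columns
n_group <- length(est_names) / n_col    # number of groups per cell

# The first n_col elements form group 1, the next n_col form group 2, ...
idx <- seq_along(est_names)
mapping <- data.frame(
  est    = est_names,
  column = ci_column[(idx - 1) %% n_col + 1],
  group  = (idx - 1) %/% n_col + 1
)
mapping
```

Reading the result row by row shows which column and which group each element of the `est` list ends up in.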
This is an example of multiple CI columns and groups:
dt <- read.csv(system.file("extdata", "example_data.csv", package = "forestploter"))
dt <- dt[1:7, ]
# Indent the subgroup if there is a number in the placebo column
dt$Subgroup <- ifelse(is.na(dt$Placebo),
                      dt$Subgroup,
                      paste0(" ", dt$Subgroup))
# Replace NA with blank or NA will be transformed to character
dt$n1 <- ifelse(is.na(dt$Treatment), "", dt$Treatment)
dt$n2 <- ifelse(is.na(dt$Placebo), "", dt$Placebo)
# Add two blank columns for CI
dt$`CVD outcome` <- paste(rep(" ", 20), collapse = " ")
dt$`COPD outcome` <- paste(rep(" ", 20), collapse = " ")
# Generate point estimation and 95% CI. Paste two CIs together and separate by line break.
dt$ci1 <- paste(sprintf("%.1f (%.1f, %.1f)", dt$est_gp1, dt$low_gp1, dt$hi_gp1),
                sprintf("%.1f (%.1f, %.1f)", dt$est_gp3, dt$low_gp3, dt$hi_gp3),
                sep = "\n")
dt$ci1[grepl("NA", dt$ci1)] <- "" # Any NA to blank
dt$ci2 <- paste(sprintf("%.1f (%.1f, %.1f)", dt$est_gp2, dt$low_gp2, dt$hi_gp2),
                sprintf("%.1f (%.1f, %.1f)", dt$est_gp4, dt$low_gp4, dt$hi_gp4),
                sep = "\n")
dt$ci2[grepl("NA", dt$ci2)] <- "" # Any NA to blank
# Set-up theme
tm <- forest_theme(base_size = 10,
                   refline_gp = gpar(lty = "solid"),
                   ci_pch = c(15, 18),
                   ci_col = c("#377eb8", "#4daf4a"),
                   footnote_gp = gpar(col = "blue"),
                   legend_name = "Group",
                   legend_value = c("Trt 1", "Trt 2"),
                   vertline_lty = c("dashed", "dotted"),
                   vertline_col = c("#d6604d", "#bababa"),
                   # Table cell padding, width 4 and heights 3
                   core = list(padding = unit(c(4, 3), "mm")))
p <- forest(dt[, c(1, 19, 23, 21, 20, 24, 22)],
            est = list(dt$est_gp1,
                       dt$est_gp2,
                       dt$est_gp3,
                       dt$est_gp4),
            lower = list(dt$low_gp1,
                         dt$low_gp2,
                         dt$low_gp3,
                         dt$low_gp4),
            upper = list(dt$hi_gp1,
                         dt$hi_gp2,
                         dt$hi_gp3,
                         dt$hi_gp4),
            ci_column = c(4, 7),
            ref_line = 1,
            vert_line = c(0.5, 2),
            nudge_y = 0.4,
            theme = tm)
plot(p)
It is clear that forest uses the provided data as the
skeleton for the forest plot. You can use your imagination to place any
content in a cell, including line breaks. Please refer to the other
vignette for instructions on how to modify text alignment.
When a forest plot has multiple columns, you may want to apply
different settings to each one. For example, different CI columns can
have distinct xlim, x-axis ticks, x-axis labels,
x_trans transformations, reference lines, vertical lines,
or arrow labels. This can be easily achieved by providing a list or a
vector. Use a list for xlim, vert_line,
arrow_lab, and ticks_at, and an atomic vector
for xlab, x_trans, and ref_line.
See the example below for a demonstration.
dt$`HR (95% CI)` <- ifelse(is.na(dt$est_gp1), "",
                           sprintf("%.2f (%.2f to %.2f)",
                                   dt$est_gp1, dt$low_gp1, dt$hi_gp1))
dt$`Beta (95% CI)` <- ifelse(is.na(dt$est_gp2), "",
                             sprintf("%.2f (%.2f to %.2f)",
                                     dt$est_gp2, dt$low_gp2, dt$hi_gp2))
tm <- forest_theme(arrow_type = "closed",
                   arrow_label_just = "end")
# Select columns by name so the two text columns created above are displayed
# next to their blank CI columns
p <- forest(dt[, c("Subgroup", "CVD outcome", "HR (95% CI)",
                   "COPD outcome", "Beta (95% CI)")],
            est = list(dt$est_gp1,
                       dt$est_gp2),
            lower = list(dt$low_gp1,
                         dt$low_gp2),
            upper = list(dt$hi_gp1,
                         dt$hi_gp2),
            ci_column = c(2, 4),
            ref_line = c(1, 0),
            vert_line = list(c(0.3, 1.4), c(0.6, 2)),
            x_trans = c("log", "none"),
            arrow_lab = list(c("L1", "R1"), c("L2", "R2")),
            xlim = list(c(0, 3), c(-1, 3)),
            ticks_at = list(c(0.1, 0.5, 1, 2.5), c(-1, 0, 2)),
            xlab = c("OR", "Beta"),
            nudge_y = 0.2,
            theme = tm)
plot(p)
It is possible to pass a custom CI drawing function to
forest. The fn_ci argument accepts a CI
drawing function for normal confidence intervals, while
fn_summary is used for summary CIs. Other parameters for
these functions can be passed via forest. If you need to
pass row values such as est and lower to these
functions, you must define the names of the parameters you have passed
in index_args. This is an advanced technique, and this
vignette does not cover how to create a CI drawing function. However,
you can find tutorials here
if you are interested. Below is an example of how to use a box plot CI
with the built-in make_boxplot function.
# Function to calculate Box plot values
box_func <- function(x){
  iqr <- IQR(x)
  q3 <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  c("min" = q3[1] - 1.5 * iqr, "q1" = q3[1], "med" = q3[2],
    "q3" = q3[3], "max" = q3[3] + 1.5 * iqr)
}
# Prepare data
val <- split(ToothGrowth$len, list(ToothGrowth$supp, ToothGrowth$dose))
val <- lapply(val, box_func)
dat <- do.call(rbind, val)
dat <- data.frame(Dose = row.names(dat),
                  dat, row.names = NULL)
dat$Box <- paste(rep(" ", 20), collapse = " ")
# Draw a single group box plot
tm <- forest_theme(ci_Theight = 0.2)
p <- forest(dat[, c(1, 7)],
            est = dat$med,
            lower = dat$min,
            upper = dat$max,
            # sizes = sizes,
            fn_ci = make_boxplot,
            ci_column = 2,
            lowhinge = dat$q1,
            uphinge = dat$q3,
            hinge_height = 0.2,
            # values of the lowhinge and uphinge will be used as row values
            index_args = c("lowhinge", "uphinge"),
            gp_box = gpar(fill = "black", alpha = 0.4),
            theme = tm
)
p
You can use either the base R method or the ggsave
function to save the plot. When using ggsave, be sure to
specify the plot parameter. The width and height should be
adjusted to achieve the desired output. Alternatively, you can set
autofit = TRUE in the print or
plot function to automatically fit the plot, though this
may result in a layout that is not as compact as desired.
# Base method
png("rplot.png", res = 300, width = 7.5, height = 7.5, units = "in")
# Call plot() explicitly so the figure is also drawn when the script is sourced
plot(p)
dev.off()
# ggsave function
ggplot2::ggsave(filename = "rplot.png", plot = p,
                dpi = 300,
                width = 7.5, height = 7.5, units = "in")
Alternatively, you can retrieve the width and height of the forest
plot using get_wh and use these dimensions when saving.
# Get width and height
p_wh <- get_wh(plot = p, unit = "in")
png("rplot.png", res = 300, width = p_wh[1], height = p_wh[2], units = "in")
# Call plot() explicitly so the figure is also drawn when the script is sourced
plot(p)
dev.off()
# Or get scale
get_scale <- function(plot,
                      width_wanted,
                      height_wanted,
                      unit = "in"){
  h <- convertHeight(sum(plot$heights), unit, TRUE)
  w <- convertWidth(sum(plot$widths), unit, TRUE)
  max(c(w / width_wanted, h / height_wanted))
}
p_sc <- get_scale(plot = p, width_wanted = 6, height_wanted = 4, unit = "in")
ggplot2::ggsave(filename = "rplot.png",
                plot = p,
                dpi = 300,
                width = 6,
                height = 4,
                units = "in",
                scale = p_sc)
Q: The whisker/CI plot area is too narrow. What should I do?
A: To widen the CI plot area, increase the number of blank spaces in the column where the CI is drawn. The first example in this vignette demonstrates how to do this.
Q: Can I modify the width and height of each row and column?
A: Yes. Although the data’s content determines the
initial dimensions of the rows and columns, you can modify them after
plotting. For details, see the discussion here.
You can also add padding to each cell by using
core = list(padding = unit(c(4, 3), "mm")) in
forest_theme.
Q: How should I use weights for sizes?
A: The forest function uses the
sizes argument as is, without any transformation. If you
need to weigh the sizes yourself, you can find some options discussed here.
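One possible option, offered only as a sketch, is to derive the sizes from inverse-variance weights and rescale them before passing them to `sizes`. The standard errors below are made-up values, and the square-root rescaling is an illustrative choice rather than a recommendation from the package.

```r
# Illustrative standard errors (made-up values)
se <- c(0.34, 0.29, 0.23, 0.41)

# Inverse-variance weights, a common choice in meta-analysis
w <- 1 / se^2

# Take the square root to dampen large differences, then rescale
# so that the largest value is 1
sizes <- sqrt(w / max(w))
round(sizes, 2)
```

The resulting vector has one value per row, with the most precise estimate receiving the largest size.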
Q: How can I create a grouped forest plot?
A: You can indicate group breaks by leaving a few
blank lines in your data. Alternatively, you can combine multiple forest
plots using arrangeGrob from the gridExtra
package or wrap_elements from patchwork.
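The blank-line approach can be sketched in base R as follows. The two small data frames use made-up values, and the names `dt_a`, `dt_b`, `spacer`, and `grouped` are illustrative; the spacer row follows the same pattern as the header rows in the example data, which have `NA` estimates.

```r
# Two illustrative groups (made-up values)
dt_a <- data.frame(Subgroup = c("Male", "Female"),
                   est = c(1.45, 2.28), low = c(0.07, 0.51), hi = c(2.83, 4.04))
dt_b <- data.frame(Subgroup = c("<65 yr", ">=65 yr"),
                   est = c(1.51, 1.20), low = c(0.67, 0.40), hi = c(2.35, 2.00))

# A spacer row: empty label and NA estimates
spacer <- data.frame(Subgroup = "", est = NA, low = NA, hi = NA)

# Stack the groups with the spacer row in between
grouped <- rbind(dt_a, spacer, dt_b)
grouped
```

The combined data frame can then be prepared and plotted in the same way as the example data in this vignette.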