rtables models both the row- and column-structure of a
table as trees. These trees collectively reflect the layout instructions
used to declare the table’s structure. We can use this to describe
locations within the row-, column-, or cell-space of a table in a
semantically self-describing way. We call these semantically meaningful
locations paths.
A path is an ordered sequence of names which declares both a route
for traversing the tree structure of the relevant dimension and,
consequently, a corresponding subset of the table in that dimension.
Column paths may contain only split names and names of facets
generated from those splits. Row paths can additionally contain names
of tables corresponding to analysis calls, the "@content" directive
(which steps from a facet into the table generated by
summarize_row_groups containing its marginal summary row(s)), and
names of individual rows. The location of individual cells or
rectangular groups of cells is then defined by a row-path, column-path
pair.
As of rtables version 0.6.13, any structural element of a successfully
built table (see footnote 1) is guaranteed to correspond to a unique
row path, column path, or combination thereof.
Consider a table with non-trivial structure in both the column and row dimensions:
library(rtables)
keep_rc <- c("ASIAN", "WHITE") ## chosen for brevity
afun <- function(x) {
  list(
    Mean = rcell(mean(x), format = "xx.x"),
    Median = rcell(median(x), format = "xx.x")
  )
}
lyt <- basic_table() |>
  split_cols_by("ARM", split_fun = keep_split_levels(c("A: Drug X", "C: Combination"))) |>
  split_cols_by("SEX", split_fun = keep_split_levels(c("F", "M"))) |>
  add_overall_col("All") |>
  split_rows_by("RACE", split_fun = keep_split_levels(keep_rc)) |>
  summarize_row_groups() |>
  split_rows_by("STRATA1") |>
  summarize_row_groups() |>
  analyze("AGE", afun = afun) |>
  analyze("BMRKR1", nested = FALSE, show_labels = "visible")
tbl <- build_table(lyt, DM)
tbl
## A: Drug X C: Combination
## F M F M All
## ————————————————————————————————————————————————————————————————————————————
## ASIAN 44 (62.9%) 35 (68.6%) 40 (65.6%) 44 (64.7%) 231 (64.9%)
## A 15 (21.4%) 12 (23.5%) 15 (24.6%) 16 (23.5%) 78 (21.9%)
## Mean 30.4 34.4 37.4 36.2 34.5
## Median 30.0 33.5 38.0 33.0 33.0
## B 16 (22.9%) 8 (15.7%) 10 (16.4%) 12 (17.6%) 75 (21.1%)
## Mean 33.8 34.9 33.3 35.9 33.3
## Median 32.0 34.5 34.0 34.5 33.0
## C 13 (18.6%) 15 (29.4%) 15 (24.6%) 16 (23.5%) 78 (21.9%)
## Mean 36.9 35.6 33.5 31.4 33.9
## Median 36.0 37.0 34.0 31.0 34.0
## WHITE 8 (11.4%) 6 (11.8%) 8 (13.1%) 10 (14.7%) 46 (12.9%)
## A 2 (2.9%) 1 (2.0%) 1 (1.6%) 5 (7.4%) 15 (4.2%)
## Mean 34.0 45.0 35.0 32.8 33.3
## Median 34.0 45.0 35.0 31.0 31.0
## B 4 (5.7%) 3 (5.9%) 3 (4.9%) 1 (1.5%) 16 (4.5%)
## Mean 37.0 43.7 34.3 36.0 38.3
## Median 38.5 44.0 35.0 36.0 38.0
## C 2 (2.9%) 2 (3.9%) 4 (6.6%) 4 (5.9%) 15 (4.2%)
## Mean 35.5 44.0 38.5 35.0 39.1
## Median 35.5 44.0 38.5 32.5 40.0
## BMRKR1
## Mean 6.06 5.42 5.83 5.57 5.85
We can get a first look at the row- and column-structure of our table
(albeit in different formats) with table_structure and col_info.
## [TableTree] root
## [TableTree] RACE
## [TableTree] ASIAN [cont: 1 x 5]
## [TableTree] STRATA1
## [TableTree] A [cont: 1 x 5]
## [ElementaryTable] AGE (2 x 5)
## [TableTree] B [cont: 1 x 5]
## [ElementaryTable] AGE (2 x 5)
## [TableTree] C [cont: 1 x 5]
## [ElementaryTable] AGE (2 x 5)
## [TableTree] WHITE [cont: 1 x 5]
## [TableTree] STRATA1
## [TableTree] A [cont: 1 x 5]
## [ElementaryTable] AGE (2 x 5)
## [TableTree] B [cont: 1 x 5]
## [ElementaryTable] AGE (2 x 5)
## [TableTree] C [cont: 1 x 5]
## [ElementaryTable] AGE (2 x 5)
## [ElementaryTable] BMRKR1 (1 x 5)
## An InstantiatedColumnInfo object
## Columns:
## A: Drug X (ARM) -> F (SEX)
## A: Drug X (ARM) -> M (SEX)
## C: Combination (ARM) -> F (SEX)
## C: Combination (ARM) -> M (SEX)
## All (all)
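The two summaries above can be requested directly from a built table. The following is a minimal, self-contained sketch on a smaller stand-in table; `small_tbl` and `mean_afun` are illustrative names that do not appear in the original text, and the layout is deliberately simpler than the one used for `tbl`.

```r
library(rtables)

# Small stand-in table (not the vignette's tbl): arms as columns, strata as rows
mean_afun <- function(x) in_rows(Mean = rcell(mean(x), format = "xx.x"))
lyt <- basic_table() |>
  split_cols_by("ARM") |>
  split_rows_by("STRATA1") |>
  summarize_row_groups() |>
  analyze("AGE", afun = mean_afun)
small_tbl <- build_table(lyt, DM)

# Row-space tree: one node per subtable, with content tables flagged
table_structure(small_tbl)

# Column-space description: one entry per leaf column
col_info(small_tbl)
```

Calling the same two functions on `tbl` should give summaries in the format shown above.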
We can use paths to declare intuitive substructures of our table. We
illustrate this using [, which interprets character vector indices as
individual paths in the respective dimension.
Note that while we will use these paths to subset our table for illustrative purposes, they are more often used to specify where something should happen within the larger table, which we discuss in the following section.
Row paths, in isolation, describe horizontal slices of our table. We
can see all the valid row paths (including an optional “root” first
element, which is technically correct but not necessary to include) via
row_paths_summary.
## rowname node_class path
## ——————————————————————————————————————————————————————————————————————
## ASIAN ContentRow root, RACE, ASIAN, @content, ASIAN
## A ContentRow root, RACE, ASIAN, STRATA1, A, @content, A
## Mean DataRow root, RACE, ASIAN, STRATA1, A, AGE, Mean
## Median DataRow root, RACE, ASIAN, STRATA1, A, AGE, Median
## B ContentRow root, RACE, ASIAN, STRATA1, B, @content, B
## Mean DataRow root, RACE, ASIAN, STRATA1, B, AGE, Mean
## Median DataRow root, RACE, ASIAN, STRATA1, B, AGE, Median
## C ContentRow root, RACE, ASIAN, STRATA1, C, @content, C
## Mean DataRow root, RACE, ASIAN, STRATA1, C, AGE, Mean
## Median DataRow root, RACE, ASIAN, STRATA1, C, AGE, Median
## WHITE ContentRow root, RACE, WHITE, @content, WHITE
## A ContentRow root, RACE, WHITE, STRATA1, A, @content, A
## Mean DataRow root, RACE, WHITE, STRATA1, A, AGE, Mean
## Median DataRow root, RACE, WHITE, STRATA1, A, AGE, Median
## B ContentRow root, RACE, WHITE, STRATA1, B, @content, B
## Mean DataRow root, RACE, WHITE, STRATA1, B, AGE, Mean
## Median DataRow root, RACE, WHITE, STRATA1, B, AGE, Median
## C ContentRow root, RACE, WHITE, STRATA1, C, @content, C
## Mean DataRow root, RACE, WHITE, STRATA1, C, AGE, Mean
## Median DataRow root, RACE, WHITE, STRATA1, C, AGE, Median
## BMRKR1 LabelRow root, BMRKR1
## Mean DataRow root, BMRKR1, Mean
In addition to displaying a nicely formatted summary, this returns a
data.frame containing the same information in a programmatically
accessible form. In particular, path is a list-valued column whose
values can be used directly as row paths:
## label indent node_class path
## 1 ASIAN 0 ContentRow root, RA....
## 2 A 1 ContentRow root, RA....
## 3 Mean 2 DataRow root, RA....
## 4 Median 2 DataRow root, RA....
## 5 B 1 ContentRow root, RA....
## 6 Mean 2 DataRow root, RA....
## A: Drug X C: Combination
## F M F M All
## ——————————————————————————————————————————————
## Mean 33.8 34.9 33.3 35.9 33.3
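As a hedged illustration of using the returned data.frame programmatically, the sketch below picks the path of the first data row and feeds it straight back into `[`. The table `small_tbl` and the helper `mean_afun` are illustrative stand-ins, not objects from the original text.

```r
library(rtables)

# Small stand-in table (not the vignette's tbl)
mean_afun <- function(x) in_rows(Mean = rcell(mean(x), format = "xx.x"))
lyt <- basic_table() |>
  split_cols_by("ARM") |>
  split_rows_by("STRATA1") |>
  summarize_row_groups() |>
  analyze("AGE", afun = mean_afun)
small_tbl <- build_table(lyt, DM)

# The summary is printed and also returned as a data.frame
rps <- row_paths_summary(small_tbl)
head(rps)

# Each element of the list-valued path column is a character vector
i <- which(rps$node_class == "DataRow")[1]
one_path <- rps$path[[i]]
one_path

# Use that path as a row index
one_row <- small_tbl[one_path, ]
one_row
```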
The c("RACE", "ASIAN") row path refers to the horizontal slice of our
table containing all rows that represent analysis on Asian patients. We
see that we get the expected subtable:
## A: Drug X C: Combination
## F M F M All
## ————————————————————————————————————————————————————————————————————————————
## ASIAN 44 (62.9%) 35 (68.6%) 40 (65.6%) 44 (64.7%) 231 (64.9%)
## A 15 (21.4%) 12 (23.5%) 15 (24.6%) 16 (23.5%) 78 (21.9%)
## Mean 30.4 34.4 37.4 36.2 34.5
## Median 30.0 33.5 38.0 33.0 33.0
## B 16 (22.9%) 8 (15.7%) 10 (16.4%) 12 (17.6%) 75 (21.1%)
## Mean 33.8 34.9 33.3 35.9 33.3
## Median 32.0 34.5 34.0 34.5 33.0
## C 13 (18.6%) 15 (29.4%) 15 (24.6%) 16 (23.5%) 78 (21.9%)
## Mean 36.9 35.6 33.5 31.4 33.9
## Median 36.0 37.0 34.0 31.0 34.0
Similarly, we can get the group summary and analysis rows for stratum B of WHITE patients:
## A: Drug X C: Combination
## F M F M All
## ————————————————————————————————————————————————————————————————
## B 4 (5.7%) 3 (5.9%) 3 (4.9%) 1 (1.5%) 16 (4.5%)
## Mean 37.0 43.7 34.3 36.0 38.3
## Median 38.5 44.0 35.0 36.0 38.0
Notice this is a strict subtable in the structural sense, which means
we do not get the race-level group summary here. We can see this in the
structure of the result, where the WHITE node no longer carries a
content table and only the "B" facet remains beneath STRATA1:
## [TableTree] root
## [TableTree] RACE
## [TableTree] WHITE
## [TableTree] STRATA1
## [TableTree] B [cont: 1 x 5]
## [ElementaryTable] AGE (2 x 5)
As mentioned above, to “path into” a group summary we use the
"@content" directive:
## A: Drug X C: Combination
## F M F M All
## ———————————————————————————————————————————————————————————————————————
## ASIAN 44 (62.9%) 35 (68.6%) 40 (65.6%) 44 (64.7%) 231 (64.9%)
We can path down to analysis tables and then to individual rows via their names, which, unlike those of other structures, tend to be identical to their labels:
## A: Drug X C: Combination
## F M F M All
## ————————————————————————————————————————————————
## Mean 37.0 43.7 34.3 36.0 38.3
## Median 38.5 44.0 35.0 36.0 38.0
## A: Drug X C: Combination
## F M F M All
## ————————————————————————————————————————————————
## Median 38.5 44.0 35.0 36.0 38.0
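The row-path forms discussed above (a facet, its "@content" summary, and an individual row) can be compared side by side. This is a sketch on a smaller stand-in table; `small_tbl` and `mean_afun` are illustrative names, and the paths are the ones implied by this simpler layout rather than by `tbl`.

```r
library(rtables)

# Small stand-in table (not the vignette's tbl)
mean_afun <- function(x) in_rows(Mean = rcell(mean(x), format = "xx.x"))
lyt <- basic_table() |>
  split_cols_by("ARM") |>
  split_rows_by("STRATA1") |>
  summarize_row_groups() |>
  analyze("AGE", afun = mean_afun)
small_tbl <- build_table(lyt, DM)

# Whole facet: group summary row plus the analysis row beneath it
facet_b <- small_tbl[c("STRATA1", "B"), ]

# Only the group summary generated by summarize_row_groups()
summary_b <- small_tbl[c("STRATA1", "B", "@content"), ]

# A single analysis row: facet -> analysis table -> row name
mean_b <- small_tbl[c("STRATA1", "B", "AGE", "Mean"), ]

facet_b
summary_b
mean_b
```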
Similar to row paths, we can get information about column paths via
col_paths_summary:
## label path
## —————————————————————————————————————————————
## A: Drug X ARM, A: Drug X
## F ARM, A: Drug X, SEX, F
## M ARM, A: Drug X, SEX, M
## C: Combination ARM, C: Combination
## F ARM, C: Combination, SEX, F
## M ARM, C: Combination, SEX, M
## All All, All
We can then describe vertical slices of our table via these paths (we
use head, which subsets via absolute position, to limit the amount of
output here):
## A: Drug X
## F M
## ————————————————————————————————————
## ASIAN 44 (62.9%) 35 (68.6%)
## A 15 (21.4%) 12 (23.5%)
## Mean 30.4 34.4
## Median 30.0 33.5
## B 16 (22.9%) 8 (15.7%)
## Mean 33.8 34.9
## C: Combination
## M
## ———————————————————————————
## ASIAN 44 (64.7%)
## A 16 (23.5%)
## Mean 36.2
## Median 33.0
## B 12 (17.6%)
## Mean 35.9
## All
## ————————————————————————
## ASIAN 231 (64.9%)
## A 78 (21.9%)
## Mean 34.5
## Median 33.0
## B 75 (21.1%)
## Mean 33.3
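A minimal sketch of column-path subsetting follows, again on a smaller stand-in table (`small_tbl` and `mean_afun` are illustrative names, not from the original text). The column path goes in the second index position of `[`.

```r
library(rtables)

# Small stand-in table (not the vignette's tbl)
mean_afun <- function(x) in_rows(Mean = rcell(mean(x), format = "xx.x"))
lyt <- basic_table() |>
  split_cols_by("ARM") |>
  split_rows_by("STRATA1") |>
  summarize_row_groups() |>
  analyze("AGE", afun = mean_afun)
small_tbl <- build_table(lyt, DM)

# List the valid column paths
col_paths_summary(small_tbl)

# Vertical slice: all rows, only the column(s) under one arm
combo <- small_tbl[, c("ARM", "C: Combination")]

# head() limits the rows by absolute position
combo_top <- head(combo, 2)
combo_top
```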
Note that despite being displayed next to each other, the last two
columns of our table have fundamentally different paths. This is
because add_overall_col adds an additional non-nested split rather than
adding an implicit combination level to the ARM split.
As of rtables 0.6.13, rtables enforces uniqueness of names within
groups of direct sibling structures in both row and column (see
footnote 2) space, thus guaranteeing unique paths to every substructure
in the table.
In row space, it does this by appending [k] to the names of elements
which would otherwise have an identical name to a previous sibling,
where k is a sequence of integers such that all siblings have unique
names. This affects the paths to these substructures (see footnote 3),
as we see from the informative messages below:
lytdup <- basic_table() |>
  analyze("STRATA1") |>
  split_rows_by("STRATA1") |>
  analyze("AGE")
tbldup <- build_table(lytdup, DM)
## Modifying subtable (or row) names to ensure uniqueness among direct siblings
## [STRATA1 -> { STRATA1, STRATA1[2] }]
## To control table names use split_rows_by*(, parent_name =.) or analyze(., table_names = .) when analyzing a single variable, or analyze(., parent_name = .) when analyzing multiple variables in a single call.FALSE
## all obs
## ————————————————
## A 114
## B 119
## C 123
## A
## Mean 33.74
## B
## Mean 34.10
## C
## Mean 34.79
## rowname node_class path
## ———————————————————————————————————————————————————————
## A DataRow root, STRATA1, A
## B DataRow root, STRATA1, B
## C DataRow root, STRATA1, C
## A LabelRow root, STRATA1[2], A
## Mean DataRow root, STRATA1[2], A, AGE, Mean
## B LabelRow root, STRATA1[2], B
## Mean DataRow root, STRATA1[2], B, AGE, Mean
## C LabelRow root, STRATA1[2], C
## Mean DataRow root, STRATA1[2], C, AGE, Mean
This allows us to path to all elements of the row structure, which
was not possible in previous (< 0.6.13) rtables versions:
## all obs
## ———————————
## A 114
## all obs
## ————————————————
## A
## Mean 33.74
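The two subtables above correspond to pathing into each of the two sibling structures by their (now distinct) names. The sketch below assumes rtables 0.6.13 or later, where the second sibling is renamed to "STRATA1[2]" as reported in the message; `first_a` and `second_a` are illustrative names.

```r
library(rtables)

# Same layout as lytdup above: an analysis and a split that share a name
lytdup <- basic_table() |>
  analyze("STRATA1") |>
  split_rows_by("STRATA1") |>
  analyze("AGE")
tbldup <- build_table(lytdup, DM)

# Row "A" inside the STRATA1 analysis table
first_a <- tbldup[c("STRATA1", "A"), ]

# Facet "A" inside the renamed STRATA1 split (assumes >= 0.6.13 naming)
second_a <- tbldup[c("STRATA1[2]", "A"), ]

first_a
second_a
```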
Many, though not all, rtables functions which accept row or column
paths support the "*" path wildcard. Where supported, the wildcard will
match any name present at that step in the table structure, leading to
(potentially) multiple matches. Note that "*" will never behave as the
"@content" directive, which must always be used explicitly.
## A: Drug X C: Combination
## F M F M All
## ————————————————————————————————————————————————
## Median 32.0 34.5 34.0 34.5 33.0
## Median 38.5 44.0 35.0 36.0 38.0
Multiple wildcards can appear in a path, with each wildcard applied recursively within the full combined set of matches from all wildcards earlier in the path.
## A: Drug X C: Combination
## F M F M All
## ————————————————————————————————————————————————
## Median 30.0 33.5 38.0 33.0 33.0
## Median 32.0 34.5 34.0 34.5 33.0
## Median 36.0 37.0 34.0 31.0 34.0
## Median 34.0 45.0 35.0 31.0 31.0
## Median 38.5 44.0 35.0 36.0 38.0
## Median 35.5 44.0 38.5 32.5 40.0
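As a hedged sketch of the wildcard behavior, the following uses a single "*" step in a row path on a smaller stand-in table. It assumes an rtables version in which `[` supports wildcards (0.6.13 according to the text); `small_tbl` and `mean_afun` are illustrative names.

```r
library(rtables)

# Small stand-in table (not the vignette's tbl)
mean_afun <- function(x) in_rows(Mean = rcell(mean(x), format = "xx.x"))
lyt <- basic_table() |>
  split_cols_by("ARM") |>
  split_rows_by("STRATA1") |>
  summarize_row_groups() |>
  analyze("AGE", afun = mean_afun)
small_tbl <- build_table(lyt, DM)

# "*" stands in for every facet of the STRATA1 split (A, B and C)
all_means <- small_tbl[c("STRATA1", "*", "AGE", "Mean"), ]
all_means
```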
Note that while the [ method does support wildcards, we are only using
it to illustrate the behavior, as the tables resulting from using
wildcard paths with [ are generally not going to be very useful.
For (currently only) row paths, we can resolve a path with one or
more wildcards into the set of fully specified paths in our table that
match it using the tt_normalize_row_path utility function:
## $ASIAN.A
## [1] "RACE" "ASIAN" "STRATA1" "A" "AGE" "Median"
##
## $ASIAN.B
## [1] "RACE" "ASIAN" "STRATA1" "B" "AGE" "Median"
##
## $ASIAN.C
## [1] "RACE" "ASIAN" "STRATA1" "C" "AGE" "Median"
##
## $WHITE.A
## [1] "RACE" "WHITE" "STRATA1" "A" "AGE" "Median"
##
## $WHITE.B
## [1] "RACE" "WHITE" "STRATA1" "B" "AGE" "Median"
##
## $WHITE.C
## [1] "RACE" "WHITE" "STRATA1" "C" "AGE" "Median"
We can also test whether a row path (including those containing
wildcards) exists in our table with tt_row_path_exists:
## [1] TRUE
## [1] FALSE
Note also that each "*" wildcard will only match a single step; there
is not currently a directive that searches for a match anywhere in the
relevant (sub)structure.
Thus we get
## $BMRKR1
## [1] "BMRKR1" "Mean"
despite there being other "Mean" elements elsewhere in our row
structure.
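The two row-path utilities discussed above can be sketched as follows. This assumes rtables 0.6.13 or later and that both functions take the table followed by the path, which is how they are described here; `small_tbl` and `mean_afun` are illustrative names.

```r
library(rtables)

# Small stand-in table (not the vignette's tbl)
mean_afun <- function(x) in_rows(Mean = rcell(mean(x), format = "xx.x"))
lyt <- basic_table() |>
  split_cols_by("ARM") |>
  split_rows_by("STRATA1") |>
  summarize_row_groups() |>
  analyze("AGE", afun = mean_afun)
small_tbl <- build_table(lyt, DM)

wild_path <- c("STRATA1", "*", "AGE", "Mean")

# Expand the wildcard into one fully specified path per matching facet
resolved <- tt_normalize_row_path(small_tbl, wild_path)
resolved

# Existence checks: there is a Mean row, but no Median row, in this table
has_mean <- tt_row_path_exists(small_tbl, wild_path)
has_median <- tt_row_path_exists(small_tbl, c("STRATA1", "*", "AGE", "Median"))
```

Resolving a wildcard path first and then looping over the fully specified results is one way to apply an operation that does not itself accept wildcards.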
Though the above utilities don’t currently exist for column paths (which are implemented differently in ways not relevant to end users), generally those mechanisms which support wildcards in row space and also accept a column path support wildcards for column paths as well:
## A: Drug X C: Combination
## F F
## ————————————————————————————————————————
## ASIAN 44 (62.9%) 40 (65.6%)
## A 15 (21.4%) 15 (24.6%)
## Mean 30.4 37.4
## Median 30.0 38.0
## B 16 (22.9%) 10 (16.4%)
## Mean 33.8 33.3
## Median 32.0 34.0
## C 13 (18.6%) 15 (24.6%)
## Mean 36.9 33.5
## Median 36.0 34.0
## WHITE 8 (11.4%) 8 (13.1%)
## A 2 (2.9%) 1 (1.6%)
## Mean 34.0 35.0
## Median 34.0 35.0
## B 4 (5.7%) 3 (4.9%)
## Mean 37.0 34.3
## Median 38.5 35.0
## C 2 (2.9%) 4 (6.6%)
## Mean 35.5 38.5
## Median 35.5 38.5
## BMRKR1
## Mean 6.06 5.83
In addition to subsetting via paths, which as we mentioned is likely to be of limited utility, many aspects of a table can be selectively inspected or changed using paths.
We will explore some of these throughout this section.
We can set the column count visibility for a set of sibling facets (though we cannot currently get it, an oversight that will likely be remedied in a future version).
## A: Drug X
## F M C: Combination
## (N=70) (N=51) F M All
## ————————————————————————————————————————————————————————————————————————————
## ASIAN 44 (62.9%) 35 (68.6%) 40 (65.6%) 44 (64.7%) 231 (64.9%)
## A 15 (21.4%) 12 (23.5%) 15 (24.6%) 16 (23.5%) 78 (21.9%)
## Mean 30.4 34.4 37.4 36.2 34.5
## Median 30.0 33.5 38.0 33.0 33.0
## B 16 (22.9%) 8 (15.7%) 10 (16.4%) 12 (17.6%) 75 (21.1%)
## Mean 33.8 34.9 33.3 35.9 33.3
NB: unlike virtually all other functions which accept paths,
facet_colcounts_visible accepts the path to the parent of the facets
whose column count visibility you would like to change. This is because
direct siblings cannot have different column count visibilities, so
pathing to individual facets would lead to an invalid table.
We can also get or modify the value of any particular column count (note the singular: there is no s in the name of this accessor):
## A: Drug X
## F M C: Combination
## (N=70) (N=5) F M All
## ————————————————————————————————————————————————————————————————————————————
## ASIAN 44 (62.9%) 35 (68.6%) 40 (65.6%) 44 (64.7%) 231 (64.9%)
## A 15 (21.4%) 12 (23.5%) 15 (24.6%) 16 (23.5%) 78 (21.9%)
## Mean 30.4 34.4 37.4 36.2 34.5
## Median 30.0 33.5 38.0 33.0 33.0
## B 16 (22.9%) 8 (15.7%) 10 (16.4%) 12 (17.6%) 75 (21.1%)
## Mean 33.8 34.9 33.3 35.9 33.3
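A hedged sketch of both column-count operations is given below, using a stand-in table with nested column splits. `cc_tbl` and `f_path` are illustrative names; the singular accessor is assumed to be `facet_colcount`, and the value 5 simply mirrors the output above.

```r
library(rtables)

# Stand-in table with nested column facets (not the vignette's tbl)
lyt <- basic_table() |>
  split_cols_by("ARM") |>
  split_cols_by("SEX", split_fun = keep_split_levels(c("F", "M"))) |>
  analyze("AGE", afun = function(x) in_rows(Mean = rcell(mean(x), format = "xx.x")))
cc_tbl <- build_table(lyt, DM)

# Visibility is set on the PARENT of the sibling facets (the SEX split under arm A)
facet_colcounts_visible(cc_tbl, c("ARM", "A: Drug X", "SEX")) <- TRUE

# Individual counts are read and written via the path to a single facet
f_path <- c("ARM", "A: Drug X", "SEX", "F")
old_n <- facet_colcount(cc_tbl, f_path)
facet_colcount(cc_tbl, f_path) <- 5L
new_n <- facet_colcount(cc_tbl, f_path)

cc_tbl
```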
If we need to mix visible and hidden counts within a direct sibling group, the best we can do is set one count to NA, which will leave a blank space there:
## A: Drug X
## F M C: Combination
## (N=5) F M All
## ————————————————————————————————————————————————————————————————————————————
## ASIAN 44 (62.9%) 35 (68.6%) 40 (65.6%) 44 (64.7%) 231 (64.9%)
## A 15 (21.4%) 12 (23.5%) 15 (24.6%) 16 (23.5%) 78 (21.9%)
## Mean 30.4 34.4 37.4 36.2 34.5
## Median 30.0 33.5 38.0 33.0 33.0
## B 16 (22.9%) 8 (15.7%) 10 (16.4%) 12 (17.6%) 75 (21.1%)
## Mean 33.8 34.9 33.3 35.9 33.3
Section dividers are character(s) that are printed in a line after a particular row or subtable during rendering to differentiate sections of a table (they are most often, and by default, " " to create a blank line).
tbl3 <- tbl
section_div_at_path(tbl3, c("RACE", "*")) <- "*"
section_div_at_path(tbl3, c("RACE", "*", "STRATA1", "B")) <- "+"
tbl3
## A: Drug X C: Combination
## F M F M All
## ————————————————————————————————————————————————————————————————————————————
## ASIAN 44 (62.9%) 35 (68.6%) 40 (65.6%) 44 (64.7%) 231 (64.9%)
## A 15 (21.4%) 12 (23.5%) 15 (24.6%) 16 (23.5%) 78 (21.9%)
## Mean 30.4 34.4 37.4 36.2 34.5
## Median 30.0 33.5 38.0 33.0 33.0
## B 16 (22.9%) 8 (15.7%) 10 (16.4%) 12 (17.6%) 75 (21.1%)
## Mean 33.8 34.9 33.3 35.9 33.3
## Median 32.0 34.5 34.0 34.5 33.0
## ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
## C 13 (18.6%) 15 (29.4%) 15 (24.6%) 16 (23.5%) 78 (21.9%)
## Mean 36.9 35.6 33.5 31.4 33.9
## Median 36.0 37.0 34.0 31.0 34.0
## ****************************************************************************
## WHITE 8 (11.4%) 6 (11.8%) 8 (13.1%) 10 (14.7%) 46 (12.9%)
## A 2 (2.9%) 1 (2.0%) 1 (1.6%) 5 (7.4%) 15 (4.2%)
## Mean 34.0 45.0 35.0 32.8 33.3
## Median 34.0 45.0 35.0 31.0 31.0
## B 4 (5.7%) 3 (5.9%) 3 (4.9%) 1 (1.5%) 16 (4.5%)
## Mean 37.0 43.7 34.3 36.0 38.3
## Median 38.5 44.0 35.0 36.0 38.0
## ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
## C 2 (2.9%) 2 (3.9%) 4 (6.6%) 4 (5.9%) 15 (4.2%)
## Mean 35.5 44.0 38.5 35.0 39.1
## Median 35.5 44.0 38.5 32.5 40.0
## ****************************************************************************
## BMRKR1
## Mean 6.06 5.42 5.83 5.57 5.85
Section dividers have a least specific to most specific order of
precedence, with only the least specific applicable section divider
displayed after any given row. See ?section_div for more details.
Sorting rows in a table occurs in a path-specific way. See the sorting section in the pruning and sorting vignette for a detailed discussion of this.
We saw that the [ method interprets character indices as paths. Beyond
that, the value_at and cell_values getters and setters accept paths as
well. See the subsetting vignette.
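As a brief, hedged sketch of the getters mentioned above, the following retrieves a single cell value and a list of cell values by row path and column path. `small_tbl` and `mean_afun` are illustrative stand-ins rather than objects from the original text.

```r
library(rtables)

# Small stand-in table (not the vignette's tbl)
mean_afun <- function(x) in_rows(Mean = rcell(mean(x), format = "xx.x"))
lyt <- basic_table() |>
  split_cols_by("ARM") |>
  split_rows_by("STRATA1") |>
  summarize_row_groups() |>
  analyze("AGE", afun = mean_afun)
small_tbl <- build_table(lyt, DM)

row_path <- c("STRATA1", "A", "AGE", "Mean")
col_path <- c("ARM", "A: Drug X")

# value_at(): the raw (unformatted) value of exactly one cell
v <- value_at(small_tbl, row_path, col_path)

# cell_values(): a list of values for every cell the two paths select
cv <- cell_values(small_tbl, c("STRATA1", "A", "AGE"), col_path)

v
cv
```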
Pathing can also be used to add referential footnotes to rows, columns, or cells. This is discussed in the title and footer and subsetting vignettes.
Here we will go into a bit more detail of how layouts, table structure, and pathing are related. This is largely for informational purposes and most of it will not be directly relevant to end-users who are simply creating tables.
rtables is row-dominant (as opposed to R’s data.frames, which are
column-dominant). This means that tables are modeled as a (generalized)
collection of rows, rather than columns. More accurately, a table is
modeled as a collection of children, which can themselves have
children, and so on, until ultimately all of the “leaf” children in the
defined tree-graph are individual rows.
We can see this using the tree_children function. The table we’ve been
working with throughout this vignette has two direct children, one
containing all of the structure generated underneath the initial "RACE"
split, and one containing the unnested analysis of "BMRKR1":
## $RACE
## A: Drug X C: Combination
## F M F M All
## ————————————————————————————————————————————————————————————————————————————
## ASIAN 44 (62.9%) 35 (68.6%) 40 (65.6%) 44 (64.7%) 231 (64.9%)
## A 15 (21.4%) 12 (23.5%) 15 (24.6%) 16 (23.5%) 78 (21.9%)
## Mean 30.4 34.4 37.4 36.2 34.5
## Median 30.0 33.5 38.0 33.0 33.0
## B 16 (22.9%) 8 (15.7%) 10 (16.4%) 12 (17.6%) 75 (21.1%)
## Mean 33.8 34.9 33.3 35.9 33.3
## Median 32.0 34.5 34.0 34.5 33.0
## C 13 (18.6%) 15 (29.4%) 15 (24.6%) 16 (23.5%) 78 (21.9%)
## Mean 36.9 35.6 33.5 31.4 33.9
## Median 36.0 37.0 34.0 31.0 34.0
## WHITE 8 (11.4%) 6 (11.8%) 8 (13.1%) 10 (14.7%) 46 (12.9%)
## A 2 (2.9%) 1 (2.0%) 1 (1.6%) 5 (7.4%) 15 (4.2%)
## Mean 34.0 45.0 35.0 32.8 33.3
## Median 34.0 45.0 35.0 31.0 31.0
## B 4 (5.7%) 3 (5.9%) 3 (4.9%) 1 (1.5%) 16 (4.5%)
## Mean 37.0 43.7 34.3 36.0 38.3
## Median 38.5 44.0 35.0 36.0 38.0
## C 2 (2.9%) 2 (3.9%) 4 (6.6%) 4 (5.9%) 15 (4.2%)
## Mean 35.5 44.0 38.5 35.0 39.1
## Median 35.5 44.0 38.5 32.5 40.0
##
## $BMRKR1
## A: Drug X C: Combination
## F M F M All
## ————————————————————————————————————————————————
## BMRKR1
## Mean 6.06 5.42 5.83 5.57 5.85
For convenience we will define a multi_step_children
function which recursively retrieves children from the table, and then
from those children, etc. For information purposes, we will print the
“path step” taken each time, thus building up our path as we descend
using the class structure.
multi_step_children <- function(tbl, indices) {
  print(obj_name(tbl))
  ret <- tree_children(tbl)
  for (i in indices) {
    print(obj_name(ret[[i]]))
    ret <- tree_children(ret[[i]])
  }
  ret
}
Thus we can see that the first of our table’s children has the path
c("root", "RACE") and has children for each race in our table (recall
the “root” path element is correct but optional):
## [1] "root"
## [1] "RACE"
## $ASIAN
## A: Drug X C: Combination
## F M F M All
## ————————————————————————————————————————————————————————————————————————————
## ASIAN 44 (62.9%) 35 (68.6%) 40 (65.6%) 44 (64.7%) 231 (64.9%)
## A 15 (21.4%) 12 (23.5%) 15 (24.6%) 16 (23.5%) 78 (21.9%)
## Mean 30.4 34.4 37.4 36.2 34.5
## Median 30.0 33.5 38.0 33.0 33.0
## B 16 (22.9%) 8 (15.7%) 10 (16.4%) 12 (17.6%) 75 (21.1%)
## Mean 33.8 34.9 33.3 35.9 33.3
## Median 32.0 34.5 34.0 34.5 33.0
## C 13 (18.6%) 15 (29.4%) 15 (24.6%) 16 (23.5%) 78 (21.9%)
## Mean 36.9 35.6 33.5 31.4 33.9
## Median 36.0 37.0 34.0 31.0 34.0
##
## $WHITE
## A: Drug X C: Combination
## F M F M All
## ————————————————————————————————————————————————————————————————————————
## WHITE 8 (11.4%) 6 (11.8%) 8 (13.1%) 10 (14.7%) 46 (12.9%)
## A 2 (2.9%) 1 (2.0%) 1 (1.6%) 5 (7.4%) 15 (4.2%)
## Mean 34.0 45.0 35.0 32.8 33.3
## Median 34.0 45.0 35.0 31.0 31.0
## B 4 (5.7%) 3 (5.9%) 3 (4.9%) 1 (1.5%) 16 (4.5%)
## Mean 37.0 43.7 34.3 36.0 38.3
## Median 38.5 44.0 35.0 36.0 38.0
## C 2 (2.9%) 2 (3.9%) 4 (6.6%) 4 (5.9%) 15 (4.2%)
## Mean 35.5 44.0 38.5 35.0 39.1
## Median 35.5 44.0 38.5 32.5 40.0
Each of these children under "RACE" is a subtable.
The children under our BMRKR1 analysis, on the other hand, are rows (in
this case only one row, in fact):
## [1] "root"
## [1] "BMRKR1"
## $Mean
## [DataRow indent_mod 0]: Mean 6.06 5.42 5.83 5.57 5.85
Within each race subtable, we see a table corresponding to the STRATA1
split:
## [1] "root"
## [1] "RACE"
## [1] "ASIAN"
## $STRATA1
## A: Drug X C: Combination
## F M F M All
## —————————————————————————————————————————————————————————————————————————
## A 15 (21.4%) 12 (23.5%) 15 (24.6%) 16 (23.5%) 78 (21.9%)
## Mean 30.4 34.4 37.4 36.2 34.5
## Median 30.0 33.5 38.0 33.0 33.0
## B 16 (22.9%) 8 (15.7%) 10 (16.4%) 12 (17.6%) 75 (21.1%)
## Mean 33.8 34.9 33.3 35.9 33.3
## Median 32.0 34.5 34.0 34.5 33.0
## C 13 (18.6%) 15 (29.4%) 15 (24.6%) 16 (23.5%) 78 (21.9%)
## Mean 36.9 35.6 33.5 31.4 33.9
## Median 36.0 37.0 34.0 31.0 34.0
## [1] "root"
## [1] "RACE"
## [1] "ASIAN"
## [1] "STRATA1"
## $A
## A: Drug X C: Combination
## F M F M All
## —————————————————————————————————————————————————————————————————————————
## A 15 (21.4%) 12 (23.5%) 15 (24.6%) 16 (23.5%) 78 (21.9%)
## Mean 30.4 34.4 37.4 36.2 34.5
## Median 30.0 33.5 38.0 33.0 33.0
##
## $B
## A: Drug X C: Combination
## F M F M All
## ————————————————————————————————————————————————————————————————————————
## B 16 (22.9%) 8 (15.7%) 10 (16.4%) 12 (17.6%) 75 (21.1%)
## Mean 33.8 34.9 33.3 35.9 33.3
## Median 32.0 34.5 34.0 34.5 33.0
##
## $C
## A: Drug X C: Combination
## F M F M All
## —————————————————————————————————————————————————————————————————————————
## C 13 (18.6%) 15 (29.4%) 15 (24.6%) 16 (23.5%) 78 (21.9%)
## Mean 36.9 35.6 33.5 31.4 33.9
## Median 36.0 37.0 34.0 31.0 34.0
And finally, within each stratum facet is a table representing the
analysis of AGE:
## [1] "root"
## [1] "RACE"
## [1] "ASIAN"
## [1] "STRATA1"
## [1] "B"
## $AGE
## A: Drug X C: Combination
## F M F M All
## ————————————————————————————————————————————————
## Mean 33.8 34.9 33.3 35.9 33.3
## Median 32.0 34.5 34.0 34.5 33.0
And within each of those AGE analysis tables, like our BMRKR1 top-level
analysis table, we have a collection of rows:
## [1] "root"
## [1] "RACE"
## [1] "ASIAN"
## [1] "STRATA1"
## [1] "B"
## [1] "AGE"
## $Mean
## [DataRow indent_mod 0]: Mean 33.8 34.9 33.3 35.9 33.3
##
## $Median
## [DataRow indent_mod 0]: Median 32.0 34.5 34.0 34.5 33.0
Thus we see that analyze calls create tables (called ElementaryTables)
containing individual rows as children, while split_rows_by (and
sibling) calls create a subtable whose children are tables, one for
each facet declared by the split operation:
## child is AGE analysis table within RACE->WHITE->STRATA1->A
multi_step_children(tbl, c(1, 2, 1, 1))
## [1] "root"
## [1] "RACE"
## [1] "WHITE"
## [1] "STRATA1"
## [1] "A"
## $AGE
## A: Drug X C: Combination
## F M F M All
## ————————————————————————————————————————————————
## Mean 34.0 45.0 35.0 32.8 33.3
## Median 34.0 45.0 35.0 31.0 31.0
## [1] "root"
## [1] "RACE"
## [1] "WHITE"
## [1] "STRATA1"
## [1] "A"
## [1] "AGE"
## $Mean
## [DataRow indent_mod 0]: Mean 34.0 45.0 35.0 32.8 33.3
##
## $Median
## [DataRow indent_mod 0]: Median 34.0 45.0 35.0 31.0 31.0
For technical and historical reasons, label rows and so-called “content rows” (which are essentially marginal analyses at a non-leaf point in the tree graph defined by the parent-child relationships discussed above) are modeled separately.
Given a (sub)table, the content table (containing the content rows)
and label can be retrieved by content_table and obj_label,
respectively. Note that obj_label returns a string, not a row, as the
label row is an internal detail not currently exposed.
Recall that our multi_step_children function returns the set of
children at a location, so we must subset one additional time to arrive
at a single child:
## [1] "root"
## [1] "RACE"
## [1] "ASIAN"
## [1] "STRATA1"
## A: Drug X C: Combination
## F M F M All
## ————————————————————————————————————————————————————————————————————————
## B 16 (22.9%) 8 (15.7%) 10 (16.4%) 12 (17.6%) 75 (21.1%)
## Mean 33.8 34.9 33.3 35.9 33.3
## Median 32.0 34.5 34.0 34.5 33.0
## A: Drug X C: Combination
## F M F M All
## —————————————————————————————————————————————————————————————————
## B 16 (22.9%) 8 (15.7%) 10 (16.4%) 12 (17.6%) 75 (21.1%)
Typically (i.e., by default) label rows for (sub)tables that have a non-empty content table are not visible when rendering, but they do still exist:
## [1] "B"
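The same three pieces (subtable, content table, label) can also be reached with a path instead of positional indexing. This is a sketch using `tt_at_path` on a smaller stand-in table; `small_tbl`, `mean_afun` and `sub_b` are illustrative names, not objects from the original text.

```r
library(rtables)

# Small stand-in table (not the vignette's tbl)
mean_afun <- function(x) in_rows(Mean = rcell(mean(x), format = "xx.x"))
lyt <- basic_table() |>
  split_cols_by("ARM") |>
  split_rows_by("STRATA1") |>
  summarize_row_groups() |>
  analyze("AGE", afun = mean_afun)
small_tbl <- build_table(lyt, DM)

# Retrieve the facet subtable for stratum B by path
sub_b <- tt_at_path(small_tbl, c("STRATA1", "B"))

# Its content table holds the group summary row(s)
ct_b <- content_table(sub_b)
ct_b

# Its label exists even though the label row is not rendered
lab_b <- obj_label(sub_b)
lab_b
```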
Thus we see that:
- split_rows_by* layout instructions create a single subtable with the
  split name, which contains a facet for each value of the split;
- analyze instructions create a subtable containing individual rows
  defined by the afun used;
- analyze instructions with multiple variables create a parent subtable
  with a child for each individual analyzed variable, as above; and
- summarize_row_groups instructions create content tables on the
  children of the table for that split.
For largely historical reasons, and due to the fact that the rtables
object model is row-dominant, the exact way that column structure is
modeled is an arcane implementation detail not useful to end users
(much more so than the row structure explored above). Thus we will
largely gloss over it here.
For our purposes here it suffices to say that the analog of the subtables representing split instructions are implicit in column space after the first split, as opposed to explicit as we saw them to be in row space. That said, the relationship between layout instructions and resulting paths in the table remains valid and useful.
We can see this by looking again at our column paths summary:
## label path
## —————————————————————————————————————————————
## A: Drug X ARM, A: Drug X
## F ARM, A: Drug X, SEX, F
## M ARM, A: Drug X, SEX, M
## C: Combination ARM, C: Combination
## F ARM, C: Combination, SEX, F
## M ARM, C: Combination, SEX, M
## All All, All
Column paths have a more rigid structure than their row-based counterparts. Because column space has no analog to analyze layout instructions, all paths corresponding to facets or individual columns come in the form of one or more pairs of the form (split name, split value).
Virtually all useful column paths will be of the form above. The only
exception to this is when setting column count visibility for a set of
facets via facet_colcounts_visible<-, for which we path to the implicit
parent structure whose children are the facets we are interested in. We
saw this in action in the previous section.
To summarize, as for row space, the relationship between layout
instructions and column paths is as follows: column split instructions
create structures pathable via (split name, split value/facet name)
pairs. Because there is no analyze analog for column splitting, this
paradigm is sufficient to understand and predict all column paths.
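The pairing described above can be checked programmatically. The sketch below lists the leaf column paths of a stand-in table with nested column splits plus an overall column and confirms that each path has an even number of steps; `col_tbl` and `cps` are illustrative names.

```r
library(rtables)

# Stand-in table with nested column splits and an overall column
lyt <- basic_table() |>
  split_cols_by("ARM") |>
  split_cols_by("SEX", split_fun = keep_split_levels(c("F", "M"))) |>
  add_overall_col("All") |>
  analyze("AGE", afun = function(x) in_rows(Mean = rcell(mean(x), format = "xx.x")))
col_tbl <- build_table(lyt, DM)

# One character vector per leaf column
cps <- col_paths(col_tbl)
cps

# Every path is a sequence of (split name, split value) pairs
all_paired <- all(lengths(cps) %% 2 == 0)
all_paired
```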
1. In rtables 0.6.13, table layouts which would result in non-unique
   paths in column space will fail to build. This will likely change to
   be more in line with the behavior in row space in a future release.
2. As in footnote 1: in rtables 0.6.13, table layouts which would
   result in non-unique paths in column space will fail to build.
3. The result-data.frame / ARD machinery knows to remove these
   uniquification artifacts, so these modifications to the names will
   not be reflected there.