library(dplyr)
library(matsbyname)
library(tibble)

matsbyname functions in which operands are specified in
a ... argument are ambiguous when applied to a data frame.
The .summarise argument resolves the ambiguity by signalling
intent, allowing these functions to be used flexibly with
data frames.

For ordinary functions, such as + and mean(),
there is no ambiguity about their operation in a data frame.
df <- tibble::tribble(~x, ~y, ~z,
1, 2, 3,
4, 5, 6)
# Typically, operations are done across rows.
df %>%
dplyr::mutate(
a = x + y + z,
b = rowMeans(.)
)
#> # A tibble: 2 × 5
#> x y z a b
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 2 3 6 2
#> 2 4 5 6 15 5

To perform the same operations down columns, use
dplyr::summarise().
df %>%
dplyr::summarise(
x = sum(x),
y = sum(y),
z = sum(z)
)
#> # A tibble: 1 × 3
#> x y z
#> <dbl> <dbl> <dbl>
#> 1 5 7 9
df %>%
dplyr::summarise(
x = mean(x),
y = mean(y),
z = mean(z)
)
#> # A tibble: 1 × 3
#> x y z
#> <dbl> <dbl> <dbl>
#> 1 2.5 3.5 4.5

matsbyname::sum_byname()

What does matsbyname::sum_byname() mean for a data
frame? Will it give sums across rows (as + does), or will it
give sums down columns (as summarise() does)? This ambiguity is
present for all *_byname() functions in which operands are
specified via the ... argument, including
matrixproduct_byname(), hadamardproduct_byname(), and
mean_byname().
To resolve the ambiguity, use the .summarise argument.
The default value of .summarise is FALSE,
meaning that the functions normally operate across rows. If you want to
perform the action down columns, set .summarise = TRUE.
df %>%
dplyr::mutate(
a = sum_byname(x, y, z),
b = mean_byname(x, y, z)
)
#> # A tibble: 2 × 5
#> x y z a b
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 1 2 3 6 2
#> 2 4 5 6 15 5
df %>%
dplyr::summarise(
x = sum_byname(x, .summarise = TRUE) %>% unlist(),
y = sum_byname(y, .summarise = TRUE) %>% unlist(),
z = sum_byname(z, .summarise = TRUE) %>% unlist()
)
#> # A tibble: 1 × 3
#> x y z
#> <dbl> <dbl> <dbl>
#> 1 5 7 9

The .summarise argument broadens the range of
applicability for many matsbyname functions, especially
when used with data frames. The default is
.summarise = FALSE, meaning that operations will be
performed across rows. Set .summarise = TRUE to
signal intent to perform operations down a column.
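
To illustrate the two behaviours that .summarise selects between, here is a minimal base R sketch. The helper toy_sum() is hypothetical: it is not part of matsbyname and makes no claim about how the package implements the argument. It only mimics the idea that a single flag decides whether the operands in ... are combined element by element (across rows) or each collapsed to one value (down a column).

```r
# Hypothetical helper, NOT part of matsbyname: a toy "..." sum whose
# meaning is selected by a .summarise flag.
toy_sum <- function(..., .summarise = FALSE) {
  operands <- list(...)
  if (.summarise) {
    # Down a column: collapse each operand to a single total.
    lapply(operands, sum)
  } else {
    # Across a row: add the operands element by element.
    Reduce(`+`, operands)
  }
}

# The same columns as the example data frame.
x <- c(1, 4)
y <- c(2, 5)
z <- c(3, 6)

# One value per row.
across_rows <- toy_sum(x, y, z)

# One value for the whole column; the result is a list, so unlist() it.
down_column <- unlist(toy_sum(x, .summarise = TRUE))
```

As with the sum_byname() calls above, the summarising branch of this sketch returns a list, which is why unlist() is applied before the result is used as a plain number.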