library(constructive)

This vignette details how {constructive} works and how you might define custom constructors or custom .cstr_construct.*() methods.
This document provides the general theory, but you are encouraged to look at examples.
In particular the package {constructive.example}, accessible at https://github.com/cynkra/constructive.example/, contains 2 examples: supporting a new class (“qr”) and implementing a new constructor for an already supported class (“tbl_df”). This package might be used as a template.
The scripts starting with “s3-” and “s4-” in the {constructive} package provide many more examples in a similar but slightly different shape. Those 2 resources, along with the explanations in this document, should get you started. Don’t hesitate to open issues if things are unclear.
The next 5 sections describe the inner logic of the package; the last 2 sections explain how to support a new class and/or define your own constructors.
The package is young and its API may still change in breaking ways; we apologize in advance.
.cstr_construct() builds code recursively, without checking input or output validity, without handling errors, and without formatting.
construct() wraps .cstr_construct() and does this extra work.
.cstr_construct() is a generic and many methods are implemented in the package, for instance construct(iris) will call .cstr_construct.data.frame() which down the line will call .cstr_construct.atomic() and .cstr_construct.factor() to construct its columns.
.cstr_construct() attempts to match its data input to a list of objects provided to the data argument.

.cstr_construct
#> function (x, ..., data = NULL)
#> {
#>     data_name <- perfect_match(x, data)
#>     if (!is.null(data_name))
#>         return(data_name)
#>     UseMethod(".cstr_construct")
#> }
#> <bytecode: 0x1108b8c50>
#> <environment: namespace:constructive>
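The matching step in the printed function can be pictured with a small base R sketch. It is only an illustration of the idea: match_by_identity() and sketch_construct() are hypothetical names, and we assume that matching means strict identity, since the internals of perfect_match() are not shown here.

```r
# Hypothetical helper (not part of {constructive}): return the name of the
# first element of `data` that is identical to `x`, or NULL if none matches.
match_by_identity <- function(x, data) {
  for (nm in names(data)) {
    if (identical(x, data[[nm]])) return(nm)
  }
  NULL
}

# Hypothetical recursive code builder: a matched object is referred to by
# name, an unclassed list is built element by element, anything else is
# deparsed. Element names are ignored to keep the sketch short.
sketch_construct <- function(x, data = NULL) {
  nm <- match_by_identity(x, data)
  if (!is.null(nm)) return(nm)
  if (is.list(x) && !is.object(x)) {
    args <- vapply(x, sketch_construct, character(1), data = data)
    return(sprintf("list(%s)", paste(args, collapse = ", ")))
  }
  paste(deparse(x), collapse = "")
}

# `letters` is found in `data`, so it is mentioned by name rather than deparsed
sketch_construct(list(letters, 1), data = list(letters = letters))
```

Checking for a match before dispatching keeps the generated code short whenever a sub-object is already available under a known name.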
.cstr_construct(letters)
#> [1] "c("
#> [2] "\"a\", \"b\", \"c\", \"d\", \"e\", \"f\", \"g\", \"h\", \"i\", \"j\", \"k\", \"l\", \"m\", \"n\", \"o\","
#> [3] "\"p\", \"q\", \"r\", \"s\", \"t\", \"u\", \"v\", \"w\", \"x\", \"y\", \"z\""
#> [4] ")"
construct(letters)
#> c(
#> "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o",
#> "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"
#> )

.cstr_construct.?() methods

.cstr_construct.?() methods typically have this form:
.cstr_construct.Date <- function(x, ...) {
  opts <- .cstr_fetch_opts("Date", ...)
  if (is_corrupted_Date(x) || opts$constructor == "next") return(NextMethod())
  constructor <- constructors$Date[[opts$constructor]]
  constructor(x, ..., origin = opts$origin)
}

.cstr_fetch_opts() gathers options provided to construct() through the opts_*() function (see next section), or falls back to a default value if none were provided.
If the object is corrupted, or if "next" was chosen as the constructor, we return NextMethod() to forward all our inputs to a lower level constructor.
constructor() actually builds the code from the object x, the parameters forwarded through ... and the optional construction details gathered in opts (here the origin).

opts_?() function

When implementing a new method you’ll need to define and export the corresponding opts_?() function. It gives the user a way to choose a constructor, and the object it returns is retrieved by .cstr_fetch_opts() in the .cstr_construct() method.
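To make the round trip between an options function and its retrieval more concrete, here is a minimal base R sketch. It does not reproduce the package internals: opts_demo() and fetch_opts_demo() are hypothetical stand-ins, and we simply assume that options travel through the dots as classed objects.

```r
# Hypothetical stand-in for an opts_*() function: it returns a classed list
opts_demo <- function(constructor = c("as.Date", "next"), origin = "1970-01-01") {
  constructor <- match.arg(constructor)
  structure(
    list(constructor = constructor, origin = origin),
    class = c("demo_options_Date", "demo_options")
  )
}

# Hypothetical stand-in for .cstr_fetch_opts(): scan the dots for an options
# object of the requested class, otherwise fall back on the defaults
fetch_opts_demo <- function(class, ...) {
  target <- paste0("demo_options_", class)
  for (arg in list(...)) {
    if (inherits(arg, target)) return(arg)
  }
  opts_demo()
}

# The user's choice is picked up among other arguments
fetch_opts_demo("Date", "something else", opts_demo("next"))$constructor

# Without any options object the default constructor is used
fetch_opts_demo("Date")$constructor
```

Tagging the options object with the class name is what lets each method pick out its own options and ignore the ones meant for other classes.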
The opts_?() function should always have this form:
opts_Date <- function(
    constructor = c(
      "as.Date", "as_date", "date", "new_date", "as.Date.numeric",
      "as_date.numeric", "next", "atomic"
    ),
    ...,
    origin = "1970-01-01"
) {
  .cstr_combine_errors(
    constructor <- .cstr_match_constructor(constructor),
    ellipsis::check_dots_empty()
  )
  .cstr_options("Date", constructor = constructor, origin = origin)
}

The class name appears in the name of the opts_?() function and as the first argument of .cstr_options().
Additional arguments, such as origin here, come after the dots and are forwarded to .cstr_options().

The following code illustrates how the information is retrieved.
# .cstr_fetch_opts() takes a class and the dots and retrieves the relevant options
# if none were provided it falls back on the default value for the relevant opts_?() function
test <- function(...) {
  .cstr_fetch_opts("Date", ...)
}
test(opts_Date("as_date"), opts_data.frame("read.table"))
#> <constructive_options_Date/constructive_options>
#> constructor: "as_date"
#> origin: "1970-01-01"
test()
#> <constructive_options_Date/constructive_options>
#> constructor: "as.Date"
#> origin: "1970-01-01"

is_corrupted_?() function

is_corrupted_?() checks if x has the right internal type and attributes, and sometimes structure, so that it satisfies the expectations of a well formed object of a given class.
If an object is corrupted for a given class we cannot use
constructors for this class, so we move on to a lower level constructor
by calling NextMethod() in
.cstr_construct().
This is important so that {constructive} doesn’t choke
on corrupted objects but instead helps us understand them.
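The fallback mechanism can be sketched with plain S3 dispatch. This is an illustration under assumptions, not the package's code: build_code(), its methods and is_corrupted_demo() are hypothetical names, and the generated code is deliberately simplistic.

```r
# Hypothetical generic standing in for .cstr_construct()
build_code <- function(x, ...) UseMethod("build_code")

# Lowest level method: deparse the underlying data and, for classed objects,
# re-attach the class with structure()
build_code.default <- function(x, ...) {
  data_code <- paste(deparse(unclass(x)), collapse = "")
  if (!is.object(x)) return(data_code)
  class_code <- paste(deparse(class(x)), collapse = "")
  sprintf("structure(%s, class = %s)", data_code, class_code)
}

# Hypothetical corruption check: a well formed Date sits on top of a double
is_corrupted_demo <- function(x) !is.double(x)

# The Date method only uses the idiomatic constructor on well formed input,
# otherwise it hands the object over to the next method
build_code.Date <- function(x, ...) {
  if (is_corrupted_demo(x)) return(NextMethod())
  sprintf("as.Date(%s)", paste(deparse(format(x)), collapse = ""))
}

good <- structure(12345, class = "Date")
bad <- structure("12345", class = "Date")

build_code(good) # goes through the idiomatic constructor
build_code(bad)  # falls back on the default method
```

Evaluating the code produced for the corrupted object recreates it faithfully, which is the point of falling back rather than failing.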
For instance, in the following example x prints like a date but it is corrupted: a date should not be built on top of characters, and this object cannot be built with as.Date() or other idiomatic date constructors.
x <- structure("12345", class = "Date")
x
#> [1] "2003-10-20"
x + 1
#> Error in unclass(e1) + unclass(e2): non-numeric argument to binary operator

We have defined:
is_corrupted_Date <- function(x) {
  !is.double(x)
}

As a consequence the next method, .cstr_construct.default(), will be called through NextMethod() and will handle the object using an atomic vector constructor:
construct(x)
#> "12345" |>
#>   structure(class = "Date")

{constructive} exports a constructors environment object, which itself contains environments named after classes; the latter contain the constructor functions.
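The nested layout described above can be mimicked in a few lines of base R. The sketch below is an assumption-laden illustration: demo_constructors and register_demo_constructors() are hypothetical names, and the two constructor functions are toy versions that only paste strings.

```r
# Hypothetical registry: a parent environment holding one environment per class
demo_constructors <- new.env()

# Hypothetical helper that stores named constructor functions for a class
register_demo_constructors <- function(class_name, ...) {
  if (!exists(class_name, envir = demo_constructors, inherits = FALSE)) {
    assign(class_name, new.env(), envir = demo_constructors)
  }
  class_env <- get(class_name, envir = demo_constructors)
  funs <- list(...)
  for (nm in names(funs)) assign(nm, funs[[nm]], envir = class_env)
  invisible(demo_constructors)
}

# Two toy constructors for "Date", each returning code as a string
register_demo_constructors(
  "Date",
  as.Date = function(x, ...) {
    sprintf("as.Date(%s)", paste(deparse(format(x)), collapse = ""))
  },
  new_date = function(x, ...) {
    sprintf("vctrs::new_date(%s)", paste(deparse(unclass(x)), collapse = ""))
  }
)

# Lookup by class, then by the constructor name chosen by the user
chosen <- "as.Date"
constructor <- demo_constructors$Date[[chosen]]
constructor(as.Date("2003-10-20"))
```

Keeping one environment per class means a constructor name only needs to be unique within its class.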
A constructor is retrieved in the .cstr_construct() method by:
constructor <- constructors$Date[[opts$constructor]]

For instance the default constructor for “Date” is:
constructors$Date$as.Date
#> function (x, ..., origin = "1970-01-01")
#> {
#>     if (any(is.infinite(x)) && any(is.finite(x))) {
#>         x_dbl <- unclass(x)
#>         if (origin != "1970-01-01")
#>             x_dbl <- x_dbl - as.numeric(as.Date(origin))
#>         code <- .cstr_apply(list(x_dbl, origin = origin), "as.Date",
#>             ..., new_line = FALSE)
#>     }
#>     else {
#>         code <- .cstr_apply(list(format(x)), "as.Date", ...,
#>             new_line = FALSE)
#>     }
#>     repair_attributes_Date(x, code, ...)
#> }
#> <bytecode: 0x10647ec30>
#> <environment: namespace:constructive>

A function call is made of a function and its arguments. A constructor sets the function and constructs its arguments recursively. This is done with the help of .cstr_apply() once these have been prepared. In the case above we have 2 logical paths because dates can be infinite, but date vectors containing infinite elements cannot be represented by as.Date(<character>), our preferred choice.
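The reason for the second path can be checked in base R alone. The following sketch, which uses no {constructive} function, compares a round trip through character with a round trip through the underlying numbers.

```r
# A date vector mixing a finite and an infinite element
y <- structure(c(12345, Inf), class = "Date")

# Round trip through character: the infinite element has no text representation
via_character <- as.Date(format(y))
is.na(via_character)

# Round trip through the underlying doubles: the infinite element survives
via_numeric <- as.Date(unclass(y), origin = "1970-01-01")
is.infinite(unclass(via_numeric))
```

The character form is more readable, so it makes sense to prefer it and keep the numeric form for the cases it cannot express.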
x <- structure(c(12345, 20000), class = "Date")
y <- structure(c(12345, Inf), class = "Date")
constructors$Date$as.Date(x)
#> [1] "as.Date(c(\"2003-10-20\", \"2024-10-04\"))"
constructors$Date$as.Date(y)
#> [1] "as.Date(c(12345, Inf), origin = \"1970-01-01\")"

It’s important to consider corner cases when defining a constructor. If some cases can’t be handled by the constructor, we should fall back to another constructor or to another .cstr_construct() method.
For instance constructors$data.frame$read.table() falls back on constructors$data.frame$data.frame() when the input contains non-atomic columns, which cannot be represented in a table input, and constructors$data.frame$data.frame() itself falls back on .cstr_construct.list() when the data frame contains list columns not defined using I(), since data.frame() cannot produce such objects.
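The data frame corner case can be reproduced with base R only. This short sketch shows what data.frame() does with list arguments, which is why a bare list column calls for a different strategy.

```r
# data.frame() spreads a bare list argument over several columns
spread <- data.frame(a = 1:2, b = list(1, "x"))
ncol(spread)

# Wrapping the list in I() yields a single list column, tagged "AsIs"
protected <- data.frame(a = 1:2, b = I(list(1, "x")))
ncol(protected)
class(protected$b)

# A list column without the "AsIs" class has to be added after the fact
bare <- data.frame(a = 1:2)
bare$b <- list(1, "x")
inherits(bare$b, "AsIs")
```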
The last line of the constructor function repairs the attributes.
Constructors should always end with a call to .cstr_repair_attributes() or a function that wraps it.
These are needed to adjust the attributes of an object after idiomatic constructors such as as.Date() have defined their data and canonical attributes.
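What the repair step has to make up for can be seen with base R alone. The sketch below is not how .cstr_repair_attributes() works internally; it only shows which information the idiomatic constructor leaves out and how a structure() call restores it.

```r
# An object carrying a non-canonical attribute on top of the "Date" class
x <- structure(c(12345, 20000), class = "Date", some_attr = 42)

# The idiomatic constructor recreates the data and the class only
rebuilt <- as.Date(c("2003-10-20", "2024-10-04"))
attributes(rebuilt)

# A repair step re-attaches whatever the constructor did not set
repaired <- structure(rebuilt, some_attr = 42)
attributes(repaired)
```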
x <- structure(c(12345, 20000), class = "Date", some_attr = 42)
# attributes are not visible due to "Date"'s printing method
x
#> [1] "2003-10-20" "2024-10-04"
# but constructive retrieves them
constructors$Date$as.Date(x)
#> [1] "as.Date(c(\"2003-10-20\", \"2024-10-04\")) |>"
#> [2] "structure(some_attr = 42)"

.cstr_repair_attributes() essentially sets attributes, with exceptions: for instance, the class is not set explicitly when it matches the one passed through the idiomatic_class argument.
.cstr_repair_attributes() does a bit more but we don’t need to dive deeper in this vignette.
constructive:::repair_attributes_Date
#> function (x, code, ...)
#> {
#>     .cstr_repair_attributes(x, code, ..., idiomatic_class = "Date")
#> }
#> <bytecode: 0x104b5fa58>
#> <environment: namespace:constructive>
constructive:::repair_attributes_factor
#> function (x, code, ...)
#> {
#>     .cstr_repair_attributes(x, code, ..., ignore = "levels",
#>         idiomatic_class = "factor")
#> }
#> <bytecode: 0x1071c7b38>
#> <environment: namespace:constructive>

Registering a new class is done by defining and registering a .cstr_construct.?() method. In a package you might register the method with {roxygen2} by using the “@export” tag.

You should not attempt to manually modify the constructors object of the {constructive} package. Instead you should define your constructor function and register it with:
.cstr_register_constructors(class_name, constructor_name = constructor_function, ...)

Do the latter in .onLoad() if the new constructor is to be part of a package, for instance:
# in zzz.R
.onLoad <- function(libname, pkgname) {
  .cstr_register_constructors(
    class_name,
    constructor_name1 = constructor1,
    constructor_name2 = constructor2
  )
}