.options.future = list(conditions = NULL) would be ignored as if it had never been specified.
Add %dofuture% operator, which can be used like %dopar%, but without the need for registerDoFuture(), e.g. y <- foreach(x = 1:3) %dofuture% { slow_fcn(x) }. One advantage over %dopar% is that the developer then has full control over foreach(); with %dopar%, the exact behavior also depends on which %dopar% adapter the user has registered. Another advantage is that %dofuture% can hand over random number generation, identification of globals, and error handling to the future ecosystem, which results in a more predictable and concise behavior, similar to that provided by future.apply and furrr.
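A minimal sketch of the %dofuture% operator described above, run on the sequential backend. Here slow_fcn() is a hypothetical stand-in, and the seed = TRUE entry is an assumption about how parallel-safe RNG is requested through .options.future; it is not taken from these notes.

```r
library(doFuture)  # provides %dofuture%; no registerDoFuture() needed
library(foreach)
library(future)

plan(sequential)   # any other future backend could be used instead

# Hypothetical stand-in for a slow function
slow_fcn <- function(x) {
  Sys.sleep(0.1)
  sqrt(x)
}

# Globals such as slow_fcn() are identified by the future framework
y <- foreach(x = 1:3) %dofuture% {
  slow_fcn(x)
}

# Assumption: parallel-safe RNG is requested via the 'seed' future option
z <- foreach(x = 1:3, .options.future = list(seed = TRUE)) %dofuture% {
  rnorm(1)
}
```

Because the backend is chosen with plan(), the loop itself does not change when switching from sequential to parallel processing.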
Now future operators such as %stdout% and %conditions% can be used to control the corresponding .options.future arguments, e.g. y <- foreach(i = 1:3) %dopar% { my_fun(i) } %stdout% FALSE is the same as y <- foreach(i = 1:3, .options.future = list(stdout = FALSE)) %dopar% { my_fun(i) }.
Add withDoRNG() for evaluating a foreach %dopar% expression with doRNG::registerDoRNG() temporarily set.
options(doFuture.rng.onMisuse = "ignore") is now a tad faster than before.
Moved the tests of registerDoFuture() with popular packages (e.g. BiocParallel, foreach, plyr, caret, glmnet, NSP, and TSP) relying on foreach to a separate doFuture.tests.extra package. This reduced the size of the package tarball from 140 kB to 40 kB.
Now registerDoFuture() returns the previously set foreach backend, making it possible to reset the foreach backend to the previous settings.
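A minimal sketch of how the returned value might be used to restore the previous backend. It assumes that the returned object is a list with elements fun, data, and info that can be passed back to foreach::setDoPar(); the names of those elements are an assumption and are not stated in these notes.

```r
library(doFuture)
library(foreach)
library(future)

registerDoSEQ()                 # some previously registered backend
plan(sequential)

oldDoPar <- registerDoFuture()  # returns the previously set backend
y <- foreach(i = 1:3) %dopar% { i^2 }

# Assumption: the returned list has elements 'fun', 'data', and 'info'
with(oldDoPar, foreach::setDoPar(fun = fun, data = data, info = info))
```

Inside a function, the restore step could be placed in on.exit() so that the caller's backend is reinstated even if an error occurs.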
Now doFuture recognizes when it is called via the BiocParallel package in which case it skips the check whether or not RNG was used by mistake.
Add option doFuture.rng.onMisuse, which can be used to temporarily override option future.rng.onMisuse when the doFuture adaptor is running.
Add option doFuture.workarounds, which can be set by environment variable R_DOFUTURE_WORKAROUNDS when the package is loaded.
Adding "BiocParallel.DoParam.errors" to option doFuture.workarounds will prefix RngFutureError messages with "task <index> failed - " in order for such errors to be recognized by the BiocParallel DoParam backend.
foreach() argument .options.snow = list(preschedule = <logical>) is now acknowledged as a fallback to the analogous argument .options.multicore - two arguments defined by the doMC and the doSNOW packages and also used by the doParallel package. As before, argument .options.future will always take precedence if list(scheduling = <logical>/<numeric>) or list(chunk.size = <integer>) is given.
Warnings and errors produced when using the RNG without using %dorng% of the doRNG package are now tailored to the doFuture package.
foreach() argument .noexport was completely ignored by doFuture.
doFuture.foreach.export values "automatic" and "automatic-unless-.export". They were made defunct in doFuture 0.8.2.
%dorng% of the doRNG package no longer produces a warning on ‘Foreach loop had changed the current RNG type: RNG was restored to same type, next state’ when using the doFuture adapter.
chunk_size, which now has been corrected to mention chunk.size.
Now doFuture sets a label on each future that reflects its name and the index of the chunk, e.g. "doFuture-3".
doFuture will now detect when doRNG is in use allowing underlying futures to skip the test of incorrectly generated random numbers - an optional validation of parallel RNG that will be added to future (>= 1.16.0).
.options.future = list(chunk.size = <count>) to foreach().
doFuture.foreach.export values automatic-unless-.export and automatic are defunct. They have been deprecated since doFuture 0.7.0.
.Random.seed to NULL, instead of removing it, which in turn would produce a warning on “.Random.seed is not an integer vector but of type NULL, so ignored” when the next random number was generated.
_R_CHECK_LENGTH_1_LOGIC2_=true bug. This bug did not affect how the package worked or any of its results.
The foreach() argument .options.future (a named list) can already be used to control whether “chunking” should take place or not, and if so, how much, e.g. .options.future = list(scheduling = 2.0). As an alternative to scheduling, this can now be specified by chunk.size - the number of elements processed per future (“chunk”). In R 3.5.0, the parallel package introduced argument chunk.size.
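To make the relationship between scheduling and chunk.size concrete, the following base-R sketch computes how many chunks each setting would imply. It follows the semantics given in these notes (scheduling = 1.0 means one chunk per worker, scheduling = Inf means one element per chunk, scheduling = 0.0 means a single chunk). The helper sketch_nbr_of_chunks() is hypothetical and is not doFuture's actual implementation.

```r
# Hypothetical helper, not part of doFuture: number of chunks implied by
# 'scheduling' or 'chunk.size' for a given number of elements and workers
sketch_nbr_of_chunks <- function(n_elements, n_workers,
                                 scheduling = 1.0, chunk.size = NULL) {
  if (!is.null(chunk.size)) {
    # chunk.size = number of elements processed per future ("chunk")
    return(as.integer(ceiling(n_elements / chunk.size)))
  }
  # Logical values map to TRUE -> 1.0 and FALSE -> Inf
  if (is.logical(scheduling)) scheduling <- if (scheduling) 1.0 else Inf
  if (scheduling == 0) return(1L)  # a single worker processes everything
  n_chunks <- ceiling(scheduling * n_workers)
  # Never more chunks than elements, and at least one chunk
  as.integer(max(1, min(n_elements, n_chunks)))
}

n <- 10L
sketch_nbr_of_chunks(n, n_workers = 4, scheduling = 2.0)  # two chunks per worker
sketch_nbr_of_chunks(n, n_workers = 4, chunk.size = 3)    # three elements per chunk

# Split the element indices into the default number of chunks
chunks <- parallel::splitIndices(n, sketch_nbr_of_chunks(n, n_workers = 4))
```

Specifying chunk.size fixes the work per future regardless of the number of workers, whereas scheduling ties the number of chunks to the number of workers.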
Elements can be processed in random order by setting attribute ordering of .options.future elements chunk.size or scheduling, e.g. .options.future = list(chunk.size = structure(TRUE, ordering = "random")). This improves load balancing in cases where there is a correlation between processing time and ordering of the elements. Note that the order of the returned values is not affected when randomizing the processing order.
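The reason the order of the returned values is unaffected can be illustrated in base R: elements are processed in a shuffled order and the results are then put back in their original positions. This is only a sketch of the idea, not doFuture's internal code.

```r
xs <- c(a = 1, b = 2, c = 3, d = 4)

set.seed(42)
perm <- sample.int(length(xs))  # random processing order

# Process the elements in the shuffled order
res_shuffled <- lapply(xs[perm], function(x) x^2)

# Undo the shuffle: order(perm) is the inverse permutation of 'perm'
res <- res_shuffled[order(perm)]
```

After the last step, res lists the results in the same order as xs, whatever permutation was drawn.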
Passing argument .options.future = list(stdout = ...) can be used to control how standard output should be relayed. See ?future::Future for further details. Analogously, .options.future = list(conditions = ...) can be used to control how messages and warnings are relayed, if at all.
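A minimal sketch of the stdout option with %dopar%, assuming the sequential backend. With stdout = FALSE, the output produced by cat() inside the loop is expected to be silenced rather than relayed.

```r
library(doFuture)
library(foreach)
library(future)

registerDoFuture()
plan(sequential)

y <- foreach(i = 1:3, .options.future = list(stdout = FALSE)) %dopar% {
  cat("Processing element", i, "\n")  # not relayed, since stdout = FALSE
  i
}
```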
Debug messages are now prepended with a timestamp.
foreach() iterates over, e.g. foreach(f = F, g = G) %dopar% { f() + g() }.
The maximum total size of globals allowed (option future.globals.maxSize) per future (“chunk”) is now scaled up by the number of elements processed by the future (“chunk”), making the protection approximately invariant to the amount of chunking done by foreach.
Added support for option doFuture.foreach.export = "manual", which will strictly follow argument .export of foreach() for identifying global variables. None of the future + globals framework for identifying global variables will be used. This is useful for asserting that the .export argument is correct.
help("doFuture") and updated example("doFuture") with a best-practices RNG example.
TESTS: The opt-in tests for third-party packages now run their examples with example(..., run.dontrun = TRUE) to cover even more use cases.
TESTS: Added future.callr to the backend opt-in tested with plyr.
If the doFuture package is missing on a worker, an error on “length(results) == nbr_of_elements is not TRUE” would be produced. Now a more informative error is produced.
foreach(..., .export) with .export containing "..." would produce an error when using globals (<= 0.11.0).
foreach() would not relay captured conditions as provided by future (>= 0.11.0).
Previously deprecated option doFuture.foreach.nullexport is defunct.
Option doFuture.foreach.export values automatic-unless-.export and automatic are defunct and will fall back to export-and-automatic.
doFuture now respects option future.globals.resolve instead of being hardcoded to always resolve globals (future.globals.resolve = TRUE). This makes doFuture consistent with other future frontends.
Added option doFuture.foreach.export, making it possible to ignore a faulty .export argument to foreach() and instead rely on the future framework to identify globals. For instance, all examples of caret 6.0-77 work with doFuture and any backend when setting this option to "automatic" (or ".export-and-automatic"), whereas they will only work on forked backends if using ".export" or the default "automatic-unless-.export". If using ".export-and-automatic-with-warning", a warning that lists globals potentially missing from the .export argument is produced.
foreach() code.
help("doFuture.options").
TESTS: The doFuture package gained more opt-in tests for third-party packages across all known future backends. These tests are not performed on the CRAN servers; instead they are performed on Travis CI. Third-party packages that are currently tested are: caret, foreach, glmnet, NMF, plyr, and TSP.
TESTS: Testing global functions that call themselves recursively.
foreach() option to control whether scheduling (“chunking”) should take place or not, and if so, how granular it should be. This is specified as foreach(..., .options.future = list(scheduling = <value>)). With scheduling = 1.0 (or equivalently scheduling = TRUE), the elements (iterations) will be split up into equally sized chunks such that each backend worker will process exactly one chunk. With scheduling = Inf (or equivalently scheduling = FALSE), chunking is disabled, i.e. each worker processes exactly one element at a time. If scheduling = 0.0, then a single worker processes all elements (and the other workers will not be used). If 2.0, then each worker will process two chunks, and so on. If the above option is not set, then .options.multicore = list(preschedule = <logical>), which is defined by doParallel, is used to mean scheduling = preschedule. If that is not set, then scheduling = 1.0 is used by default.
.parallel = TRUE. The default is to test against all of the future strategies that come with the future package, but it is possible to also test against future.BatchJobs, future.batchtools, and so on. These tests are performed on all possible future backends before each release (as well as via continuous integration).
Using a nested foreach() call would incorrectly produce an error on not being able to locate the iterator variable of the inner-most foreach() as a global variable.
If a foreach() call would result in an error, the error thrown would report on “object ‘expr’ not found” and not the actual error message.
%dopar% backend now processes all elements in chunks such that each backend worker will process a subset of the data at once (and only once). This significantly speeds up processing when iterating over a large number of elements that each have a short processing time.
Now the package tests future.batchtools with foreach by itself, in combination with plyr (.parallel = TRUE) as well as with BiocParallel::bplapply() and friends. Similar tests are already done using future.BatchJobs.
Added test for foreach::times() %dopar% { ... }. In particular, it is now tested that global variables are properly identified. Note that times() does not allow you to specify either .export or .packages, so it is not really designed for processing in external R processes. Having said this, times() does indeed work also in those cases when used with doFuture because it internally handles this automatically.
ROBUSTNESS: The package redundancy tests (not run by R CMD check; they need to be run manually for now) that run 89 plyr examples with the doFuture foreach adapter now force testing of .parallel = TRUE for all plyr functions that support that argument. Each example is run across various future strategies, including ‘sequential’, ‘multicore’, ‘multisession’, and ‘cluster’, as well as ‘batchjobs_local’ and ‘batchtools_local’, if installed. See doFuture 0.2.0 notes below for how to run these tests.
.export of foreach() is acknowledged such that if a character vector of variable names to be exported is specified, then those variables and nothing else are exported to the future. If NULL, then automatic lookup of global variables is used.
%dopar% calls, because doFuture forgot to remind foreach that doFuture should be used also deeper down. Thank you Alex Vorobiev for reporting on this.
ROBUSTNESS: Added package redundancy tests that run all examples of the foreach and the plyr packages using doFuture and all known types of futures. These tests are not package tests and need to be run manually. The test scripts are available in package directory path <- system.file("tests2", package="doFuture") and can be run as source(file.path(path, "plyr", "examples.R")).
ROBUSTNESS: Added package tests validating foreach() on regular as well as future.BatchJobs futures. Same for plyr and BiocParallel apply functions.
help("doFuture").
foreach::getDoParWorkers() gives useful information with registerDoFuture() in most cases. In cases where the number of workers cannot be inferred easily from future::plan() it will default to returning a large number (= 99).
foreach::getDoParName() and foreach::getDoParVersion() give useful information with registerDoFuture().
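As a short sketch, the backend information mentioned above can be queried after registering doFuture. The sequential plan is used here only as an example; the reported number of workers depends on the plan in effect.

```r
library(doFuture)
library(foreach)
library(future)

registerDoFuture()
plan(sequential)

getDoParName()     # name of the registered foreach backend
getDoParVersion()  # version of the registered backend
getDoParWorkers()  # number of workers inferred from future::plan()
```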