Connor Hughes, Mariana Castaneda-Guzman, Luis E. Escobar
Ecological niche modeling (ENM) and species distribution modeling (SDM) are widely used to reconstruct and predict species environmental tolerances and geographic distributions, expanding ecological and biogeographic theory. Despite their widespread use, one aspect that has received relatively little attention is the temporally dynamic environment.
Environmental conditions shaping species distributions vary over time, including climate, resource availability, and habitat availability. As a result, the suitability of any given location may change as key environmental conditions become more or less favorable for a species. Most existing implementations of ENM assume a static relationship between species and environment: occurrences are pooled across years, predictors are time-averaged, and a single model is fit to the combined dataset. This works well when relationships are genuinely stable, but it carries three costs when they aren’t:
Temporally-explicit ENM offer an alternative practice which preserve the time stamp on each observation. Each occurrence is paired with the environmental conditions it experienced at the time step it was recorded, rather than against a long-term average or other static variable. The resulting models are sensitive to year-to-year (or season-to-season) variation in the species-environment relationship, and predictions are projected as time series rather than as a single static surface.
TemporalModelR is an R package for building temporally explicit SDMs end-to-end, from raw occurrence records to time-resolved trend analyses. It provides a consistent and replicable workflow for implementing temporally explicit ENM.
TemporalModelR operates across two complementary dimensions to support temporally explicit modeling:
Because environmental conditions change over time, the same point in G-space may move through E-space across years, decades, or seasons. Traditional, temporally-static workflows assume both G-space and E-space to be stable in time. TemporalModelR assumes only that the niche (i.e., the species tolerance in E-space) is stable across the study period, while the E-space coordinates of any G-space location may change over time. Species observations are therefore matched to E-space data correct in both space and time, the niche is modeled in time-independent E-space. The resulting niche estimate is projected back onto G-space at explicit time periods to produce temporally dynamic predictions across time rather than a single static surface.
The package structures the SDM workflow around three phases:
Every function in the package is built around the concept of
time placeholders in raster filenames
(forest_cover_2005.tif,
pr_seas_2003_Spring.tif). Patterns map clean predictor
names to filenames; time_cols defines which columns in your
occurrence data correspond to those placeholders. For example, a column
in species occurrence data which labeled year with
subsequent values 2001, 2002,
2005, and a user defined variable name as
forest_cover_year, would match to rasters
forest_cover_2001, forest_cover_2002, and
forest_cover_2005. Likewise, a databse with both columns
year and month and a user defined variable
name of pr_m_year_month would match to rasters like
pr_m_1997_09.
This makes the workflow flexible to whether your time scale is annual, monthly, daily, or anything in between, so long as you have rasters data at the same temporal level. It also allows different predictors to vary at different time scales within the same model, flexibly allowing temporally static variables (e.g. elevation) to be processed in the same suite and used in the same models as temporally dynamic variables, which may themselves be measured at different time steps (e.g. annual forest cover estimates, monthly precipitation).
Note: Internet connection is required to install the package.
The development version of TemporalModelR can be installed using the code below.
# install.packages("devtools")
devtools::install_github("CJHughes926/TemporalModelR")A typical TemporalModelR workflow follows these steps:
A brief description of each step is presented below. For full walkthrough of each phase, see the vignettes hosted on the TemporalModelR site under the Articles menu, also linked inline throughout this README.
The package ships a small synthetic dataset that every vignette runs
against, so the full workflow can be reproduced end-to-end without
downloading any external data. The dataset includes raw and aligned
environmental rasters, an occurrence point database, intermediate point
files, and pre-computed data() objects from each phase of
the workflow. Reading the data set vignette first is recommended, as it
provides the shared context every other vignette refers back to.
For a description of the synthetic data and the bundled
data()objects used in the workflow vignettes, see the About the Example Dataset vignette.
The preprocessing pipeline transforms raw occurrence records and
environmental rasters into the structured inputs that downstream models
expect. Six functions cover this phase. raster_align()
standardizes rasters to a common projection, extent, and resolution to
prevent misalignment in future analyses.
spatiotemporal_rarefaction() subsets data to one point per
pixel per time step, reducing pseudoreplication from your data but
preserving truly unique temporally independent observations from the
same location. temporally_explicit_extraction() extracts
environmental values from the raster that matches each point’s
observation time step, producing a table of observations with their
relevant E-space conditions correct in both space and time, along with
the means and standard deviations of each variable across the data set.
scale_rasters() optionally uses those scaling parameters to
z-score every raster, ensuring variables share a comparable range and
contribute evenly to subsequent modeling algorithms sensitive to that.
spatiotemporal_partition() builds cross-validation folds
via one of four methods: purely spatial, purely temporal, spatiotemporal
(combining both), or random as a baseline. Finally,
generate_absences() produces fold-stratified pseudoabsences
relevant for presence/absence models.
For a detailed workflow showcasing each preprocessing function, see the Preprocessing temporally explicit data vignette.
TemporalModelR supports four modeling algorithms, all of which share
the same data requirements (partition_result,
pseudoabsence_result for presnce/absence models, a
model_formula or model_vars). One model is fit
per cross-validation fold; each is evaluated on its held-out test set,
and for any continuous models a threshold is selected and applied to
produce only binary results.
build_temporal_glm() fits one
generalized linear model per fold with a binomial link function.
Supports any combination of linear, polynomial, and interactive terms
via standard R formula syntax.build_temporal_gam() fits one
generalized additive model per fold via mgcv::gam(). Allows
for complex nonlinear relationships between environmental gradients and
species presence.build_temporal_rf() fits one random
forest classifier per fold via
randomForest::randomForest(). Uses ensembles of decision
trees to capture complex nonlinear relationships and interactions.build_temporal_hv() constructs
n-dimensional hypervolumes per fold via Gaussian kernel density
estimation or one-class support vector machine. The presence-only
option; no pseudoabsences required.Each model builder produces evaluation metrics across folds in E-space (time-independent), capturing overall niche estimation quality regardless of when the test points were observed. E-space evaluation gives a stable picture of model performance even in years where occurrence records are sparse.
Once a model is fit,
generate_spatiotemporal_predictions() projects it across
user-defined time steps. The function accepts any of the four model
types via the model_result argument and writes one
prediction raster per fold per time step alongside per-time-step
evaluation metrics. The same call handles single-timestep and
multi-timestep workflows given the relevant raster data is available:
for example time_steps = data.frame(year = 1:15) produces
15 annual maps; expand.grid(year = 1:15, season = "Spring")
produces 15 spring snapshots;
expand.grid(year = 1:15, season = c("Spring", "Summer"))
produces 30 maps spanning two seasons.
Once predictions are projected into G-space, time-specific assessment metrics become available, including cumulative binomial probability (CBP) tests assessing whether predictions in a given time step are better than random. G-space evaluation complements the E-space metrics from model fitting and both may be considered in model evaluation: G-space is robust and specific in years with substantial sample sizes but loses meaning when few records exist for a given time step, while E-space gives a more stable overall picture independent of time. Reporting both gives the most complete view of model performance.
For a full walkthrough of each modeling algorithm, including formula syntax, parameter choices, G-space projection, and diagnostic interpretation, see the four modeling vignettes: Modeling with a GLM, Modeling with a GAM, Modeling with a Random Forest, and Modeling with a Hypervolume.
The postprocessing phase converts the per-fold prediction rasters
into interpretable summaries. summarize_raster_outputs()
collapses fold-level predictions into a binary consensus stack (where a
user-specified number of folds agree on suitability) and a frequency
raster (proportion of time steps each pixel was suitable) which explored
the stability of G-space suitability projections across time.
analyze_temporal_patterns() uses the fastcpd
package to detect structured changes in each pixel’s suitability across
the time series, classifying trajectories as never-suitable,
always-suitable, no pattern, increasing, decreasing, or fluctuating. The
changepoint analysis accounts for temporal autocorrelation by including
each pixel’s previous-time-step value as a predictor in the change point
detection model; users can optionally enable spatial autocorrelation,
which adds the proportion of a pixel’s first-order (queen’s case)
neighbors predicted as suitable as an additional covariate.
analyze_trends_by_spatial_unit() aggregates pixel-level
patterns across user-defined polygons (administrative units, watersheds,
ecoregions), producing useful tables summarizing losses and gains in
suitability over time across a these regions, and is useful for
identifying regional variation in suitability trends.
For the full postprocessing workflow including consensus threshold selection, changepoint diagnostics, and zone aggregation, see the Postprocessing predictions vignette.
If you use TemporalModelR in your research, please cite:
Hughes, C., Castaneda-Guzman, M., & Escobar, L.E. (2026). TemporalModelR: An R Package for temporally-explicit species distribution modeling. [Journal information to be added]
To maintain high standards of code quality and documentation, we have used AI LLM tools while constructing code for this package. We used these tools for grammatical polishing and exploring technical implementation strategies for specialized functions. We manually checked and tested all code and documentation refined with these tools.
We welcome contributions to improve TemporalModelR. To
maintain the integrity and performance of the package, we follow a few
core principles:
If you have an idea for a major change, please open an Issue first to discuss it with the maintainers.