vennDiagramLab is library-first and tidyverse-friendly. The broom-compatible S3 methods on RegionResult make it trivial to plug into targets / drake workflows or any pipeline that expects tidy data.
library(vennDiagramLab)
result <- analyze(load_sample("dataset_real_cancer_drivers_4"))

Three methods convert a RegionResult to a tibble, each at a different level of aggregation:

- tidy(result): one row per set pair, with all five pairwise metrics
- glance(result): one row of headline numbers
- augment(result): one row per item, with set-membership flags and a region label

broom::glance(result)
head(broom::tidy(result))
head(broom::augment(result))

To keep only the highly significant pairs:
broom::tidy(result) |>
  dplyr::filter(highly_significant) |>
  dplyr::arrange(dplyr::desc(jaccard)) |>
  dplyr::select(set_a, set_b, intersection, jaccard, p_adjusted)

Or count items per region:
broom::augment(result) |>
  dplyr::count(region_label, sort = TRUE)

A simple _targets.R file:
library(targets)
library(vennDiagramLab)  # load_sample(), analyze() and render_venn_svg() are used unqualified below

list(
  tar_target(ds, load_sample("dataset_real_cancer_drivers_4")),
  tar_target(result, analyze(ds)),
  tar_target(stats_df, broom::tidy(result)),     # one row per set pair
  tar_target(genes_df, broom::augment(result)),  # one row per item
  tar_target(venn_svg, render_venn_svg(result)),
  # File target: write the SVG, then return its path so targets tracks the file.
  tar_target(
    venn_path,
    { writeLines(venn_svg, "venn.svg"); "venn.svg" },
    format = "file"
  )
)

Run with targets::tar_make(). Each target is cached independently, so changing only a downstream step, such as the sort order in a report, does not re-run the analysis.
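To check the caching behaviour described above, the pipeline can be run and inspected from the console. This is a minimal sketch, assuming the targets package is installed and the _targets.R file shown above is in the working directory; run_and_inspect() is a hypothetical helper name used only for illustration.

```r
# Sketch: build the pipeline, read one cached target, and list stale targets.
# run_and_inspect() is an illustrative helper, not part of vennDiagramLab or targets.
run_and_inspect <- function() {
  targets::tar_make()                       # builds only targets that are out of date
  stats_df <- targets::tar_read(stats_df)   # load the cached tidy() output by target name
  print(head(stats_df))
  targets::tar_outdated()                   # names of targets that would be rebuilt next
}

# Guarded so the snippet is a no-op outside a project that contains _targets.R.
if (requireNamespace("targets", quietly = TRUE) && file.exists("_targets.R")) {
  outdated <- run_and_inspect()
  print(outdated)
}
```

Right after a successful tar_make(), tar_outdated() would typically report nothing to rebuild; editing one target's command and calling it again shows which downstream targets are affected.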
statistics(result) recomputes its metrics on every call (there is no S4 equivalent of a lazy property). If you need them more than once, call it once and reuse the object:
stats <- statistics(result)
str(stats@jaccard, max.level = 1)

Inside a targets pipeline, this is a non-issue because tar_target(stats, statistics(result)) caches it for you.
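Outside a pipeline, the same compute-once idea can be wrapped in a small closure. The following is a minimal base-R sketch and not part of vennDiagramLab: expensive_stats() is a hypothetical stand-in for a costly call such as statistics(result), and compute_once() is an illustrative helper name.

```r
# Counter used only to show how often the expensive function really runs.
calls <- 0L

# Hypothetical stand-in for a costly computation such as statistics(result).
expensive_stats <- function(x) {
  calls <<- calls + 1L
  list(n = length(x), mean = mean(x))
}

# Wrap a zero-argument function so that its body runs at most once;
# later calls return the stored value.
compute_once <- function(compute) {
  value <- NULL
  done <- FALSE
  function() {
    if (!done) {
      value <<- compute()
      done <<- TRUE
    }
    value
  }
}

get_stats <- compute_once(function() expensive_stats(c(2, 4, 6)))

s1 <- get_stats()  # first call runs expensive_stats()
s2 <- get_stats()  # second call reuses the stored value
```

The stored value never refreshes, so this only suits inputs that do not change during the session; in a pipeline, targets handles invalidation for you.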
- vignette("v05_statistics_deep_dive"): what the metrics in broom::tidy() actually mean.
- vignette("v07_pdf_reports"): turning a result into a PDF artifact for a pipeline.