Why?
Medical device event data are messy.
Common challenges include:
How?
The mds package provides a standardized framework to address these challenges:
R files for auditability, documentation, and reproducibility
Purpose of This Vignette
mds functions: deviceevent(), exposure(), define_analyses(), time_series()
Note on Statistical Algorithms
mds data and analysis standards allow for seamless application of various statistical trending algorithms via the mdsstat package (under development).
Our example dataset maude was queried from the FDA MAUDE API and contains 535 reported events on bone cement in 2017. Furthermore, a simulated exposure dataset sales was generated to provide denominator data for our bone cement events.
head(maude, 3)
| report_number | event_type | date_received | product_problem_flag | adverse_event_flag | report_source_code | lot_number | model_number | manufacturer_d_name | manufacturer_d_country | brand_name | device_name | medical_specialty_description | device_class | region |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0002249697-2017-00023 | Malfunction | 20170103 | Y | N | Manufacturer report | MHX076 | | STRYKER ORTHOPAEDICS-MAHWAH | US | SIMPLEX P - US TOBRA FD 10-PK | Bone Cement | Orthopedic | 2 | Central |
| 0002249697-2017-00028 | Malfunction | 20170103 | Y | N | Manufacturer report | MHX080 | | STRYKER ORTHOPAEDICS-MAHWAH | US | SIMPLEX P - US TOBRA FD 10-PK | Bone Cement | Orthopedic | 2 | West |
| 0002249697-2017-00025 | Malfunction | 20170103 | Y | N | Manufacturer report | MHX076 | | STRYKER ORTHOPAEDICS-MAHWAH | US | SIMPLEX P - US TOBRA FD 10-PK | Bone Cement | Orthopedic | 2 | Central |
head(sales, 3)
| device_name | region | sales_month | sales_volume |
|---|---|---|---|
| Arthroscope | Central | 2017-01-01 | 83 |
| Arthroscope | Central | 2017-02-01 | 119 |
| Arthroscope | Central | 2017-03-01 | 112 |
The general workflow to go from data to trending over time is as follows:
1. deviceevent() to standardize device-event data.
2. exposure() to standardize exposure data (optional).
3. define_analyses() to enumerate possible analysis combinations.
4. time_series() to generate counts (and/or rates) by time based on your defined analyses.
# Step 1 - Device Events
de <- deviceevent(
maude,
time="date_received",
device_hierarchy=c("device_name", "device_class"),
event_hierarchy=c("event_type", "medical_specialty_description"),
key="report_number",
covariates="region",
descriptors="_all_")
# Step 2 - Exposures (Optional step)
ex <- exposure(
sales,
time="sales_month",
device_hierarchy="device_name",
match_levels="region",
count="sales_volume")
# Step 3 - Define Analyses
da <- define_analyses(
de,
device_level="device_name",
exposure=ex,
covariates="region")
# Step 4 - Time Series
ts <- time_series(
da,
deviceevents=de,
exposure=ex)
You may:
- Save standardized data (de, ex), analyses (da), and time series (ts) for documentation
- Review analyses using summary() and define_analyses_dataframe()
- plot() your time series (plotting options)
- Run trending algorithms (mdsstat package)
summary(da)
#> $`Analyses Timestamp`
#> [1] "2019-07-04 13:18:43 PDT"
#>
#> $`Analyses Counts`
#> Total Analyses Analyses with Exposure Device Levels
#> 27 27 6
#> Event Levels Covariates
#> 1 2
#>
#> $`Date Ranges`
#> Data Start End
#> 1 Device-Event 2017-01-01 2017-12-01
#> 2 Exposure 2017-01-01 2017-12-01
#> 3 Both 2017-01-01 2017-12-01
dadf <- define_analyses_dataframe(da)
head(dadf, 3)
| id | device_level_source | device_level | device_1up_source | device_1up | event_level_source | event_level | covariate | covariate_level | invivo | date_range_de_start | date_range_de_end | exp_device_level | exp_covariate_level | date_range_exposure_start | date_range_exposure_end | date_range_de_exp_start | date_range_de_exp_end |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | device_name | Bone Cement | device_class | 2 | event_type | All | region | Central | FALSE | 2017-01-01 | 2017-12-01 | Bone Cement | Central | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
| 2 | device_name | Bone Cement | device_class | 2 | event_type | All | region | West | FALSE | 2017-01-01 | 2017-12-01 | Bone Cement | West | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
| 3 | device_name | Bone Cement | device_class | 2 | event_type | All | region | East | FALSE | 2017-01-01 | 2017-12-01 | Bone Cement | East | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
deviceevent() to Standardize Device-Event Data
Basic Usage
head(de, 3)
| key | time | device_1 | device_2 | event_1 | event_2 |
|---|---|---|---|---|---|
| 1 | 2017-01-03 | Bone Cement | 2 | Malfunction | Orthopedic |
| 2 | 2017-01-03 | Bone Cement | 2 | Malfunction | Orthopedic |
| 3 | 2017-01-03 | Bone Cement | 2 | Malfunction | Orthopedic |
Advanced Usage
de <- deviceevent(
maude,
time="date_received",
device_hierarchy=c("device_name", "device_class"),
event_hierarchy=c("event_type", "medical_specialty_description"),
key="report_number",
covariates="region",
descriptors="_all_")
head(de, 3)
| key | time | device_1 | device_2 | event_1 | event_2 | region | product_problem_flag | adverse_event_flag | report_source_code | lot_number | model_number | manufacturer_d_name | manufacturer_d_country | brand_name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0002249697-2017-00023 | 2017-01-03 | Bone Cement | 2 | Malfunction | Orthopedic | Central | Y | N | Manufacturer report | MHX076 | | STRYKER ORTHOPAEDICS-MAHWAH | US | SIMPLEX P - US TOBRA FD 10-PK |
| 0002249697-2017-00028 | 2017-01-03 | Bone Cement | 2 | Malfunction | Orthopedic | West | Y | N | Manufacturer report | MHX080 | | STRYKER ORTHOPAEDICS-MAHWAH | US | SIMPLEX P - US TOBRA FD 10-PK |
| 0002249697-2017-00025 | 2017-01-03 | Bone Cement | 2 | Malfunction | Orthopedic | Central | Y | N | Manufacturer report | MHX076 | | STRYKER ORTHOPAEDICS-MAHWAH | US | SIMPLEX P - US TOBRA FD 10-PK |
data_frame
time: Date format.
device_hierarchy: mds remembers this hierarchy and allows trending at multiple levels as you specify.
event_hierarchy: see the descriptors argument. The hierarchical concept reflects how events are often nested into progressively more general groups. Set the first variable as the lowest event level that you would like to trend at. mds remembers this hierarchy and allows trending at multiple levels as you specify. If your data does not have an event variable, you will need to create a dummy variable.
key: the key variable in data_frame. If your data pipeline carries over a key variable, it is recommended to specify it here. The key allows downstream aggregated analysis to "look up" individual constituent events.
covariates: covariates="Region" will allow analysis of regions within device. These variables should be categorical in nature.
descriptors
implant_days
exposure() to Standardize Exposure Data
Exposure data is meant to support device-event data. As such, the general expectation is that variable values match between exposure and device-event data. For example, 10 exposures for ev3 Solitaire in France will be matched exactly to ev3 Solitaire events in France, and not to events for EV3 SOLITAIRE in FRANCE.
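Because matching is exact, it can help to normalize labels in both datasets before calling deviceevent() and exposure(). The following is a minimal base R sketch of that idea, not part of mds: the helper normalize_label() and the toy data frames are hypothetical, and the right normalization for real data may differ.

```r
# Hypothetical helper (not part of mds): trim whitespace and upper-case labels
normalize_label <- function(x) toupper(trimws(as.character(x)))

# Toy data illustrating the mismatch described above
events <- data.frame(
  device  = c("EV3 SOLITAIRE", "ev3 Solitaire"),
  country = c("FRANCE", "France"),
  stringsAsFactors = FALSE
)
exposures <- data.frame(
  device  = "ev3 Solitaire",
  country = "France",
  stringsAsFactors = FALSE
)

# Exact matching before normalization: only the second event row matches
matched_before <- events$device %in% exposures$device &
  events$country %in% exposures$country

# Apply the same normalization to both datasets
for (col in c("device", "country")) {
  events[[col]]    <- normalize_label(events[[col]])
  exposures[[col]] <- normalize_label(exposures[[col]])
}

# Exact matching after normalization: both event rows match
matched_after <- events$device %in% exposures$device &
  events$country %in% exposures$country
```

Applying one shared helper to both data frames keeps the two sources consistent, which matters more than the particular normalization chosen.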
Basic Usage
head(ex, 3)
| key | time | count | device_1 |
|---|---|---|---|
| 1 | 2017-01-01 | 1 | Arthroscope |
| 2 | 2017-02-01 | 1 | Arthroscope |
| 3 | 2017-03-01 | 1 | Arthroscope |
Advanced Usage
ex <- exposure(
sales,
time="sales_month",
device_hierarchy="device_name",
match_levels="region",
count="sales_volume")
head(ex, 3)
| key | time | count | device_1 | region |
|---|---|---|---|---|
| 1 | 2017-01-01 | 83 | Arthroscope | Central |
| 2 | 2017-02-01 | 119 | Arthroscope | Central |
| 3 | 2017-03-01 | 112 | Arthroscope | Central |
Note: Although not required, count will commonly be used as well.
data_frame
time: Date format. If exposure will be used, it is critical to have sufficient time granularity. For example, if analysis will be done monthly, exposure data must be no less granular than monthly. mds does not make assumptions about filling in holes in time!
device_hierarchy: see the deviceevent() device_hierarchy parameter.
event_hierarchy: see the deviceevent() event_hierarchy parameter. Exposure at an event level is not common.
count
key: the key variable in data_frame. If your data pipeline carries over a key variable, it is recommended to specify it here. The key allows downstream aggregated analysis to "look up" individual constituent exposure records.
match_levels
define_analyses() to Enumerate Analysis Combinations
After standardizing device-event data using deviceevent() and, optionally, exposure data using exposure(), the next step is to discover what types of analyses are possible. This is separated from actually doing the analysis (counting, calculations, statistics, etc.) because:
Basic Usage
Note that define_analyses() returns a list of individual analyses. Each individual analysis contains a set of instructions. You can view an analysis by submitting da[[1]], da[[2]], etc., but a less cumbersome overview is possible using summary() and define_analyses_dataframe().
summary(da)
#> $`Analyses Timestamp`
#> [1] "2019-07-04 13:18:46 PDT"
#>
#> $`Analyses Counts`
#> Total Analyses Analyses with Exposure Device Levels
#> 7 0 6
#> Event Levels Covariates
#> 1 1
#>
#> $`Date Ranges`
#> Data Start End
#> 1 Device-Event 2017-01-01 2017-12-01
#> 2 Exposure <NA> <NA>
#> 3 Both 2017-01-01 2017-12-01
head(define_analyses_dataframe(da), 3)
| id | device_level_source | device_level | device_1up_source | device_1up | event_level_source | event_level | covariate | covariate_level | invivo | date_range_de_start | date_range_de_end | date_range_de_exp_start | date_range_de_exp_end |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | device_name | Bone Cement | device_class | 2 | event_type | All | Data | All | FALSE | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
| 2 | device_name | Bone Cement, Antibiotic | device_class | 2 | event_type | All | Data | All | FALSE | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
| 3 | device_name | Cement, Bone, Vertebroplasty | device_class | 2 | event_type | All | Data | All | FALSE | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
Advanced Usage
summary(da)
#> $`Analyses Timestamp`
#> [1] "2019-07-04 13:18:46 PDT"
#>
#> $`Analyses Counts`
#> Total Analyses Analyses with Exposure Device Levels
#> 27 27 6
#> Event Levels Covariates
#> 1 2
#>
#> $`Date Ranges`
#> Data Start End
#> 1 Device-Event 2017-01-01 2017-12-01
#> 2 Exposure 2017-01-01 2017-12-01
#> 3 Both 2017-01-01 2017-12-01
head(define_analyses_dataframe(da), 3)
| id | device_level_source | device_level | device_1up_source | device_1up | event_level_source | event_level | covariate | covariate_level | invivo | date_range_de_start | date_range_de_end | exp_device_level | exp_covariate_level | date_range_exposure_start | date_range_exposure_end | date_range_de_exp_start | date_range_de_exp_end |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | device_name | Bone Cement | device_class | 2 | event_type | All | region | Central | FALSE | 2017-01-01 | 2017-12-01 | Bone Cement | Central | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
| 2 | device_name | Bone Cement | device_class | 2 | event_type | All | region | West | FALSE | 2017-01-01 | 2017-12-01 | Bone Cement | West | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
| 3 | device_name | Bone Cement | device_class | 2 | event_type | All | region | East | FALSE | 2017-01-01 | 2017-12-01 | Bone Cement | East | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-12-01 |
deviceevents: device-event data frame (class() should contain "mds_de")
device_level: see attributes(de)$device_hierarchy.
event_level: see attributes(de)$event_hierarchy.
exposure: exposure data frame (class() should contain "mds_e")
date_level and date_level_n: "months" and 1 analyzes by month. Other examples include "months" and 12 for yearly, or "days" and 7 for weekly.
covariates: c("region") analyzes by each level of region within device.
times_to_calc: see date_level and date_level_n.
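To make the date_level and date_level_n pairing concrete, the sketch below buckets a few toy dates using base R's cut(). This is only an illustration of how a unit and a multiplier define a time period; it is an assumption that the analogy holds, and it is not a description of how mds computes periods internally. The helper make_breaks() and the dates are hypothetical.

```r
# Toy event dates (hypothetical)
event_dates <- as.Date(c("2017-01-03", "2017-01-20", "2017-02-11", "2017-03-05"))

# Hypothetical helper: turn a unit and a multiplier into a cut() break
# specification such as "1 months" or "7 days"
make_breaks <- function(date_level, date_level_n) {
  paste(date_level_n, date_level)
}

# "months" and 1: one bucket per calendar month
monthly <- cut(event_dates, breaks = make_breaks("months", 1))

# "months" and 12: one bucket per 12-month span
yearly <- cut(event_dates, breaks = make_breaks("months", 12))

# "days" and 7: one bucket per 7-day span, starting at the earliest date
weekly <- cut(event_dates, breaks = make_breaks("days", 7))

# Event counts per monthly bucket
monthly_counts <- table(droplevels(monthly))
```

Printing monthly_counts shows how many toy events fall in each month, which is the same kind of count that a monthly analysis aggregates.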
It is always assumed that analyses at aggregated levels are desired (such as analysis of all events for a given device, or analysis of all events across all devices).
Aggregated level analysis is easily recognized by the "All" and "Data" values in device_level, event_level, covariate, and covariate_level.
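As a small illustration, aggregated analyses can be picked out of a data frame shaped like the output of define_analyses_dataframe() by filtering on those values. The sketch below uses a hand-built stand-in with only a few of the columns; the rows are hypothetical and chosen to mimic the tables in this vignette.

```r
# Toy stand-in for define_analyses_dataframe(da); not real package output
dadf_toy <- data.frame(
  id              = 1:4,
  device_level    = c("Bone Cement", "Bone Cement", "Bone Cement", "All"),
  event_level     = "All",
  covariate       = c("region", "region", "Data", "Data"),
  covariate_level = c("Central", "West", "All", "All"),
  stringsAsFactors = FALSE
)

# Analyses aggregated over the covariate: covariate is "Data" and level is "All"
is_aggregated <- dadf_toy$covariate == "Data" & dadf_toy$covariate_level == "All"
aggregated <- dadf_toy[is_aggregated, ]

# Analyses at a specific covariate level (here, individual regions)
by_region <- dadf_toy[!is_aggregated, ]
```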
| | id | device_level_source | device_level | device_1up_source | device_1up | event_level_source | event_level | covariate | covariate_level | invivo | date_range_de_start | date_range_de_end | exp_device_level | exp_covariate_level | date_range_exposure_start | date_range_exposure_end | date_range_de_exp_start | date_range_de_exp_end |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 11 | 11 | device_name | Cement, Bone, Vertebroplasty | device_class | 2 | event_type | All | region | East | FALSE | 2017-01-01 | 2017-08-01 | Cement, Bone, Vertebroplasty | East | 2017-01-01 | 2017-12-01 | 2017-01-01 | 2017-08-01 |
| 12 | 12 | device_name | Cement, Bone, Vertebroplasty | device_class | 2 | event_type | All | region | Central | FALSE | 2017-02-01 | 2017-12-01 | Cement, Bone, Vertebroplasty | Central | 2017-01-01 | 2017-12-01 | 2017-02-01 | 2017-12-01 |
| NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
There are several options:
- Subset the list of analyses (e.g., da[c(1:5, 24:27)])
- Rerun define_analyses() with different parameter settings
- Edit an individual analysis directly (e.g., da[[1]]$date_range_exposure['start'] <- as.Date("2016-10-01"))
time_series() to Generate Counts, Rates, and More
Once analyses have been defined using define_analyses(), the analysis instructions can be executed using time_series(), which returns the following by defined time period:
- Event IDs (the key parameter from deviceevent()) for lookup of individual event records.
- Exposure IDs (the key parameter from exposure()) for lookup of individual exposure records.
Basic Usage
Note that time_series() returns, in a list, one time series data frame for every analysis. You can select a time series by submitting ts[[1]], ts[[2]], etc.
head(ts[[1]], 3)
| time | nA | ids |
|---|---|---|
| 17167 | 13 | 0002249697-2017-00023 |
| 17198 | 7 | 0002249697-2017-00488 |
| 17226 | 5 | 0002249697-2017-00755 |
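The time values in the table above are printed as plain numbers. Assuming they are Date values shown as days since 1970-01-01 (R's internal representation of Date), they can be converted back to readable dates with base R:

```r
# Numeric time values copied from the table above
time_num <- c(17167, 17198, 17226)

# Interpret them as days since the Unix epoch (an assumption about the printout)
time_dates <- as.Date(time_num, origin = "1970-01-01")

format(time_dates)
```

Under that assumption the three rows correspond to the first of January, February, and March 2017, which agrees with the monthly date ranges reported by summary(da).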
Advanced Usage
head(ts[[1]], 3)
| time | nA | ids | exposure | ids_exposure |
|---|---|---|---|---|
| 17167 | 13 | 0002249697-2017-00023 | 8597 | 37 |
| 17198 | 7 | 0002249697-2017-00488 | 5115 | 38 |
| 17226 | 5 | 0002249697-2017-00755 | 10191 | 39 |
analysis: a defined analysis (class() should contain "mds_da") or a list of defined analyses.
deviceevents: device-event data frame (class() contains "mds_de"). It is typically the same data frame used to generate the analysis, but can be another "mds_de" data frame, such as a cut of the data at a different time. Note that if, say, an older dataset is being used, the analysis date ranges must correspond.
exposure: exposure data frame (class() contains "mds_e"). It is typically the same data frame used to generate the analysis. Like deviceevents, another data frame may be used, but the analysis instructions must correspond.
use_hierarchy: see ?time_series.mds_da for more details.
It is not uncommon to adjust event and exposure counts, such as with applications of rolling or moving averages. These adjustments should be applied after generating time series data frames from time_series().
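A minimal sketch of such an adjustment follows, using a hand-built data frame in place of a real time_series() result. The first three nA values are taken from the example output earlier in this vignette; the remaining rows, the column name nA_ma3, and the choice of a 3-period trailing window are illustrative assumptions.

```r
# Toy stand-in for one time series data frame, e.g. ts[[1]]
ts_toy <- data.frame(
  time = as.Date(c("2017-01-01", "2017-02-01", "2017-03-01",
                   "2017-04-01", "2017-05-01")),
  nA   = c(13, 7, 5, 9, 11)
)

# Trailing 3-period moving average of the event counts.
# sides = 1 uses only the current and previous periods, so the first
# two periods have no complete window and are NA.
ts_toy$nA_ma3 <- as.numeric(stats::filter(ts_toy$nA, rep(1 / 3, 3), sides = 1))
```

Keeping the smoothed values in a new column leaves the original counts intact for auditing.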
plot()ing a Time Series
Plotting an individual time series generated by time_series() is simple. Call plot() on the time series object:
plot(ts[[1]])
There are a few custom parameters, including:
mode: "nA" (representing the device-event of interest), "exposure", and "rate" (simply "nA"/"exposure"). Less common are "nB", "nC", and "nD", representing the cell counts of the disproportionality analysis (DPA) contingency table.
xlab, ylab, main: plot() behavior. By default, axis and title labels are inferred directly from the time series.
All other parameters are from plot.default().
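For reference, the "rate" mode described above is the event count divided by the exposure. The sketch below reproduces that arithmetic in base R using the first three rows of the advanced time series output shown earlier; the data frame is built by hand and the per-1,000 scaling is an illustrative choice, not something plot() is known to apply.

```r
# Values copied from the advanced time series example output
ts_toy <- data.frame(
  nA       = c(13, 7, 5),
  exposure = c(8597, 5115, 10191)
)

# Rate as described for mode = "rate": nA divided by exposure
ts_toy$rate <- ts_toy$nA / ts_toy$exposure

# Optional rescaling for readability: events per 1,000 units of exposure
ts_toy$rate_per_1000 <- 1000 * ts_toy$rate
```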