This document uses theme rmarkdown::html_vignette.
Below are examples using recommended styles for Rmarkdown rendering. Available styles in summarytools are the same as pander’s.
For freq(), descr() (and ctable(), although with caveats), rmarkdown style is recommended. For dfSummary(), grid is recommended.
knitr option results = 'asis' must be specified to get good results. This can be done globally via opts_chunk$set(results='asis'), or in the individual chunks.
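As a minimal sketch of the global approach, a setup chunk near the top of the .Rmd file could contain the call below. It assumes only that knitr is installed, which is the case whenever an R Markdown document is knitted.

```r
# Setup chunk, placed near the top of the .Rmd file.
# Sets results = 'asis' for every chunk that follows, so the markdown
# emitted by summarytools reaches the document without being wrapped
# in a verbatim output block.
knitr::opts_chunk$set(results = 'asis')
```

To set the option for a single chunk instead, add results='asis' to that chunk's header, for example {r, results='asis'}.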
The following summarytools global options have been set:
#st_options('omit.headings', TRUE)
st_options('bootstrap.css', FALSE)
st_options('footnote', NA)

To generate tables using summarytools’ own HTML rendering, the .Rmd document’s configuration part (YAML) must point to the package’s summarytools.css file. This can be achieved in several ways; the current vignette uses this configuration:
output:
  rmarkdown::html_vignette:
    css:
    - !expr system.file("rmarkdown/templates/html_vignette/resources/vignette.css", package = "rmarkdown")
    - !expr system.file("includes/stylesheets/summarytools.css", package = "summarytools")
An alternative is to point to the directory on your system containing summarytools.css:
---
title: "RMarkdown using summarytools"
output:
  html_document:
    css: C:/R/win-library/3.4/summarytools/includes/stylesheets/summarytools.css
---
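To find that directory on a given machine, the system.file() call from the first configuration can be run in the R console. This is a small sketch that assumes summarytools is installed and that the stylesheet sits under includes/stylesheets as in the configuration above; the layout may differ in other package versions.

```r
# Locate the stylesheet shipped with the installed summarytools package.
css_path <- system.file("includes/stylesheets/summarytools.css",
                        package = "summarytools")

# system.file() returns an empty string when the package or file is not found.
if (nzchar(css_path)) {
  cat(css_path, "\n")
} else {
  message("summarytools.css not found; is summarytools installed?")
}
```

The printed path can then be pasted into the css: field of the YAML header.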
Starting with freq(), we’ll review the recommended methods and styles to get going with summarytools in Rmarkdown documents.
freq() is best used with style = 'rmarkdown'; HTML rendering is also possible.
freq(tobacco$gender, style = 'rmarkdown')

Variable: tobacco$gender
Type: Factor (unordered)

| | Freq | % Valid | % Valid Cum. | % Total | % Total Cum. |
|---|---|---|---|---|---|
| F | 489 | 50.00 | 50.00 | 48.90 | 48.90 |
| M | 489 | 50.00 | 100.00 | 48.90 | 97.80 |
| <NA> | 22 | | | 2.20 | 100.00 |
| Total | 1000 | 100.00 | 100.00 | 100.00 | 100.00 |
print(freq(tobacco$gender), method = 'render')

| gender | Freq | Valid % | Valid % Cumul | Total % | Total % Cumul |
|---|---|---|---|---|---|
| F | 489 | 50.00 | 50.00 | 48.90 | 48.90 |
| M | 489 | 50.00 | 100.00 | 48.90 | 97.80 |
| <NA> | 22 | | | 2.20 | 100.00 |
| Total | 1000 | 100.00 | 100.00 | 100.00 | 100.00 |
If you find the table too large, you can use table.classes = 'st-small' - an example is provided further below.
Tables whose headings span two rows are not yet fully supported in markdown, but the result is getting close to acceptable.
ctable(tobacco$gender, tobacco$smoker, style = 'rmarkdown')

Variables: gender * smoker
Data Frame: tobacco

| | smoker | Yes | No | Total |
|---|---|---|---|---|
| gender | | | | |
| F | | 147 (30.06%) | 342 (69.94%) | 489 (100.00%) |
| M | | 143 (29.24%) | 346 (70.76%) | 489 (100.00%) |
| <NA> | | 8 (36.36%) | 14 (63.64%) | 22 (100.00%) |
| Total | | 298 (29.80%) | 702 (70.20%) | 1000 (100.00%) |
For best results with ctable(), use HTML rendering (method = 'render').
print(ctable(tobacco$gender, tobacco$smoker), method = 'render')

| gender | smoker: Yes | smoker: No | Total |
|---|---|---|---|
| F | 147 (30.06%) | 342 (69.94%) | 489 (100.00%) |
| M | 143 (29.24%) | 346 (70.76%) | 489 (100.00%) |
| <NA> | 8 (36.36%) | 14 (63.64%) | 22 (100.00%) |
| Total | 298 (29.80%) | 702 (70.20%) | 1000 (100.00%) |
descr() is also best used with style = 'rmarkdown', and HTML rendering is also supported.
descr(tobacco, style = 'rmarkdown')

Non-numerical variable(s) ignored: gender, age.gr, smoker, diseased, disease
Data Frame: tobacco
N: 1000

| | age | BMI | cigs.per.day | samp.wgts |
|---|---|---|---|---|
| Mean | 49.60 | 25.73 | 6.78 | 1.00 |
| Std.Dev | 18.29 | 4.49 | 11.88 | 0.08 |
| Min | 18.00 | 8.83 | 0.00 | 0.86 |
| Q1 | 34.00 | 22.93 | 0.00 | 0.86 |
| Median | 50.00 | 25.62 | 0.00 | 1.04 |
| Q3 | 66.00 | 28.65 | 11.00 | 1.05 |
| Max | 80.00 | 39.44 | 40.00 | 1.06 |
| MAD | 23.72 | 4.18 | 0.00 | 0.01 |
| IQR | 32.00 | 5.72 | 11.00 | 0.19 |
| CV | 2.71 | 5.73 | 0.57 | 11.92 |
| Skewness | -0.04 | 0.02 | 1.54 | -1.04 |
| SE.Skewness | 0.08 | 0.08 | 0.08 | 0.08 |
| Kurtosis | -1.26 | 0.26 | 0.90 | -0.90 |
| N.Valid | 975.00 | 974.00 | 965.00 | 1000.00 |
| Pct.Valid | 97.50 | 97.40 | 96.50 | 100.00 |
We’ll use table.classes = 'st-small' to show how it affects the table’s size (compare to the freq() table rendered earlier).
print(descr(tobacco), method = 'render', table.classes = 'st-small')

Non-numerical variable(s) ignored: gender, age.gr, smoker, diseased, disease

| | age | BMI | cigs.per.day | samp.wgts |
|---|---|---|---|---|
| Mean | 49.60 | 25.73 | 6.78 | 1.00 |
| Std.Dev | 18.29 | 4.49 | 11.88 | 0.08 |
| Min | 18.00 | 8.83 | 0.00 | 0.86 |
| Q1 | 34.00 | 22.93 | 0.00 | 0.86 |
| Median | 50.00 | 25.62 | 0.00 | 1.04 |
| Q3 | 66.00 | 28.65 | 11.00 | 1.05 |
| Max | 80.00 | 39.44 | 40.00 | 1.06 |
| MAD | 23.72 | 4.18 | 0.00 | 0.01 |
| IQR | 32.00 | 5.72 | 11.00 | 0.19 |
| CV | 2.71 | 5.73 | 0.57 | 11.92 |
| Skewness | -0.04 | 0.02 | 1.54 | -1.04 |
| SE.Skewness | 0.08 | 0.08 | 0.08 | 0.08 |
| Kurtosis | -1.26 | 0.26 | 0.90 | -0.90 |
| N.Valid | 975 | 974 | 965 | 1000 |
| Pct.Valid | 97.50 | 97.40 | 96.50 | 100.00 |
For dfSummary(), style = 'grid' gives good results, although the histograms are not shown. This is due to an unresolved issue that we are working to fix. Don’t forget to specify plain.ascii = FALSE, or you won’t get good results.
dfSummary(tobacco, style = 'grid', plain.ascii = FALSE)

tobacco
N: 1000

| No | Variable | Stats / Values | Freqs (% of Valid) | Text Graph | Valid | Missing |
|---|---|---|---|---|---|---|
| 1 | gender | 1. F | 489 (50.0%) | IIIIIIIIIIIIIIII | 978 | 22 |
| 2 | age | mean (sd) : 49.6 (18.29) | 63 distinct val. | | 975 | 25 |
| 3 | age.gr | 1. 18-34 | 258 (26.5%) | IIIIIIIIIIIII | 975 | 25 |
| 4 | BMI | mean (sd) : 25.73 (4.49) | 974 distinct val. | | 974 | 26 |
| 5 | smoker | 1. Yes | 298 (29.8%) | IIIIII | 1000 | 0 |
| 6 | cigs.per.day | mean (sd) : 6.78 (11.88) | 37 distinct val. | | 965 | 35 |
| 7 | diseased | 1. Yes | 224 (22.4%) | IIII | 1000 | 0 |
| 8 | disease | 1. Hypertension | 36 (16.2%) | IIIIIIIIIIIIIIII | 222 | 778 |
| 9 | samp.wgts | mean (sd) : 1 (0.08) | 0.86!: 267 (26.7%) | IIIIIIIIIIIII IIIIIIIIIIII IIIIIIIIIIIIIIII IIIIIII | 1000 | 0 |
With HTML rendering, the results are not as neat as when generating an HTML report directly from the R interpreter (the transparency of the graphs is lost in translation), but this is still the best method.
print(dfSummary(tobacco, graph.magnif = 0.75), method = 'render')

| No | Variable | Stats / Values | Freqs (% of Valid) | Graph | Valid | Missing |
|---|---|---|---|---|---|---|
| 1 | gender [factor] | 1. F 2. M | 489 (50.0%) 489 (50.0%) | | 978 (97.8%) | 22 (2.2%) |
| 2 | age [numeric] | mean (sd) : 49.6 (18.29) min < med < max : 18 < 50 < 80 IQR (CV) : 32 (0.37) | 63 distinct val. | | 975 (97.5%) | 25 (2.5%) |
| 3 | age.gr [factor] | 1. 18-34 2. 35-50 3. 51-70 4. 71 + | 258 (26.5%) 241 (24.7%) 317 (32.5%) 159 (16.3%) | | 975 (97.5%) | 25 (2.5%) |
| 4 | BMI [numeric] | mean (sd) : 25.73 (4.49) min < med < max : 8.83 < 25.62 < 39.44 IQR (CV) : 5.72 (0.17) | 974 distinct val. | | 974 (97.4%) | 26 (2.6%) |
| 5 | smoker [factor] | 1. Yes 2. No | 298 (29.8%) 702 (70.2%) | | 1000 (100%) | 0 (0%) |
| 6 | cigs.per.day [numeric] | mean (sd) : 6.78 (11.88) min < med < max : 0 < 0 < 40 IQR (CV) : 11 (1.75) | 37 distinct val. | | 965 (96.5%) | 35 (3.5%) |
| 7 | diseased [factor] | 1. Yes 2. No | 224 (22.4%) 776 (77.6%) | | 1000 (100%) | 0 (0%) |
| 8 | disease [character] | 1. Hypertension 2. Cancer 3. Cholesterol 4. Heart 5. Pulmonary 6. Musculoskeletal 7. Diabetes 8. Hearing 9. Digestive 10. Hypotension [ 3 others ] | 36 (16.2%) 34 (15.3%) 21 (9.5%) 20 (9.0%) 20 (9.0%) 19 (8.6%) 14 (6.3%) 14 (6.3%) 12 (5.4%) 11 (5.0%) 21 (9.4%) | | 222 (22.2%) | 778 (77.8%) |
| 9 | samp.wgts [numeric] | mean (sd) : 1 (0.08) min < med < max : 0.86 < 1.04 < 1.06 IQR (CV) : 0.19 (0.08) | 0.86! : 267 (26.7%) 1.04! : 249 (24.9%) 1.05! : 324 (32.4%) 1.06! : 160 (16.0%) ! rounded | | 1000 (100%) | 0 (0%) |