Nettskjemar connects to version 3 of the Nettskjema API, and its main functionality is to download data from a form into R. Once you have created a Nettskjema client and set up your R environment locally, you can start accessing your forms.
The functions that download data also let you turn off the codebook, i.e. return the data with the original questions as column names, but this is not recommended. Data in that format are unpredictable to work with in R, and we cannot guarantee that the functions in this package will behave as expected.
If you use this package, you are therefore strongly advised to turn on the codebook for your form in the Nettskjema portal and to set up a codebook for the entire form.
You can toggle the codebook for a form by going to the Nettskjema portal and entering your form. Then proceed to “Settings” and then “General settings”, and make sure “Codebook activated” is set to “Yes”. Once this is toggled, you will need to set up the codebook, either manually (advised) or by using the pre-filling functionality in Nettskjema.
You can read more about the details around the codebook on the UiO webpages (only available in Norwegian).
The data returned by this package are designed to be tidyverse-compatible. Those who are familiar with the tidyverse should find the data retrieved with this package fairly easy to work with. If you want to learn about the tidyverse and how to use it, there are excellent resources on the Tidyverse webpage.
At the core of this package is the ability to download the submitted answers to a form into a tibble (a variant of a data.frame).
library(nettskjemar)
formid <- 123823
data <- ns_get_data(formid)
data
#> formid $submission_id $created
#> 1 123823 27685292 2023-06-01T20:57:15+02:00
#> freetext radio checkbox.questionnaires
#> 1 some text 1 1
#> checkbox.events checkbox.logs dropdown
#> 1 1 0 4
#> radio_matrix.grants radio_matrix.lecture
#> 1 1 2
#> radio_matrix.email checkbox_matrix.1.IT
#> 1 2 1
#> checkbox_matrix.1.colleague
#> 1 1
#> checkbox_matrix.1.admin checkbox_matrix.1.union
#> 1 0 0
#> checkbox_matrix.1.internet checkbox_matrix.2.IT
#> 1 0 0
#> checkbox_matrix.2.colleague
#> 1 0
#> checkbox_matrix.2.admin checkbox_matrix.2.union
#> 1 1 0
#> checkbox_matrix.2.internet date time
#> 1 0 2023-06-01 12:00
#> datetime number_decimal number_integer
#> 1 2023-06-12T13:33 4.5 77
#> slider attachment_1 attachment_2 $answer_time_ms
#> 1 3 sølvi.png 74630
#> [ reached 'max' / getOption("max.print") -- omitted 2 rows ]
You will notice three columns that are prefixed with $. These columns are added automatically by the Nettskjema backend to your data, and the prefix is used to denote exactly this.
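Because names that start with $ are not syntactic in R, they need to be wrapped in backticks when you refer to them. The following is a minimal sketch on a small hand-made tibble (an assumption standing in for real downloaded data); the "meta_" replacement prefix is an arbitrary choice for illustration, not something the package does.

```r
library(dplyr)

# Toy data (an assumption) that mimics the automatically added `$` columns.
answers <- tibble::tibble(
  `$submission_id` = c(27685292L, 27685302L),
  formid = 123823,
  freetext = c("some text", "another answer"),
  `$answer_time_ms` = c(74630L, 71313L)
)

# Non-syntactic column names must be wrapped in backticks.
meta <- answers |>
  select(`$submission_id`, `$answer_time_ms`)

# Optionally swap the `$` prefix for a syntactic one ("meta_" is arbitrary).
renamed <- answers |>
  rename_with(\(x) sub("^\\$", "meta_", x), starts_with("$"))

names(renamed)
```

Renaming is purely a convenience; keeping the $ prefix preserves the distinction between backend columns and your own questions.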
If you’d like to have labelled data, similar to that of SPSS and Stata, have a look at the vignette on labelled data for more information about that.
Raw answers come in a very different format from the data as seen in the codebook and preview in the Nettskjema portal. The raw data are timestamped per question for each submission. This means the data come in a tall format (many rows per submission) rather than a wide format (one row per submission). The raw data may be useful to those with a keen interest in the time a user spent between each question. The data.frame in this output has one column that indicates which question the row is for, and another that holds the response to that question.
# Fetch raw data
ns_get_data(formid, type = "long")
#> formid formId submissionId answerId elementId
#> 1 123823 123823 27685292 158263133 1641697
#> 2 123823 123823 27685292 158263124 1641698
#> externalElementId textAnswer answerOptionIds
#> 1 freetext some text
#> 2 radio <NA> 3879435
#> externalAnswerOptionIds elementType
#> 1 QUESTION
#> 2 1 RADIO
#> createdDate modifiedDate
#> 1 2023-06-01T20:57:15 2023-06-01T20:57:15
#> 2 2023-06-01T20:57:15 2023-06-01T20:57:15
#> subElementId answerAttachmentId filename mediaType
#> 1 NA NA <NA> <NA>
#> 2 NA NA <NA> <NA>
#> size attachment.answerAttachmentId
#> 1 NA NA
#> 2 NA NA
#> attachment.filename attachment.mediaType
#> 1 <NA> <NA>
#> 2 <NA> <NA>
#> attachment.size
#> 1 NA
#> 2 NA
#> [ reached 'max' / getOption("max.print") -- omitted 44 rows ]
Raw data cannot be labelled, and no other alterations have been made to the data. They come exactly as the API returns them, so you may do what you need with them.
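If you need one row per submission again, the tall format can be reshaped with tidyr. The sketch below uses a small hand-made tibble that borrows a few column names from the output above; the values, the use of NA for empty cells, and the choice to merge textAnswer and externalAnswerOptionIds into a single answer column are all assumptions for illustration, not behaviour of the package.

```r
library(dplyr)
library(tidyr)

# Toy raw data (an assumption): one row per question per submission.
raw <- tibble::tibble(
  submissionId = c(27685292L, 27685292L, 27685302L, 27685302L),
  externalElementId = c("freetext", "radio", "freetext", "radio"),
  textAnswer = c("some text", NA, "another answer", NA),
  externalAnswerOptionIds = c(NA, "1", NA, "-1")
)

wide <- raw |>
  # Use the free-text answer when present, otherwise the selected option code.
  mutate(answer = coalesce(textAnswer, externalAnswerOptionIds)) |>
  select(submissionId, externalElementId, answer) |>
  # One column per question, one row per submission.
  pivot_wider(names_from = externalElementId, values_from = answer)

wide
```

All answers end up as character values here, because free text and option codes share one column before reshaping.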
The Nettskjema survey tool includes the possibility to create a matrix of checkboxes, i.e. giving the respondents the ability to select several options within a question. Checkboxes are returned as binary columns, one column per checkbox, where 1 indicates that the checkbox was ticked.
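Since each checkbox is a 0/1 column, simple summaries work directly on them. As a minimal sketch, the toy tibble below (an assumption that reuses the checkbox column names from the earlier output) is summed column by column to count how many submissions ticked each box.

```r
library(dplyr)

# Toy data (an assumption): one binary column per checkbox.
answers <- tibble::tibble(
  checkbox.questionnaires = c(1L, 0L),
  checkbox.events = c(1L, 0L),
  checkbox.logs = c(0L, 1L)
)

# Sum each binary column to get the number of ticks per checkbox.
ticks <- answers |>
  summarise(across(starts_with("checkbox."), sum))

ticks
```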
We provide some extra functionality to help work with these data. Currently we only support working with the checkbox matrix questions, but are working on a solution for all checkboxes.
ns_get_data(formid) |>
  ns_alter_checkbox(
    to = "list"
  )
#> $submission_id formid $created
#> 1 27685292 123823 2023-06-01T20:57:15+02:00
#> 2 27685302 123823 2023-06-01T20:58:33+02:00
#> freetext radio checkbox.questionnaires
#> 1 some text 1 1
#> 2 another answer -1 0
#> checkbox.events checkbox.logs dropdown
#> 1 1 0 4
#> 2 0 1 9
#> radio_matrix.grants radio_matrix.lecture
#> 1 1 2
#> 2 3 3
#> radio_matrix.email date time
#> 1 2 2023-06-01 12:00
#> 2 1 2023-02-07 14:45
#> datetime number_decimal number_integer
#> 1 2023-06-12T13:33 4.5 77
#> 2 2024-02-15T08:55 2.2 45
#> slider attachment_1 attachment_2 $answer_time_ms
#> 1 3 sølvi.png 74630
#> 2 1 marius.jpeg 71313
#> checkbox_matrix.1 checkbox_matrix.2
#> 1 colleague, IT admin
#> 2 internet admin, union
#> [ reached 'max' / getOption("max.print") -- omitted 1 rows ]
These can be separated into rows if wanted, using tidyverse syntax.
library(tidyr)
library(dplyr)
# As list column
ns_get_data(formid) |>
  ns_alter_checkbox(
    to = "list"
  ) |>
  relocate(checkbox_matrix.1, .after = 2)
#> $submission_id formid checkbox_matrix.1
#> 1 27685292 123823 colleague, IT
#> 2 27685302 123823 internet
#> $created freetext radio
#> 1 2023-06-01T20:57:15+02:00 some text 1
#> 2 2023-06-01T20:58:33+02:00 another answer -1
#> checkbox.questionnaires checkbox.events
#> 1 1 1
#> 2 0 0
#> checkbox.logs dropdown radio_matrix.grants
#> 1 0 4 1
#> 2 1 9 3
#> radio_matrix.lecture radio_matrix.email date
#> 1 2 2 2023-06-01
#> 2 3 1 2023-02-07
#> time datetime number_decimal
#> 1 12:00 2023-06-12T13:33 4.5
#> 2 14:45 2024-02-15T08:55 2.2
#> number_integer slider attachment_1 attachment_2
#> 1 77 3 sølvi.png
#> 2 45 1 marius.jpeg
#> $answer_time_ms checkbox_matrix.2
#> 1 74630 admin
#> 2 71313 admin, union
#> [ reached 'max' / getOption("max.print") -- omitted 1 rows ]
# Turn the list column into rows
ns_get_data(formid) |>
  ns_alter_checkbox(
    to = "list"
  ) |>
  unnest(checkbox_matrix.1) |>
  relocate(checkbox_matrix.1, .after = 2)
#> # A tibble: 5 × 23
#> `$submission_id` formid checkbox_matrix.1
#> <int> <dbl> <chr>
#> 1 27685292 123823 colleague
#> 2 27685292 123823 IT
#> 3 27685302 123823 internet
#> 4 27685319 123823 colleague
#> 5 27685319 123823 internet
#> # ℹ 20 more variables: `$created` <chr>,
#> # freetext <chr>, radio <int>,
#> # checkbox.questionnaires <int>,
#> # checkbox.events <int>, checkbox.logs <int>,
#> # dropdown <int>, radio_matrix.grants <int>,
#> # radio_matrix.lecture <int>,
#> # radio_matrix.email <int>, date <chr>, …
If you are not used to working with list columns, another option is to work with delimited strings in columns. With this approach, you can turn the checkboxes into a character column, where each option is separated by a character of your choosing.
# As delimited string column
ns_get_data(formid) |>
  ns_alter_checkbox(
    to = "character",
    sep = ";"
  ) |>
  relocate(checkbox_matrix.1, .after = 2)
#> $submission_id formid checkbox_matrix.1
#> 1 27685292 123823 colleague;IT
#> 2 27685302 123823 internet
#> $created freetext radio
#> 1 2023-06-01T20:57:15+02:00 some text 1
#> 2 2023-06-01T20:58:33+02:00 another answer -1
#> checkbox.questionnaires checkbox.events
#> 1 1 1
#> 2 0 0
#> checkbox.logs dropdown radio_matrix.grants
#> 1 0 4 1
#> 2 1 9 3
#> radio_matrix.lecture radio_matrix.email date
#> 1 2 2 2023-06-01
#> 2 3 1 2023-02-07
#> time datetime number_decimal
#> 1 12:00 2023-06-12T13:33 4.5
#> 2 14:45 2024-02-15T08:55 2.2
#> number_integer slider attachment_1 attachment_2
#> 1 77 3 sølvi.png
#> 2 45 1 marius.jpeg
#> $answer_time_ms checkbox_matrix.2
#> 1 74630 admin
#> 2 71313 admin;union
#> [ reached 'max' / getOption("max.print") -- omitted 1 rows ]
# Turn the delimited string column into rows
ns_get_data(formid) |>
  ns_alter_checkbox(
    to = "character",
    sep = ";"
  ) |>
  relocate(checkbox_matrix.1, .after = 2) |>
  # Split only on the delimiter chosen above
  separate_rows(checkbox_matrix.1, sep = ";")
#> # A tibble: 5 × 23
#> `$submission_id` formid checkbox_matrix.1
#> <int> <dbl> <chr>
#> 1 27685292 123823 colleague
#> 2 27685292 123823 IT
#> 3 27685302 123823 internet
#> 4 27685319 123823 colleague
#> 5 27685319 123823 internet
#> # ℹ 20 more variables: `$created` <chr>,
#> # freetext <chr>, radio <int>,
#> # checkbox.questionnaires <int>,
#> # checkbox.events <int>, checkbox.logs <int>,
#> # dropdown <int>, radio_matrix.grants <int>,
#> # radio_matrix.lecture <int>,
#> # radio_matrix.email <int>, date <chr>, …
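Once the checkbox answers are in one row per selected option, ordinary dplyr verbs apply. As a minimal sketch, the toy tibble below (an assumption that copies the five rows shown above) is tallied to see how often each option was selected; the column name n_submissions is chosen only for illustration.

```r
library(dplyr)

# Toy data (an assumption) mirroring the one-row-per-option output above.
long <- tibble::tibble(
  `$submission_id` = c(27685292L, 27685292L, 27685302L, 27685319L, 27685319L),
  checkbox_matrix.1 = c("colleague", "IT", "internet", "colleague", "internet")
)

# Count how many submissions selected each option.
option_counts <- long |>
  count(checkbox_matrix.1, name = "n_submissions")

option_counts
```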