| Title: | Recodes Sex/Gender Descriptions into a Standard Set |
| Version: | 0.1.1 |
| Description: | Provides dictionary-based tools for recoding free-text gender responses into consistent categories while preserving gender diversity where possible. The package standardises spelling, capitalization, whitespace, and common variants through curated named character-vector dictionaries, supports either detailed or collapsed output categories, and can retain original unmatched responses for manual review. It also includes helpers for creating custom dictionaries from approximate string matches and a local interactive application for recoding uploaded data files. |
| Depends: | R (≥ 3.0.0) |
| Maintainer: | Yaoxiang Li <liyaoxiang@outlook.com> |
| License: | GPL-2 |
| Encoding: | UTF-8 |
| LazyData: | true |
| RoxygenNote: | 7.3.3 |
| Suggests: | knitr, rmarkdown, bs4Dash, haven, shiny |
| VignetteBuilder: | knitr |
| URL: | https://github.com/ropensci/gendercoder |
| BugReports: | https://github.com/ropensci/gendercoder/issues |
| NeedsCompilation: | no |
| Packaged: | 2026-05-12 23:45:55 UTC; Bach |
| Author: | Yaoxiang Li |
| Repository: | CRAN |
| Date/Publication: | 2026-05-18 18:10:16 UTC |
gendercoder: A Package for Recoding Free-Text Gender Data
Description
Provides dictionaries and recode_gender() to allow for easy automatic coding of common variations in free-text responses to the question "What is your gender?"
Author(s)
Maintainer: Yaoxiang Li liyaoxiang@outlook.com (ORCID)
Authors:
Jennifer Beaudry jbeaudry@swin.edu.au (ORCID)
Emily Kothe emily.kothe@deakin.edu.au (ORCID)
Felix Singleton Thorn fsingletonthorn@gmail.com (ORCID)
Rhydwyn McGuire rhydwyn@rhydwyn.net
Nicholas Tierney nicholas.tierney@gmail.com (ORCID)
Mathew Ling mathewtyling@gmail.com (ORCID)
Other contributors:
Julia Silge (Julia reviewed the package (v. 0.0.0.9000) for rOpenSci, see <https://github.com/ropensci/software-review/issues/435>) [reviewer]
Elin Waring (Elin reviewed the package (v. 0.0.0.9000) for rOpenSci, see <https://github.com/ropensci/software-review/issues/435>) [reviewer]
See Also
Useful links:
Report bugs at https://github.com/ropensci/gendercoder/issues
fewlevels_en
Description
An English dictionary for recode_gender() that recodes responses into fewer levels.
Create a custom dictionary from fuzzy matches
Description
gender_create_dictionary suggests dictionary entries for gender
responses that are not already matched exactly. The returned named character
vector is intended to be reviewed before it is combined with a built-in
dictionary and passed to recode_gender().
Usage
gender_create_dictionary(
  gender,
  dictionary = gendercoder::manylevels_en,
  max_distance = 1
)
Arguments
| gender | a character vector of gender responses for recoding |
| dictionary | a character vector whose names are known gender responses and whose values are replacement values |
| max_distance | maximum edit distance allowed for a suggested match |
Value
a named character vector of suggested replacement values
Examples
suggested <- gender_create_dictionary(
  c("maile", "unknown"),
  dictionary = manylevels_en,
  max_distance = 1
)
suggested
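The idea of suggesting entries from approximate matches can be illustrated in base R. The following is a minimal sketch, not the package's implementation: `toy_dictionary` and its values are hypothetical, and the use of `utils::adist()` for edit distance is an assumption about one reasonable way to do the matching.

```r
# Hypothetical toy dictionary: names are known responses, values are replacements.
toy_dictionary <- c(male = "man", female = "woman", enby = "non-binary")

responses <- c("maile", "femal", "unknown")
max_distance <- 1

# Edit distance between every response (rows) and every known name (columns).
distances <- utils::adist(responses, names(toy_dictionary))

# For each response, find the closest known name and keep it only if it is
# within max_distance edits.
closest <- apply(distances, 1, which.min)
within_range <- distances[cbind(seq_along(responses), closest)] <= max_distance

# Build a named character vector: names are the raw responses,
# values are the suggested replacements.
suggested <- toy_dictionary[closest[within_range]]
names(suggested) <- responses[within_range]
suggested

# After manual review, the suggestions can be appended to the dictionary.
combined <- c(toy_dictionary, suggested)
```

Here "unknown" is more than one edit away from every known name, so no entry is suggested for it. Reviewing the suggestions before combining them matters because a small edit distance does not guarantee that the respondent meant the matched category.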
Launch the gendercoder Shiny app
Description
Code data interactively in a Shiny app that runs locally in RStudio or a web browser using a bs4Dash interface. The app supports CSV, Stata, SPSS, RDS, and R data files. Stata and SPSS files require the optional haven package.
Usage
gendercoder_app(...)
Arguments
| ... | arguments to pass to |
Value
Called for its side effect of launching a Shiny app.
Examples
if (interactive()) {
  gendercoder_app()
}
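The note that Stata and SPSS files need the optional haven package can be illustrated with a small reader that dispatches on file extension. This is a sketch under stated assumptions: `read_uploaded_file()` is a hypothetical helper and not part of gendercoder, the extension-to-reader mapping is a guess at one sensible design, and R data files (`.RData`) are left out for brevity.

```r
# Hypothetical helper, not part of gendercoder: pick a reader from the extension.
read_uploaded_file <- function(path) {
  ext <- tolower(tools::file_ext(path))

  # haven is only needed for Stata (.dta) and SPSS (.sav) files.
  if (ext %in% c("dta", "sav") && !requireNamespace("haven", quietly = TRUE)) {
    stop("Reading Stata or SPSS files requires the optional 'haven' package.")
  }

  switch(ext,
    csv = utils::read.csv(path, stringsAsFactors = FALSE),
    rds = readRDS(path),
    dta = haven::read_dta(path),
    sav = haven::read_sav(path),
    stop("Unsupported file type: ", ext)
  )
}

# Usage with a temporary CSV file.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(gender = c("male", "enby")), tmp, row.names = FALSE)
uploaded <- read_uploaded_file(tmp)
```

Checking for haven with `requireNamespace()` only when it is needed keeps the package usable for CSV and RDS files on systems where haven is not installed.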
manylevels_en
Description
An English dictionary for recode_gender() that recodes responses into many levels.
recode_gender
Description
recode_gender matches uncleaned gender responses to a cleaned list using
a built-in or custom dictionary.
Usage
recode_gender(
  gender,
  dictionary = gendercoder::manylevels_en,
  retain_unmatched = FALSE
)
Arguments
| gender | a character vector of gender responses for recoding |
| dictionary | a list that contains gender responses and their replacement values. A built-in dictionary |
| retain_unmatched | logical indicating whether gender responses that are not found in dictionary should be filled with the uncleaned values during recoding |
Value
a character vector of recoded genders
Examples
df <- data.frame(
  stringsAsFactors = FALSE,
  gender = c("male", "MALE", "mle", "I am male", "femail", "female", "enby"),
  age = c(34L, 37L, 77L, 52L, 68L, 67L, 83L)
)
df$recoded_gender <- recode_gender(
  df$gender,
  dictionary = manylevels_en,
  retain_unmatched = TRUE
)
df
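The dictionary lookup and the effect of retain_unmatched can be sketched in base R with a named character vector. This is an illustration only, not the package's code: `toy_dictionary` is hypothetical, the normalisation with `tolower()` and `trimws()` is an assumption, and treating unmatched responses as `NA` when they are not retained is also an assumption about the default behaviour.

```r
# Hypothetical toy dictionary: names are known responses, values are replacements.
toy_dictionary <- c(male = "man", female = "woman")

responses <- c("male", " FEMALE ", "enby")

# Normalise case and surrounding whitespace before the lookup (assumed step).
cleaned <- tolower(trimws(responses))

# Indexing a named vector by name gives NA for names that are not present.
recoded <- unname(toy_dictionary[cleaned])

# Analogue of retain_unmatched = TRUE: fall back to the original response.
retained <- ifelse(is.na(recoded), responses, recoded)

recoded
retained
```

Keeping the original text for unmatched responses makes them easy to find for manual review, while leaving them as missing values is safer when the recoded column feeds directly into an analysis.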
sample
Description
A sample data.frame of free-text gender in English for testing and demonstration