Package: llmclean
Type: Package
Title: LLM-Assisted Data Cleaning with Multi-Provider Support
Version: 0.1.1
Date: 2026-06-01
Authors@R: c(
    person("Sadikul", "Islam",
           email   = "sadikul.islamiasri@gmail.com",
           role    = c("aut", "cre"),
           comment = c(ORCID = "0000-0003-2924-7122")),
    person("Rajesh", "Kaushal",
           role    = "aut"))
Maintainer: Sadikul Islam <sadikul.islamiasri@gmail.com>
Description: Detects and suggests fixes for semantic inconsistencies in data
    frames by calling large language models (LLMs) through a unified,
    provider-agnostic interface. Supported providers include 'OpenAI'
    ('GPT-4o', 'GPT-4o-mini') <https://platform.openai.com>,
    'Anthropic' ('Claude') <https://www.anthropic.com>,
    'Google' ('Gemini') <https://ai.google.dev>,
    'Groq' (free-tier 'LLaMA' and 'Mixtral') <https://groq.com>,
    and local 'Ollama' models <https://ollama.com>.
    The package targets issues that rule-based tools often miss:
    abbreviation variants, typographic errors, case inconsistencies, and
    malformed values. Results are returned as tidy data frames giving the
    column, row index, detected value, issue type, suggested fix, and
    confidence score. An offline fallback using statistical and
    fuzzy-matching methods
    is provided for use without any application programming interface (API)
    key. Interactive fix application with human review is supported via
    'apply_fixes()'. Methods follow de Jonge and van der Loo (2013)
    <https://cran.r-project.org/doc/contrib/de_Jonge+van_der_Loo-Introduction_to_data_cleaning_with_R.pdf>
    and Chaudhuri et al. (2003) <doi:10.1145/872757.872796>.
License: GPL-3
Depends: R (>= 4.1.0)
Imports: stats, utils, dplyr (>= 1.0.0), rlang (>= 1.0.0)
Suggests: knitr, rmarkdown, testthat (>= 3.0.0), httr2 (>= 1.0.0),
        jsonlite (>= 1.8.0)
VignetteBuilder: knitr
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.3.3
Config/testthat/edition: 3
NeedsCompilation: no
Packaged: 2026-06-09 10:54:34 UTC; acer
Author: Sadikul Islam [aut, cre] (ORCID:
    <https://orcid.org/0000-0003-2924-7122>),
  Rajesh Kaushal [aut]
Repository: CRAN
Date/Publication: 2026-06-09 14:20:02 UTC
