Title: Community Estimation in G-Models via CORD
Version: 0.2.0
Maintainer: Xi (Rossi) LUO <xi.rossi.luo@gmail.com>
Author: Xi (Rossi) LUO [aut, cre], Florentina Bunea [aut], Christophe Giraud [aut]
Description: Partitions data points (variables) into communities/clusters, similar to clustering algorithms such as k-means and hierarchical clustering. This package implements a clustering algorithm based on a new metric CORD, defined for high-dimensional parametric or semiparametric distributions. For more details see Bunea et al. (2020), Annals of Statistics <doi:10.1214/18-AOS1794>.
License: GPL-3
URL: https://doi.org/10.1214/18-AOS1794
Encoding: UTF-8
Suggests: pcaPP
Imports: Rcpp
LinkingTo: Rcpp, RcppArmadillo
NeedsCompilation: yes
RoxygenNote: 7.3.3
Packaged: 2025-10-15 22:03:31 UTC; xluo
Repository: CRAN
Date/Publication: 2025-10-15 22:30:08 UTC
Community estimation in G-models via CORD
Description
Partition data points (variables) into clusters/communities. Reference: Bunea et al. (2020), Model assisted variable clustering: Minimax-optimal recovery and algorithms, Annals of Statistics, doi:10.1214/18-AOS1794.
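The metric itself is not spelled out on this page. As a rough illustration only, the sketch below assumes that the CORD distance between variables a and b is the largest absolute difference between their correlations with every other variable c, which is how the cited paper is usually summarized. The helper cord_dist and the toy matrix R are hypothetical and are not part of the package API.

```r
# Hypothetical helper (not part of the package): pairwise CORD-style distance,
# assuming d(a, b) = max over c not in {a, b} of |R[a, c] - R[b, c]|.
# Requires at least 3 variables so that "others" is never empty.
cord_dist <- function(R) {
  p <- ncol(R)
  D <- matrix(0, p, p)
  for (a in seq_len(p - 1)) {
    for (b in (a + 1):p) {
      others <- setdiff(seq_len(p), c(a, b))
      D[a, b] <- D[b, a] <- max(abs(R[a, others] - R[b, others]))
    }
  }
  D
}

# Toy correlation matrix: variables 1-2 and 3-4 form two groups.
R <- matrix(c(1.0, 0.8, 0.1, 0.1,
              0.8, 1.0, 0.1, 0.1,
              0.1, 0.1, 1.0, 0.8,
              0.1, 0.1, 0.8, 1.0), 4, 4)
D <- cord_dist(R)
D
```

Under this assumption, variables in the same group have distance 0 in the toy matrix, while variables in different groups are separated by 0.7, which is the kind of gap a threshold such as tau is meant to detect.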
Usage
cord(
X,
tau = 2 * sqrt(log(ncol(X))/nrow(X)),
kendall = T,
input = c("data", "cor", "dist")
)
Arguments
X
Input data matrix. It should be an n (samples) by p (variables) matrix when input = "data".
tau
Threshold to use at each iteration. A theoretical choice is about 2 * sqrt(log(p)/n), which is the default.
kendall
Whether to compute Kendall's tau correlation matrix from X.
input
Type of input X: one of "data", "cor", or "dist".
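To get a feel for the scale of the default threshold, the expression from the Usage signature can be evaluated directly in base R. The dimensions n = 200 and p = 10 are illustrative values borrowed from the Examples section, not a recommendation.

```r
# Illustrative dimensions: n samples, p variables (assumed values).
n <- 200
p <- 10

# Default threshold from the Usage signature: 2 * sqrt(log(p) / n).
tau_default <- 2 * sqrt(log(p) / n)
tau_default

# The threshold shrinks as the sample size grows, for fixed p.
tau_larger_n <- 2 * sqrt(log(p) / (10 * n))
tau_larger_n < tau_default
```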
Value
A list with one element: a vector of integers showing which cluster/community each point is assigned to.
Examples
set.seed(100)
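# Simulate 200 samples of 10 variables: two latent factors are recycled across
# columns (odd and even columns each share a factor), plus independent noise.
X <- 2 * matrix(rnorm(200 * 2), 200, 10) + matrix(rnorm(200 * 10), 200, 10)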
cord(X)
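The same data can also be passed as a precomputed correlation matrix. The sketch below is a hedged variation on the example above: it assumes the package is installed under the name cord, and it supplies tau explicitly because the default is written in terms of nrow(X), which for a p by p input would appear to be p rather than the sample size n. The result is accessed by position since the element name is not given here.

```r
set.seed(100)
# Same simulated data as in the example above: 200 samples, 10 variables.
X <- 2 * matrix(rnorm(200 * 2), 200, 10) + matrix(rnorm(200 * 10), 200, 10)

# Precompute a Pearson correlation matrix (p by p) with base R.
S <- cor(X)

# Threshold computed from the original data dimensions (n = 200, p = 10).
tau <- 2 * sqrt(log(ncol(X)) / nrow(X))

# Run the clustering only if the package is available (name assumed: cord).
if (requireNamespace("cord", quietly = TRUE)) {
  res <- cord::cord(S, tau = tau, input = "cor")
  # The returned list has one element: the integer cluster labels.
  print(table(res[[1]]))
}
```

Tabulating the labels is a quick way to check how many clusters were found and how large each one is.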