This article provides a comprehensive guide to density estimation
using the biasBound_density() function.
The main function for density estimation takes the following key arguments:
biasBound_density(
X, # Data vector
x = NULL, # Evaluation points (auto if NULL)
h = NULL, # Bandwidth (auto-selected if NULL)
h_method = "cv", # "cv" or "silverman"
alpha = 0.05, # Confidence level
kernel.fun = "Schennach2004",
xi_lb = NULL, # Frequency range lower bound
xi_ub = NULL # Frequency range upper bound
)# Generate sample data from 2-fold uniform distribution
X <- gen_sample_data(size = 1000, dgp = "2_fold_uniform", seed = 42)
# Estimate density
fit <- biasBound_density(X, h = 0.08, kernel.fun = "Schennach2004")
# Display results
fit
#> Bias-Bounded Density Estimation
#>
#> Call:
#> biasBound_density(X = X, h = 0.08, kernel.fun = "Schennach2004")
#>
#> Sample size: n = 1000
#> Bandwidth: h = 0.0800 (user-specified)
#> Kernel: Schennach2004
#>
#> Bias bound parameters:
#> A = 6.8980, r = 2.5142
#> bias bound b1x = 0.0334
#>
#> Evaluation points: 100 (range: [-0.1746, 2.1780])
#> Confidence level: 95%
#>
#> Use summary() for detailed statistics
#> Use plot() to visualize resultsThe result is an S3 object of class bbnp_density
containing:
# Key parameters
coef(fit)
#> A r h
#> 6.898006 2.514158 0.080000
# Detailed summary
summary(fit)
#> Summary: Bias-Bounded Density Estimation
#> ============================================================
#>
#> Call:
#> biasBound_density(X = X, h = 0.08, kernel.fun = "Schennach2004")
#>
#> Sample Information:
#> Sample size (n): 1000
#> Bandwidth (h): 0.0800
#> Kernel function: Schennach2004
#>
#> Bias Bound Parameters:
#> A (amplitude): 6.8980
#> r (decay rate): 2.5142
#> b1x (bias bound): 0.0334
#> Xi interval: [2.4074, 4.5544]
#>
#> Range Estimation:
#> Density estimates:
#> min Q1.25% median mean Q3.75% max
#> 0.0000 0.1111 0.4273 0.4217 0.7125 0.8798
#>
#> Standard errors:
#> min mean max
#> 0.0000 0.0341 0.0563Interpreting the plot:
| Element | Description |
|---|---|
| Estimate | Kernel density estimate \(\hat{f}(x)\) |
| Bias bound | Bias range: \([\hat{f} - \bar{b}, \hat{f} + \bar{b}]\) |
| 95% CI | \([\hat{f} - \bar{b} - z\hat{\sigma}, \hat{f} + \bar{b} + z\hat{\sigma}]\) |
The package automatically detects smoothness via Fourier analysis:
Interpreting the Fourier plot (the legend labels each line):
The slope \(r\) indicates smoothness: larger \(r\) = smoother function.
fit_cv <- biasBound_density(X, h = NULL, h_method = "cv", kernel.fun = "normal")
cat("CV bandwidth:", coef(fit_cv)["h"], "\n")
#> CV bandwidth: 0.1878156
plot(fit_cv) + ggtitle("Cross-Validation Bandwidth")| Kernel | Order | Best For |
|---|---|---|
| Schennach2004 | Infinite | High smoothness (recommended) |
| sinc | Infinite | Any smoothness level |
| normal | 2 | Moderate smoothness |
| epanechnikov | 2 | Limited smoothness |
fit_sch <- biasBound_density(X, kernel.fun = "Schennach2004")
fit_sinc <- biasBound_density(X, kernel.fun = "sinc")
fit_norm <- biasBound_density(X, kernel.fun = "normal")
fit_epan <- biasBound_density(X, kernel.fun = "epanechnikov")
grid.arrange(
plot(fit_sch) + ggtitle("Schennach2004 (infinite order)"),
plot(fit_sinc) + ggtitle("Sinc (infinite order)"),
plot(fit_norm) + ggtitle("Normal (2nd order)"),
plot(fit_epan) + ggtitle("Epanechnikov (2nd order)"),
ncol = 2
)Recommendation: Use "Schennach2004" for
most applications as it adapts to any smoothness level.
You can manually specify the frequency range for Fourier analysis:
# Default (automatic)
fit_auto <- biasBound_density(X, h = 0.08)
# Custom range
fit_custom <- biasBound_density(X, h = 0.08, xi_lb = 2, xi_ub = 8)
grid.arrange(
plot(fit_auto, type = "ft") + ggtitle("Automatic Frequency Range"),
plot(fit_custom, type = "ft") + ggtitle("Custom Range [2, 8]"),
ncol = 1
)# Confidence intervals as matrix
ci <- confint(fit)
head(ci, 10)
#> lower upper
#> [1,] 0 0.03341978
#> [2,] 0 0.03341978
#> [3,] 0 0.03341978
#> [4,] 0 0.03341978
#> [5,] 0 0.03341978
#> [6,] 0 0.04274476
#> [7,] 0 0.05896604
#> [8,] 0 0.07625691
#> [9,] 0 0.09584398
#> [10,] 0 0.11785186
# Evaluation points
x_points <- fit$x
head(x_points)
#> [1] -0.1746134 -0.1508492 -0.1270850 -0.1033208 -0.0795566 -0.0557924
# Density estimates
f_hat <- fit$density
head(f_hat)
#> [1] 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.002945549The bias-bound approach yields narrower CIs than undersmoothing:
h_opt <- select_bandwidth(X, method = "cv", kernel.fun = "sinc")
# Bias-bound with optimal bandwidth
result_opt <- biasBound_density(X, h = h_opt, kernel.fun = "sinc")
# Undersmoothing (half optimal)
result_under <- biasBound_density(X, h = h_opt * 0.5, kernel.fun = "sinc")
grid.arrange(
plot(result_opt) + ggtitle(paste0("Bias Bound (h = ", round(h_opt, 3), ")")),
plot(result_under) + ggtitle(paste0("Undersmoothing (h = ", round(h_opt/2, 3), ")")),
ncol = 1
)The bias-bound approach provides valid coverage while maintaining efficient estimation.