corrselect: Correlation-Based and Model-Based Predictor Pruning
Source: R/corrselect-package.R
Provides tools for reducing multicollinearity in predictor sets through association-based and model-based approaches. The package offers both fast greedy algorithms for quick pruning and exact graph-theoretic algorithms for exhaustive subset enumeration.
Association-Based Pruning
These functions identify variable subsets where all pairwise correlations or associations remain below a user-defined threshold:
- corrPrune: Fast greedy pruning for numeric data
- corrSelect: Exhaustive enumeration for numeric data frames
- assocSelect: Exhaustive enumeration for mixed-type data (numeric, factor, ordered)
- MatSelect: Direct interface using a pre-computed correlation matrix
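The idea of greedy threshold pruning can be sketched in a few lines of Python. This is only an illustration of the general technique, not corrselect's implementation: the function name `greedy_prune`, the "drop the variable with the most violations" heuristic, and the toy correlation matrix are all assumptions made for the example.

```python
def greedy_prune(names, corr, threshold=0.7):
    """Return a subset of `names` whose pairwise |correlation| is below `threshold`.

    Hypothetical heuristic: repeatedly drop the variable that is involved in
    the largest number of threshold violations until none remain.
    """
    keep = list(range(len(names)))
    while keep:
        # For each kept variable, count the kept partners it is too strongly correlated with.
        counts = {
            i: sum(1 for j in keep if j != i and abs(corr[i][j]) >= threshold)
            for i in keep
        }
        worst = max(keep, key=lambda i: counts[i])
        if counts[worst] == 0:
            break  # every remaining pair is below the threshold
        keep.remove(worst)
    return [names[i] for i in keep]


# Toy correlation matrix (made-up values): "a" is strongly correlated with "b" and "c".
names = ["a", "b", "c", "d"]
corr = [
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.2, 0.1],
    [0.8, 0.2, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
selected = greedy_prune(names, corr, threshold=0.7)
print(selected)
```

A greedy pass like this returns a single valid subset quickly, whereas the exhaustive functions enumerate every maximal subset that satisfies the threshold.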
Model-Based Pruning
These functions use variance inflation factors (VIF) to iteratively remove collinear predictors from regression models:
- modelPrune: VIF-based pruning for lm, glm, lme4, and glmmTMB models
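The iterative VIF procedure can be illustrated with a minimal Python sketch. It relies on the standard identity that, for numeric predictors, the VIF of predictor j equals the j-th diagonal element of the inverse of the predictors' correlation matrix. The names `invert` and `vif_prune`, the limit of 5, and the toy matrix are assumptions for the example; modelPrune itself works on fitted model objects, which this sketch does not attempt to reproduce.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (perfect collinearity)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif_prune(names, corr, limit=5.0):
    """Drop the predictor with the highest VIF until all VIFs are at or below `limit`."""
    keep = list(range(len(names)))
    while len(keep) > 1:
        sub = [[corr[i][j] for j in keep] for i in keep]
        inv = invert(sub)
        vifs = [inv[k][k] for k in range(len(keep))]  # VIF_j = (R^-1)_jj
        worst = max(range(len(keep)), key=lambda k: vifs[k])
        if vifs[worst] <= limit:
            break
        del keep[worst]
    return [names[i] for i in keep]


# Toy correlation matrix (made-up values): x1 and x2 are nearly collinear.
predictors = ["x1", "x2", "x3"]
corr = [
    [1.00, 0.95, 0.30],
    [0.95, 1.00, 0.10],
    [0.30, 0.10, 1.00],
]
kept = vif_prune(predictors, corr, limit=5.0)
print(kept)
```

Recomputing the VIFs after every removal matters, because dropping one predictor changes the VIFs of all the others.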
Algorithms
The exact enumeration functions (corrSelect, assocSelect, MatSelect) use two graph-theoretic algorithms:
- Eppstein-Loffler-Strash (ELS): Recommended when using force_in constraints
- Bron-Kerbosch: Default algorithm, with optional pivoting for performance
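To show how exhaustive enumeration relates to graph algorithms, the following Python sketch treats each variable as a vertex, connects two variables when their absolute correlation is below the threshold, and lists the maximal cliques with Bron-Kerbosch and pivoting. It is a textbook version written for illustration, not the package's code: the function name `maximal_subsets` and the toy matrix are assumptions, and force_in handling and the ELS variant are left out.

```python
def maximal_subsets(names, corr, threshold=0.7):
    """List every maximal subset whose pairwise |correlation| is below `threshold`."""
    n = len(names)
    # An edge i-j means the two variables are compatible (|r| below the threshold).
    adj = {
        i: {j for j in range(n) if j != i and abs(corr[i][j]) < threshold}
        for i in range(n)
    }
    found = []

    def expand(r, p, x):
        # r: current clique, p: candidates that can extend it, x: already explored vertices.
        if not p and not x:
            found.append(sorted(r))  # r cannot be extended, so it is maximal
            return
        # Pivot on the vertex with the most neighbours in p to skip redundant branches.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(range(n)), set())
    return sorted([[names[i] for i in clique] for clique in found])


# Toy correlation matrix (made-up values): "a" conflicts with "b" and "c".
names = ["a", "b", "c", "d"]
corr = [
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.2, 0.1],
    [0.8, 0.2, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
subsets = maximal_subsets(names, corr, threshold=0.7)
print(subsets)
```

Unlike a greedy pass, this enumeration reports every maximal admissible subset, at the cost of a running time that can grow exponentially with the number of variables.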
Helpers
- corrSubset: Extract specific subsets from results
- CorrCombo-class: S4 class holding enumeration results
Author
Maintainer: Gilles Colling gilles.colling051@gmail.com