corrselect 3.1.0

CRAN release: 2026-01-08

Bug Fixes

  • corrPrune: Fixed numeric-numeric pair handling in mixed-type data (previously used Cramér’s V instead of Pearson correlation)
  • corrPrune: Fixed numeric-ordered pair handling (ordered variables are now converted to numeric before Spearman correlation is computed)

Test Coverage Improvements

Coverage improved from 92% to 94%:

  • Added tests for optional package measures (bicor, distance, maximal) with proper skip_if_not_installed() guards
  • Added tests for lme4 and glmmTMB engines in modelPrune
  • Added chi-squared edge case tests (sparse contingency tables, NA handling)
  • Added VIF edge case tests (perfect collinearity, single predictor)
  • Added lexicographic tie-breaking tests with synthetic correlation structures
  • Added mixed-type data tests (numeric-ordered, ordered-ordered, factor-factor pairs)
  • Added condition_number criterion tests
  • findAllMaxSets.R now at 100% coverage
  • corrPrune.R now at 97% coverage

corrselect 3.0.7

New Features

corrPrune Enhancements

  • Grouped pruning: The new by parameter computes an association matrix for each group and aggregates them using the group_q quantile (default: 0.5, the median). Useful when correlations vary across experimental conditions or subpopulations.
  • Additional measures for numeric data:
    • bicor: Biweight midcorrelation (requires WGCNA package)
    • distance: Distance correlation (requires energy package)
    • maximal: Maximal information coefficient (requires minerva package)
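
The grouped aggregation described above can be pictured with a small sketch. It is written in Python purely as an illustration and is not the package's code. It assumes the aggregation is done cell by cell and uses linear-interpolation quantiles; neither detail is stated in this changelog. The names quantile, aggregate_groups, and the example matrices are hypothetical.

```python
import math

def quantile(values, q):
    """Linear-interpolation quantile (an assumed rule, not confirmed for corrselect)."""
    xs = sorted(values)
    pos = q * (len(xs) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])

def aggregate_groups(matrices, group_q=0.5):
    """Combine per-group association matrices cell by cell."""
    p = len(matrices[0])
    return [
        [quantile([m[i][j] for m in matrices], group_q) for j in range(p)]
        for i in range(p)
    ]

# Hypothetical absolute correlations between two variables in three groups.
per_group = [
    [[1.0, 0.2], [0.2, 1.0]],
    [[1.0, 0.5], [0.5, 1.0]],
    [[1.0, 0.9], [0.9, 1.0]],
]

median_matrix = aggregate_groups(per_group)                # group_q = 0.5
strict_matrix = aggregate_groups(per_group, group_q=1.0)   # most correlated group
```

With a threshold of 0.7, the median matrix would keep both variables, while group_q = 1.0 would flag the pair because one group reaches 0.9. A higher group_q is therefore the more conservative choice in this sketch.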

modelPrune Enhancements

  • Condition number criterion: The new criterion = "condition_number" option uses SVD-based condition indices to detect multicollinearity. Higher values indicate greater collinearity.
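
For background, the textbook definition of condition indices is given below. Here the σ_j are the singular values of the design matrix X. Whether corrselect centers or scales the columns of X before the SVD is not stated in this changelog, so treat this as general context rather than a description of the implementation.

$$
\eta_j = \frac{\sigma_{\max}}{\sigma_j},
\qquad
\kappa(X) = \max_j \eta_j = \frac{\sigma_{\max}}{\sigma_{\min}}
$$

A perfectly collinear design has σ_min = 0, so κ(X) is unbounded, which is why larger values signal stronger collinearity.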

Tests

  • Added comprehensive tests for grouped pruning functionality
  • Added tests for condition_number criterion
  • Added edge case tests for single-group and insufficient-rows scenarios

corrselect 3.0.4

Test Coverage Improvements

  • Removed dead C++ code (isValidAddition, isValidCombination) from utils.cpp/utils.h
  • Added edge case tests for ELS algorithm (force_in validation, threshold boundaries)
  • Added edge case tests for association methods (Cramér’s V sparse tables, eta edge cases)
  • Added tests for corrPrune lexicographic tiebreaker and factor handling
  • Added tests for modelPrune custom engine error handling and VIF edge cases
  • Test coverage improved from 91.86% to 93.44%

corrselect 3.0.3

JOSS Review Response

This release addresses reviewer feedback from the JOSS submission.

Documentation

  • paper.md: Strengthened comparison with caret::findCorrelation() to emphasize the key difference (single solution vs. all maximal subsets)
  • paper.md: Added explicit graph-theoretic context (maximal cliques / independent sets formulation)
  • paper.md: Clarified that Bron-Kerbosch and ELS algorithms are implemented natively in C++, not as wrappers around igraph
  • paper.md: Added note about NP-hard complexity and the recommendation to use exact mode only for p ≤ 100
  • paper.md: Added code snippet demonstrating the “all subsets” output
  • paper.bib: Added citations for igraph (Csardi & Nepusz, 2006) and FCBF (Yu & Liu, 2003)
  • README.md: Added CRAN installation instructions (install.packages("corrselect"))
  • README.md: Fixed mixed model example with suppressWarnings() to hide expected VIF computation warnings
  • quickstart vignette: Fixed GitHub repository reference (GillesColling → gcol33)

Testing

  • Added edge case test: identity matrix (all off-diagonals = 0) returns single subset with all variables
  • Added edge case test: perfect duplicates (r = 1.0) are correctly separated into different subsets
  • Added threshold boundary test for correlation exactly at threshold
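
The edge cases listed above are easier to see with a small sketch of the "all maximal subsets" idea. It is written in Python as an illustration only; the package implements its algorithms natively in C++. The sketch uses the basic Bron-Kerbosch recursion without pivoting and assumes that a pair exactly at the threshold counts as compatible. The function name maximal_subsets and the example data are hypothetical.

```python
def maximal_subsets(names, corr, threshold):
    """Enumerate all maximal subsets whose pairwise |r| stays within threshold."""
    p = len(names)
    # Two variables are compatible when |r| does not exceed the threshold
    # (inclusive boundary is an assumption of this sketch).
    compatible = {
        i: {j for j in range(p) if j != i and abs(corr[i][j]) <= threshold}
        for i in range(p)
    }
    found = []

    def expand(current, candidates, excluded):
        # No candidates left and nothing excluded: current cannot be extended.
        if not candidates and not excluded:
            found.append(sorted(names[i] for i in current))
            return
        for v in sorted(candidates):
            expand(current | {v},
                   candidates & compatible[v],
                   excluded & compatible[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(range(p)), set())
    return sorted(found)

# Identity matrix: every pair is uncorrelated.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
all_kept = maximal_subsets(["a", "b", "c"], identity, 0.7)

# Perfect duplicates: "a" and "a_copy" can never share a subset.
duplicates = [[1.0, 1.0, 0.1], [1.0, 1.0, 0.1], [0.1, 0.1, 1.0]]
split = maximal_subsets(["a", "a_copy", "b"], duplicates, 0.7)
```

In this sketch the identity matrix yields one subset containing every variable, and the duplicated pair yields two subsets, one per copy.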

Infrastructure

  • Added GitHub Actions workflow for cross-platform R CMD check (Ubuntu, macOS, Windows)
  • Added GitHub Actions workflow for test coverage reporting
  • Updated .gitignore to exclude build artifacts (*.Rcheck/, *.tar.gz, CRAN-SUBMISSION)

corrselect 3.0.2

CRAN release: 2025-11-29

CRAN Compliance

  • Single-quoted software names in DESCRIPTION (‘lme4’, ‘glmmTMB’) per CRAN policy

Documentation

  • Updated vignettes with improved examples and workflows

corrselect 3.0.1

Bug Fixes

  • modelPrune(): Fixed infinite loop when VIF computation encountered perfect multicollinearity
    • Added proper handling of Inf and NA VIF values in pruning loop
    • Clamped extreme R² values (> 0.9999) to prevent division by near-zero
    • Added safety checks to prevent removing all variables
  • modelPrune(): Fixed design matrix extraction for lme4 and glmmTMB engines
    • Now uses stats::model.matrix() for all engines (more robust)
    • Eliminated “Could not find columns” warnings
  • Test suite: All 261 tests pass with zero warnings (CRAN-compliant)
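
The VIF safeguards above can be illustrated with a short sketch. It is written in Python for illustration and is not the package's code. It relies on the standard definition VIF = 1 / (1 - R²) and the 0.9999 clamp quoted in the note. Treating an undefined R² as infinitely collinear, and the helper names vif_from_r2 and worst_predictor, are assumptions of this sketch.

```python
import math

R2_CAP = 0.9999  # clamp value quoted in the release note above

def vif_from_r2(r2, cap=R2_CAP):
    """Variance inflation factor 1 / (1 - R^2) with R^2 clamped at `cap`."""
    if r2 is None or math.isnan(r2):
        # Assumption: an undefined R^2 is treated as maximally collinear.
        return math.inf
    return 1.0 / (1.0 - min(r2, cap))

def worst_predictor(r2_by_name, protected=()):
    """Return the removable predictor with the largest VIF, or None."""
    removable = [n for n in sorted(r2_by_name) if n not in protected]
    if len(r2_by_name) <= 1 or not removable:
        return None  # never remove the last remaining variable
    # max() keeps the first of several equal values, so ties resolve by name.
    return max(removable, key=lambda n: vif_from_r2(r2_by_name[n]))
```

With the clamp, a perfectly collinear predictor (R² = 1) gets a large but finite VIF of about 10000 instead of a division by zero, so a pruning loop built on these helpers always has a well-defined next step or a clear stopping point.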

corrselect 3.0.0

Major Release: Predictor Pruning Toolkit

Version 3.0.0 is a major expansion of corrselect from a specialized subset enumeration tool into a comprehensive predictor pruning toolkit. It is fully backward compatible with 2.x: all existing code continues to work.

Major Features

New Functions

  • corrPrune(): High-level association-based predictor pruning
    • Model-free pruning using pairwise correlations or associations
    • Automatic measure selection (measure = "auto")
    • Supports exact mode (small p), greedy mode (large p), or auto-selection
    • force_in parameter to protect important predictors
    • Returns single pruned data.frame with pairwise associations ≤ threshold
  • modelPrune(): Model-based predictor pruning using diagnostics
    • VIF-based iterative removal of multicollinear predictors
    • Supports multiple engines: lm, glm, lme4, glmmTMB
    • Custom engine support: Define your own modeling backends (INLA, mgcv, brms, etc.)
    • Prunes fixed effects only (preserves random effects in mixed models)
    • force_in parameter for protecting important variables
    • Returns pruned data.frame with final fitted model

New C++ Backend

  • Fast deterministic greedy pruning algorithm
    • Polynomial-time complexity, O(p² × k), versus exponential time for exact search
    • Handles p > 100 efficiently
    • Deterministic tie-breaking for reproducibility
    • Used by corrPrune() when mode = "greedy" or mode = "auto"
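
To give a feel for what a deterministic greedy pruner does, here is a minimal sketch in Python. It is not the C++ backend: the removal rule (drop the variable with the most threshold violations, ties broken by name) is an assumption chosen for illustration, and the function name greedy_prune and the example matrix are hypothetical.

```python
def greedy_prune(names, assoc, threshold, force_in=()):
    """Drop variables one at a time until all kept pairs are within threshold."""
    index = {name: i for i, name in enumerate(names)}
    kept = list(names)
    while True:
        # For each kept variable, count kept partners that exceed the threshold.
        violations = {
            a: sum(1 for b in kept
                   if b != a and abs(assoc[index[a]][index[b]]) > threshold)
            for a in kept
        }
        candidates = [a for a in kept
                      if violations[a] > 0 and a not in force_in]
        if not candidates:
            break
        # Most violations first; ties broken by name for reproducibility.
        worst = min(candidates, key=lambda a: (-violations[a], a))
        kept.remove(worst)
    if any(count > 0 for count in violations.values()):
        raise ValueError("force_in variables violate the threshold among themselves")
    return kept

# Hypothetical associations: x2 is strongly tied to both x1 and x3.
names = ["x1", "x2", "x3", "x4"]
assoc = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.8, 0.1],
    [0.1, 0.8, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
pruned = greedy_prune(names, assoc, threshold=0.7)
protected = greedy_prune(names, assoc, threshold=0.7, force_in=("x2",))
```

Each pass in this sketch costs O(p²) and there is one pass per removed variable. Whether the number of removals is the k in the bound above is not stated in this changelog.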

Enhancements

  • Exact methods (corrSelect(), assocSelect()) now integrate seamlessly with corrPrune()
  • Deterministic subset selection when multiple maximal sets exist
  • Improved error messages for threshold feasibility checks
  • Better handling of edge cases (single predictor, all correlated, etc.)
  • Custom engine interface for modelPrune(): Users can define custom modeling backends with fit and diagnostics functions, enabling integration with any R modeling package

Documentation

  • Five new comprehensive vignettes (~60 minutes of content):
    • Quick Start: 5-minute introduction to corrPrune() and modelPrune()
    • Complete Workflows: Real-world examples across 4 domains (ecology, social science, genomics, clinical)
    • Comparison with Alternatives: When to choose corrselect vs caret, Boruta, glmnet
    • Performance Benchmarks: Timing comparisons, scalability tests, and optimization guidelines
    • Advanced Topics: Algorithms, custom engines (INLA, mgcv), performance optimization, troubleshooting
  • Four new example datasets with full documentation (bioclim, survey, genes, longitudinal)
  • Updated README with quickstart examples and custom engine support
  • Full documentation for corrPrune() and modelPrune()
  • Usage examples for all modeling engines

Package Changes

  • Added lme4 and glmmTMB to Suggests (required for respective engines)
  • Version bumped to 3.0.0 (major feature release)
  • Updated package description to reflect expanded pruning functionality

Notes

  • No breaking changes: Version 3.0.0 is fully backward compatible with 2.0.1
  • For large predictor sets (p > 20), use corrPrune(mode = "auto") for best performance
  • Mixed model engines require optional packages: install with install.packages(c("lme4", "glmmTMB"))

corrselect 2.0.1

CRAN release: 2025-09-08

Bug Fixes

  • force_in in MatSelect() now correctly accepts character column names.
  • els now correctly lists all valid subsets when a single variable is forced in.
  • corrSelect() now displays an appropriate warning if only one variable remains after dropping unsupported columns.
  • Association matrix construction in assocSelect() now safely falls back to 0 for failed or meaningless associations (e.g. empty chi-squared tables due to sparse combinations or unused factor levels).
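
The fallback behaviour in the last item can be sketched as follows. This is a Python illustration rather than the package's code. It uses the standard, uncorrected Cramér's V formula and returns 0 whenever the contingency table is empty or has a single occupied row or column. The function name cramers_v and the choice to skip None values are assumptions of this sketch.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Cramér's V for two categorical sequences, or 0.0 when undefined."""
    pairs = [(x, y) for x, y in zip(xs, ys)
             if x is not None and y is not None]
    n = len(pairs)
    rows = Counter(x for x, _ in pairs)
    cols = Counter(y for _, y in pairs)
    # Only levels that actually occur are counted, so unused levels drop out.
    k = min(len(rows), len(cols)) - 1
    if n == 0 or k < 1:
        return 0.0  # fallback: empty or single-level table
    observed = Counter(pairs)
    chi2 = 0.0
    for r, r_count in rows.items():
        for c, c_count in cols.items():
            expected = r_count * c_count / n
            chi2 += (observed[(r, c)] - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * k))
```

Returning 0 for a degenerate table means such a pair is treated as unassociated, so it never causes a variable to be excluded on its own.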

Features Added

  • assocSelect() now supports logical columns by automatically converting them to factors.

corrselect 2.0.0

Major Release: Mixed-Type Association Selection

Version 2.0.0 introduces support for mixed-type data through the new assocSelect() function, enabling subset selection on datasets containing numeric, factor, and ordered variables.

Major Features

  • assocSelect(): New data frame interface for mixed-type data
    • Handles numeric, factor, and ordered variables
    • Automatic association measure selection based on variable pair types
    • Supports Pearson, Spearman, Kendall correlations
    • Computes Eta-squared for numeric-factor pairs
    • Computes Cramér’s V for factor-factor pairs

Enhancements

  • Improved algorithm selection logic
  • Better handling of edge cases in subset enumeration
  • Enhanced documentation with examples for mixed-type workflows