pkgdown/mathjax-config.html

Skip to contents

couplr 1.0.6

CRAN release: 2026-01-20

Documentation

  • Added Overview section to algorithms vignette with audience and prerequisites
  • Fixed workflow diagram dark mode text handling in matching-workflows vignette
  • Improved SVG theme-awareness for multi-line text labels
  • Removed grid lines from matching-workflows plots for cleaner appearance
  • Added threshold labels to balance comparison plot

couplr 1.0.0

Major New Features (2025-11-19 Update)

Automatic Preprocessing and Scaling

The package now includes intelligent preprocessing to improve matching quality:

  • New auto_scale parameter in match_couples() and greedy_couples() enables automatic preprocessing
  • Variable health checks detect and handle problematic variables:
    • Constant columns (SD = 0) are automatically excluded with warnings
    • High missingness (>50%) triggers warnings
    • Extreme skewness (|skewness| > 2) is flagged
  • Smart scaling method selection analyzes data and recommends:
    • “robust” scaling using median and MAD (resistant to outliers)
    • “standardize” for traditional mean-centering and SD scaling
    • “range” for min-max normalization
  • New preprocess_matching_vars() function for manual preprocessing control
  • Categorical variable encoding for binary and ordered factors

Balance Diagnostics

Comprehensive tools to assess matching quality:

  • New balance_diagnostics() function computes multiple balance metrics:
    • Standardized differences: (mean_left - mean_right) / pooled_sd
    • Variance ratios: SD_left / SD_right
    • Kolmogorov-Smirnov tests for distribution comparison
    • Overall balance metrics (mean, max, % large imbalance)
  • Quality thresholds with interpretation:
    • |Std Diff| < 0.10: Excellent balance
    • |Std Diff| 0.10-0.25: Good balance
    • |Std Diff| 0.25-0.50: Acceptable balance
    • |Std Diff| > 0.50: Poor balance
  • Per-block statistics with quality ratings when blocking is used
  • balance_table() creates publication-ready formatted tables
  • Informative print methods with interpretation guides

Joined Matched Dataset Output

Create analysis-ready datasets directly from matching results:

  • New join_matched() function automates data preparation:
    • Joins matched pairs with original left and right datasets
    • Eliminates manual data wrangling after matching
    • Select specific variables via left_vars and right_vars parameters
    • Customizable suffixes (default: _left, _right) for overlapping columns
    • Optional metadata: pair_id, distance, block_id
    • Works with both optimal and greedy matching
  • Broom-style augment() method for tidymodels integration:
    • S3 method following broom package conventions
    • Sensible defaults for quick exploration
    • Supports all join_matched() parameters
  • Flexible output control:
    • include_distance - Include/exclude matching distance
    • include_pair_id - Include/exclude sequential pair IDs
    • include_block_id - Include/exclude block identifiers
    • Custom ID column support via left_id and right_id
    • Clean column ordering: pair_id → IDs → distance → block → variables

Precomputed and Reusable Distances

Performance optimization for exploring multiple matching strategies:

  • New compute_distances() function precomputes and caches distance matrices:
    • Compute distances once, reuse across multiple matching operations
    • Store complete metadata: variables, distance metric, scaling method, timestamps
    • Preserve original datasets for seamless integration with join_matched()
    • Enable rapid exploration of different matching parameters
    • Performance improvement: ~60% faster when trying multiple matching strategies
  • Distance objects (S3 class distance_object):
    • Self-contained: cost matrix, IDs, metadata, original data
    • Works with both match_couples() and greedy_couples()
    • Pass as first argument instead of datasets: match_couples(dist_obj, max_distance = 5)
    • Informative print and summary methods with distance statistics
  • Constraint modification via update_constraints():
    • Apply new max_distance or calipers without recomputing distances
    • Creates new distance object following copy-on-modify semantics
    • Experiment with different constraints efficiently
  • Backward compatible integration:
    • Modified function signatures: match_couples(left, right = NULL, vars = NULL, ...)
    • Automatically detects distance objects vs. datasets
    • All existing code continues to work unchanged

Parallel Processing

Speed up blocked matching with multi-core processing:

  • New parallel parameter in match_couples() and greedy_couples():
    • Enable with parallel = TRUE for automatic configuration
    • Specify plan with parallel = "multisession" or other future plan
    • Works with any number of blocks - automatically determines if beneficial
    • Gracefully falls back if future packages not installed
  • Powered by the future package:
    • Cross-platform support (Windows, Unix/Mac, clusters)
    • Respects user-configured parallel backends
    • Automatic worker management
    • Clean restoration of original plan after execution
  • Performance:
    • Best for 10+ blocks with 50+ units per block
    • Speedup scales with number of cores and complexity
    • Minimal overhead for small problems
  • Integration:
    • Works with all blocking methods (exact, fuzzy, clustering)
    • Compatible with distance caching from Step 4
    • Supports all matching parameters (constraints, calipers, scaling)

Fun Error Messages and Cost Checking

Like testthat, couplr makes errors light, memorable, and helpful with couple-themed messages:

  • New check_costs parameter (default: TRUE) in match_couples() and greedy_couples():
    • Automatically checks distance distributions before matching
    • Provides friendly, actionable warnings for common problems
    • Set to FALSE to skip checks in production code
  • Fun couple-themed error messages throughout the package:
    • 💔 “No matches made - can’t couple without candidates!”
    • 🔍 “Your constraints are too strict. Love can’t bloom in a vacuum!”
    • ✨ Helpful suggestions: “Try increasing max_distance or relaxing calipers”
    • 💖 Success messages: “Excellent balance! These couples are well-matched!”
  • Automatic problem detection:
    • Too many zeros: Warns about duplicates or identical values (>10% zero distances)
    • Extreme costs: Detects skewed distributions (99th percentile > 10x the 95th)
    • Many forbidden pairs: Warns when constraints eliminate >50% of valid pairs
    • Constant distances: Alerts when all distances are identical
    • Constant variables: Detects and excludes variables with no variation
  • New diagnostic function diagnose_distance_matrix():
    • Comprehensive analysis of cost distributions
    • Variable-specific problem detection
    • Actionable suggestions for fixes
    • Quality rating (good/fair/poor)
  • Emoji control: Disable with options(couplr.emoji = FALSE) if preferred
  • Philosophy: Errors should be less intimidating, more memorable, and provide clear guidance

New Functions

Documentation & Examples

  • examples/auto_scale_demo.R - 5 preprocessing demonstrations
  • examples/balance_diagnostics_demo.R - 6 balance diagnostic examples
  • examples/join_matched_demo.R - 8 joined dataset demonstrations
  • examples/distance_cache_demo.R - Distance caching and reuse examples
  • examples/parallel_matching_demo.R - 7 parallel processing examples
  • examples/error_messages_demo.R - 10 fun error message demonstrations
  • Complete implementation documentation (claude/IMPLEMENTATION_STEP1.md through STEP6.md)
  • All functions have full Roxygen documentation

Tests

  • Added 34+ new tests (10 for preprocessing, 11 for balance diagnostics, 13 for joined datasets, tests for distance caching)
  • All tests passing with full backward compatibility

Major Changes (Initial 1.0.0 Release)

Package Renamed: lapr → couplr

The package has been renamed from lapr to couplr to better reflect its purpose as a general pairing and matching toolkit.

couplr = Optimal pairing and matching via linear assignment

Clean 1.0.0 Release

First official stable release with clean, well-organized codebase.

New Organization

R Code

  • Eliminated 3 redundant files
  • Consistent morph_* naming prefix
  • Two-layer API: assignment() (low-level) + lap_solve() (tidy)
  • 10 well-organized files (down from 13)

C++ Code

  • Modular subdirectory structure:
    • src/core/ - Utilities and headers
    • src/interface/ - Rcpp exports
    • src/solvers/ - 14 LAP algorithms
    • src/gabow_tarjan/ - Gabow-Tarjan solver
    • src/morph/ - Image morphing

Features

Solvers

Hungarian, Jonker-Volgenant, Auction (3 variants), SAP/SSP, SSAP-Bucket, Cost-scaling, Cycle-cancel, Gabow-Tarjan, Hopcroft-Karp, Line-metric, Brute-force, Auto-select

High-Level

✅ Tidy tibble interface ✅ Matrix & data frame inputs
✅ Grouped data frames ✅ Batch solving + parallelization ✅ K-best solutions (Murty, Lawler) ✅ Rectangular matrices ✅ Forbidden assignments (NA/Inf) ✅ Maximize/minimize ✅ Pixel morphing visualization

API


Development history under “lapr” available in git log before v1.0.0.