
couplr 1.3.0

New Features

Optimal Full Matching

  • full_match() gains method = "optimal" (the new default), which uses a min-cost max-flow solver (Dijkstra with Johnson potentials) to find the group assignment that minimizes total distance:
    • Standard lower bound transformation enforces min_controls per group
    • Automatic transposition when n_left > n_right
    • New C++ solver: solve_full_matching.cpp (self-contained MCMF)
    • method = "greedy" preserved for fast approximate matching
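To make the objective concrete, here is a small language-neutral sketch in Python. It is not couplr's API and not the C++ flow solver: it simply enumerates every assignment of right units to left units, discards assignments in which some left unit receives fewer than min_controls right units, and keeps the cheapest. The function name, the 1-D absolute distance, and the exact constraint semantics are assumptions made for illustration.

```python
from itertools import product

def brute_force_full_match(left, right, min_controls=1):
    """Exhaustively search for the minimum-total-distance grouping.

    Hypothetical helper for illustration only; feasible for tiny inputs.
    """
    best_cost, best_assign = float("inf"), None
    # assign[j] is the index of the left unit that right unit j joins.
    for assign in product(range(len(left)), repeat=len(right)):
        sizes = [assign.count(i) for i in range(len(left))]
        if min(sizes) < min_controls:
            continue  # at least one group is too small
        cost = sum(abs(right[j] - left[i]) for j, i in enumerate(assign))
        if cost < best_cost:
            best_cost, best_assign = cost, assign
    return best_cost, best_assign

left = [0, 10]
right = [1, 2, 9, 12]
cost, assign = brute_force_full_match(left, right, min_controls=1)
```

The search space grows as n_left ** n_right, so exhaustive enumeration is only usable on toy inputs. A flow formulation reaches the same optimum without enumerating assignments.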

Vignette Updates

  • Getting Started: Added full matching section with full_match() example
  • Matching Workflows: New “Full Matching (Variable-Ratio Groups)” section covering optimal vs greedy, constraints, weights, and comparison table
  • Comparison: Updated feature table and all sections to reflect couplr’s full matching support (previously listed as “No”)

couplr 1.2.0

New Features

Full Matching

  • New full_match() function assigns every unit to a matched group with variable ratios (1:k or k:1):
    • Greedy group formation: match each left to nearest right, then assign remaining right units to nearest matched left
    • Caliper support: caliper (absolute) or caliper_sd (SD-based)
    • Control group size constraints: min_controls, max_controls
    • Weights inversely proportional to group size
    • Returns full_matching_result S3 class
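The two-pass greedy idea described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not couplr's implementation: the function name is hypothetical, distances are 1-D absolute differences, the first pass is assumed to be without replacement, ties go to the lower index, and each right unit's weight is assumed to be 1 divided by the number of right units in its group.

```python
def greedy_full_match(left, right):
    """Two-pass greedy grouping on 1-D values (illustrative only)."""
    groups = {i: [] for i in range(len(left))}
    unused = set(range(len(right)))
    # Pass 1: each left unit takes its nearest still-unused right unit.
    for i, x in enumerate(left):
        if not unused:
            break
        j = min(unused, key=lambda k: (abs(right[k] - x), k))
        groups[i].append(j)
        unused.remove(j)
    # Pass 2: leftover right units join the nearest left unit that has a match.
    matched = [i for i, members in groups.items() if members]
    for j in sorted(unused):
        i = min(matched, key=lambda m: (abs(left[m] - right[j]), m))
        groups[i].append(j)
    # Assumed weighting: 1 / (number of right units in the group).
    weights = {j: 1 / len(members)
               for members in groups.values() for j in members}
    return groups, weights

groups, weights = greedy_full_match([0, 10], [1, 2, 9, 12, 11])
```

Here every right unit ends up in exactly one group, and larger groups give each member a smaller weight.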

Coarsened Exact Matching (CEM)

  • New cem_match() function implements coarsened exact matching:
    • Coarsens continuous variables into bins (Sturges, FD, Scott, or custom)
    • Exact matching on coarsened values with stratum-based weights
    • Support for categorical grouping variables via grouping parameter
    • Custom cutpoints per variable via cutpoints parameter
    • Returns cem_result S3 class with matched units and strata summary
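As a rough illustration of coarsening followed by exact matching, the Python sketch below bins one continuous variable with Sturges' rule and keeps only the strata that contain both treated and control units. It is not cem_match(): the helper names, the equal-width binning, and the single-variable setup are assumptions, and stratum weights are left out.

```python
import math
from collections import defaultdict

def sturges_bins(n):
    """Sturges' rule: number of bins = ceil(log2(n)) + 1."""
    return math.ceil(math.log2(n)) + 1

def coarsen(values, n_bins):
    """Map each value to an equal-width bin index in [0, n_bins - 1]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant column
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def cem_strata(x, treated):
    """Group unit indices by bin; keep bins with both arms present."""
    bins = coarsen(x, sturges_bins(len(x)))
    strata = defaultdict(lambda: {True: [], False: []})
    for idx, (b, t) in enumerate(zip(bins, treated)):
        strata[b][bool(t)].append(idx)
    return {b: s for b, s in strata.items() if s[True] and s[False]}

x = [1.0, 1.2, 5.0, 5.1, 9.0, 9.3, 3.0, 7.0]
treated = [1, 0, 1, 0, 1, 1, 0, 0]
kept = cem_strata(x, treated)
```

Units in strata with only one arm (here the two treated units near 9 and the lone control at 7.0) are dropped, which is the usual price of exact matching on coarsened values.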

Subclassification

  • New subclass_match() function divides units into propensity score strata:
    • Quantile-based stratification with configurable number of subclasses
    • Supports pre-computed PS, pre-fitted models, or formula interface
    • Target estimands: ATT, ATE, ATC with appropriate weighting
    • Returns subclass_result S3 class with subclass summary
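Quantile-based stratification can be sketched as follows in Python. This is not subclass_match(): the function name is hypothetical, the cut points are assumed to come from the pooled propensity scores of all units, and estimand-specific weighting is omitted.

```python
import bisect
import statistics

def subclass_by_quantile(ps, n_subclasses=5):
    """Label each unit with a subclass index based on quantile cut points."""
    # n_subclasses - 1 interior cut points of the propensity score
    cuts = statistics.quantiles(ps, n=n_subclasses)
    return [bisect.bisect_right(cuts, p) for p in ps]

ps = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
labels = subclass_by_quantile(ps)
```

With evenly spread scores, each of the five subclasses receives two units.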

Output Layer & Ecosystem Integration

  • New match_data() generic converts any couplr result to analysis-ready format with treatment, weights, subclass, and distance columns. Methods for all result types (matching, full, CEM, subclass).
  • New as_matchit() converter creates matchit-class objects from couplr results, enabling interop with cobalt, marginaleffects, and other MatchIt ecosystem packages.
  • cobalt bal.tab() methods for all couplr result types. Requires cobalt package (in Suggests).

Mahalanobis Distance Improvements

S3 Generics

New Functions


couplr 1.1.0

CRAN release: 2026-03-03

New Features

Ratio and Replacement Matching

  • k:1 ratio matching via ratio parameter in match_couples() and greedy_couples(). Matches k control units to each treated unit by replicating the cost matrix, then deduplicates assignments.
  • With-replacement matching via replace parameter. Each treated unit independently selects its nearest control, allowing controls to be reused across multiple treated units.

Propensity Score Matching

  • New ps_match() function wraps match_couples() with logistic regression:
    • Accepts a formula or pre-fitted glm object
    • Matches on the logit of propensity scores with a caliper
    • Default caliper: 0.2 SD of logit(PS) (Rosenbaum and Rubin recommendation)
    • Returns matching_result with PS model metadata
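The default caliper rule is easy to state in code. The Python sketch below computes 0.2 times the standard deviation of logit(PS) and checks whether a candidate pair falls inside it. It is not ps_match(): the helper names are hypothetical, and taking the sample SD over all units pooled together is an assumption.

```python
import math
import statistics

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1 - p))

def ps_caliper(scores, multiplier=0.2):
    """Caliper width = multiplier * SD of logit(propensity score)."""
    return multiplier * statistics.stdev([logit(p) for p in scores])

def within_caliper(p_treated, p_control, caliper):
    """True if the pair's logit-scale distance is inside the caliper."""
    return abs(logit(p_treated) - logit(p_control)) <= caliper

scores = [0.2, 0.35, 0.5, 0.65, 0.8]
cal = ps_caliper(scores)
```

Working on the logit scale stretches differences near 0 and 1, so the same caliper is stricter for extreme propensity scores than it would be on the raw probability scale.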

Cardinality Matching

  • New cardinality_match() function maximizes sample size subject to balance constraints:
    • Starts with a full optimal match, then iteratively prunes imbalanced pairs
    • Balance threshold via max_std_diff (default: 0.1 for excellent balance)
    • Configurable pruning speed with batch_fraction
    • Returns pruning diagnostics: iterations, pairs removed, final balance

Sensitivity Analysis

  • New sensitivity_analysis() function implements Rosenbaum bounds:
    • Tests sensitivity of matched comparisons to hidden bias
    • Uses Wilcoxon signed-rank statistic with upper/lower p-value bounds
    • Reports critical gamma (smallest gamma at which significance is lost)
    • S3 methods: print(), summary(), plot()
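For readers unfamiliar with Rosenbaum bounds, the Python sketch below shows the textbook large-sample calculation for the Wilcoxon signed-rank statistic: under hidden bias of at most gamma, each pair's probability of a positive difference lies between 1 / (1 + gamma) and gamma / (1 + gamma), which bounds the one-sided p-value. It is not sensitivity_analysis(): the function name is hypothetical, and dropping zero differences, average ranks for ties, and the normal approximation are assumptions.

```python
import math

def signed_rank_bounds(diffs, gamma):
    """Return (lower, upper) bounds on the one-sided p-value."""
    d = [x for x in diffs if x != 0]            # drop zero differences
    n = len(d)
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:                                 # average ranks within ties
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    t_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    sum_r = sum(ranks)
    sum_r2 = sum(r * r for r in ranks)

    def p_value(p):
        mean = p * sum_r
        var = p * (1 - p) * sum_r2
        z = (t_plus - mean) / math.sqrt(var)
        return 0.5 * math.erfc(z / math.sqrt(2))  # P(Z >= z)

    return p_value(1 / (1 + gamma)), p_value(gamma / (1 + gamma))

diffs = [3, 5, -1, 4, 6, 2, 7, -2, 8, 9]
lo1, hi1 = signed_rank_bounds(diffs, gamma=1)
lo2, hi2 = signed_rank_bounds(diffs, gamma=2)
```

At gamma = 1 the two bounds coincide with the ordinary test. In this toy data the upper bound rises above 0.05 by gamma = 2, so the critical gamma lies somewhere between 1 and 2.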

Visualization

New Functions

Tests

  • Added 58 new tests across 7 test files
  • All 4916 tests passing across platforms

couplr 1.0.7

Bug Fixes

  • Fixed undefined behavior (UB) in Gabow-Tarjan algorithm: replaced left bit-shift of potentially negative values with multiplication to avoid sanitizer errors on M1-SAN checks
  • Fixed namespace conflict with select() in vignettes by using explicit dplyr::select() to prevent masking by MASS or other packages

couplr 1.0.6

CRAN release: 2026-01-20

Documentation

  • Added Overview section to algorithms vignette with audience and prerequisites
  • Fixed workflow diagram dark mode text handling in matching-workflows vignette
  • Improved SVG theme-awareness for multi-line text labels
  • Removed grid lines from matching-workflows plots for cleaner appearance
  • Added threshold labels to balance comparison plot

couplr 1.0.0

Major New Features (2025-11-19 Update)

Automatic Preprocessing and Scaling

The package now includes intelligent preprocessing to improve matching quality:

  • New auto_scale parameter in match_couples() and greedy_couples() enables automatic preprocessing
  • Variable health checks detect and handle problematic variables:
    • Constant columns (SD = 0) are automatically excluded with warnings
    • High missingness (>50%) triggers warnings
    • Extreme skewness (|skewness| > 2) is flagged
  • Smart scaling method selection analyzes data and recommends:
    • “robust” scaling using median and MAD (resistant to outliers)
    • “standardize” for traditional mean-centering and SD scaling
    • “range” for min-max normalization
  • New preprocess_matching_vars() function for manual preprocessing control
  • Categorical variable encoding for binary and ordered factors
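The three scaling methods named above differ only in their choice of center and spread. The Python sketch below illustrates that difference; it is not preprocess_matching_vars(). The function name is hypothetical, and the "robust" branch uses the raw median absolute deviation without a consistency constant, which is an assumption.

```python
import statistics

def scale(values, method="standardize"):
    """Rescale a numeric column with one of three common methods."""
    if method == "standardize":
        center, spread = statistics.mean(values), statistics.stdev(values)
    elif method == "robust":
        center = statistics.median(values)
        spread = statistics.median([abs(v - center) for v in values])  # raw MAD
    elif method == "range":
        center, spread = min(values), max(values) - min(values)
    else:
        raise ValueError(f"unknown method: {method}")
    if spread == 0:
        raise ValueError("constant column: nothing to scale")
    return [(v - center) / spread for v in values]

x = [1, 2, 3, 4, 100]
robust = scale(x, "robust")
ranged = scale(x, "range")
```

With the outlier 100 in the column, range scaling squeezes the other four values close to 0, while robust scaling keeps them one unit apart.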

Balance Diagnostics

Comprehensive tools to assess matching quality:

  • New balance_diagnostics() function computes multiple balance metrics:
    • Standardized differences: (mean_left - mean_right) / pooled_sd
    • Variance ratios, reported as SD_left / SD_right (a ratio of standard deviations)
    • Kolmogorov-Smirnov tests for distribution comparison
    • Overall balance metrics (mean, max, % large imbalance)
  • Quality thresholds with interpretation:
    • |Std Diff| < 0.10: Excellent balance
    • |Std Diff| 0.10-0.25: Good balance
    • |Std Diff| 0.25-0.50: Acceptable balance
    • |Std Diff| > 0.50: Poor balance
  • Per-block statistics with quality ratings when blocking is used
  • balance_table() creates publication-ready formatted tables
  • Informative print methods with interpretation guides
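The standardized difference and the quality thresholds above can be written out as a short Python sketch. It is not balance_diagnostics(): the function names are hypothetical, pooled_sd is assumed to be sqrt((var_left + var_right) / 2), and the handling of values exactly on a threshold is also an assumption.

```python
import math
import statistics

def std_diff(left, right):
    """(mean_left - mean_right) / pooled_sd, with an assumed pooling formula."""
    pooled_sd = math.sqrt(
        (statistics.variance(left) + statistics.variance(right)) / 2
    )
    return (statistics.mean(left) - statistics.mean(right)) / pooled_sd

def sd_ratio(left, right):
    """SD_left / SD_right; values near 1 indicate similar spread."""
    return statistics.stdev(left) / statistics.stdev(right)

def rate_balance(d):
    """Map |standardized difference| to the labels used in the list above."""
    d = abs(d)
    if d < 0.10:
        return "Excellent"
    if d < 0.25:
        return "Good"
    if d <= 0.50:
        return "Acceptable"
    return "Poor"

left = [10, 12, 14, 16]
right = [9, 12, 13, 16]
d = std_diff(left, right)
rating = rate_balance(d)
```

In this example the means differ by 0.5 and the pooled SD is about 2.74, giving a standardized difference of roughly 0.18.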

Joined Matched Dataset Output

Create analysis-ready datasets directly from matching results:

  • New join_matched() function automates data preparation:
    • Joins matched pairs with original left and right datasets
    • Eliminates manual data wrangling after matching
    • Select specific variables via left_vars and right_vars parameters
    • Customizable suffixes (default: _left, _right) for overlapping columns
    • Optional metadata: pair_id, distance, block_id
    • Works with both optimal and greedy matching
  • Broom-style augment() method for tidymodels integration:
    • S3 method following broom package conventions
    • Sensible defaults for quick exploration
    • Supports all join_matched() parameters
  • Flexible output control:
    • include_distance - Include/exclude matching distance
    • include_pair_id - Include/exclude sequential pair IDs
    • include_block_id - Include/exclude block identifiers
    • Custom ID column support via left_id and right_id
    • Clean column ordering: pair_id → IDs → distance → block → variables

Precomputed and Reusable Distances

Performance optimization for exploring multiple matching strategies:

  • New compute_distances() function precomputes and caches distance matrices:
    • Compute distances once, reuse across multiple matching operations
    • Store complete metadata: variables, distance metric, scaling method, timestamps
    • Preserve original datasets for seamless integration with join_matched()
    • Enable rapid exploration of different matching parameters
    • Performance improvement: ~60% faster when trying multiple matching strategies
  • Distance objects (S3 class distance_object):
    • Self-contained: cost matrix, IDs, metadata, original data
    • Works with both match_couples() and greedy_couples()
    • Pass as first argument instead of datasets: match_couples(dist_obj, max_distance = 5)
    • Informative print and summary methods with distance statistics
  • Constraint modification via update_constraints():
    • Apply new max_distance or calipers without recomputing distances
    • Creates new distance object following copy-on-modify semantics
    • Experiment with different constraints efficiently
  • Backward compatible integration:
    • Modified function signatures: match_couples(left, right = NULL, vars = NULL, ...)
    • Automatically detects distance objects vs. datasets
    • All existing code continues to work unchanged

Parallel Processing

Speed up blocked matching with multi-core processing:

  • New parallel parameter in match_couples() and greedy_couples():
    • Enable with parallel = TRUE for automatic configuration
    • Specify a plan with parallel = "multisession" or another future plan
    • Works with any number of blocks and automatically determines whether parallel execution is beneficial
    • Falls back gracefully if the future packages are not installed
  • Powered by the future package:
    • Cross-platform support (Windows, Unix/Mac, clusters)
    • Respects user-configured parallel backends
    • Automatic worker management
    • Clean restoration of original plan after execution
  • Performance:
    • Best for 10+ blocks with 50+ units per block
    • Speedup scales with number of cores and complexity
    • Minimal overhead for small problems
  • Integration:
    • Works with all blocking methods (exact, fuzzy, clustering)
    • Compatible with distance caching (see Precomputed and Reusable Distances above)
    • Supports all matching parameters (constraints, calipers, scaling)

Fun Error Messages and Cost Checking

Like testthat, couplr makes errors light, memorable, and helpful with couple-themed messages:

  • New check_costs parameter (default: TRUE) in match_couples() and greedy_couples():
    • Automatically checks distance distributions before matching
    • Provides friendly, actionable warnings for common problems
    • Set to FALSE to skip checks in production code
  • Fun couple-themed error messages throughout the package:
    • 💔 “No matches made - can’t couple without candidates!”
    • 🔍 “Your constraints are too strict. Love can’t bloom in a vacuum!”
    • ✨ Helpful suggestions: “Try increasing max_distance or relaxing calipers”
    • 💖 Success messages: “Excellent balance! These couples are well-matched!”
  • Automatic problem detection:
    • Too many zeros: Warns about duplicates or identical values (>10% zero distances)
    • Extreme costs: Detects skewed distributions (99th percentile > 10x the 95th)
    • Many forbidden pairs: Warns when constraints eliminate >50% of valid pairs
    • Constant distances: Alerts when all distances are identical
    • Constant variables: Detects and excludes variables with no variation
  • New diagnostic function diagnose_distance_matrix():
    • Comprehensive analysis of cost distributions
    • Variable-specific problem detection
    • Actionable suggestions for fixes
    • Quality rating (good/fair/poor)
  • Emoji control: Disable with options(couplr.emoji = FALSE) if preferred
  • Philosophy: Errors should be less intimidating, more memorable, and provide clear guidance
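The detection thresholds listed above translate into simple checks on the distribution of pair distances. The Python sketch below is illustrative only and is not diagnose_distance_matrix(): the function names and warning labels are hypothetical, forbidden pairs are assumed to be encoded as infinity, the zero share is assumed to be measured among finite distances, and percentiles use the nearest-rank method.

```python
import math

def percentile(sorted_vals, q):
    """Nearest-rank percentile (q in 0..100) of a pre-sorted list."""
    k = max(0, math.ceil(q / 100 * len(sorted_vals)) - 1)
    return sorted_vals[k]

def cost_warnings(costs):
    """Return warning labels for a flat list of pair distances."""
    finite = sorted(c for c in costs if math.isfinite(c))
    warnings = []
    if len(costs) - len(finite) > 0.5 * len(costs):
        warnings.append("many_forbidden")      # >50% of pairs eliminated
    if not finite:
        return warnings
    if sum(c == 0 for c in finite) > 0.10 * len(finite):
        warnings.append("too_many_zeros")      # >10% zero distances
    if finite[0] == finite[-1]:
        warnings.append("constant")            # all distances identical
    elif percentile(finite, 99) > 10 * percentile(finite, 95):
        warnings.append("extreme")             # 99th pct > 10x the 95th
    return warnings

w_zeros = cost_warnings([0, 0, 1, 2, 3, 4, 5, 6, 7, 8])
w_forbidden = cost_warnings([math.inf, math.inf, math.inf, 1.0, 2.0])
w_constant = cost_warnings([3.0, 3.0, 3.0, 3.0])
```

Each check is independent, so a single cost matrix can trigger several warnings at once.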

New Functions

Documentation & Examples

  • examples/auto_scale_demo.R - 5 preprocessing demonstrations
  • examples/balance_diagnostics_demo.R - 6 balance diagnostic examples
  • examples/join_matched_demo.R - 8 joined dataset demonstrations
  • examples/distance_cache_demo.R - Distance caching and reuse examples
  • examples/parallel_matching_demo.R - 7 parallel processing examples
  • examples/error_messages_demo.R - 10 fun error message demonstrations
  • Complete implementation documentation (claude/IMPLEMENTATION_STEP1.md through STEP6.md)
  • All functions have full Roxygen documentation

Tests

  • Added 34+ new tests (10 for preprocessing, 11 for balance diagnostics, 13 for joined datasets, plus additional tests for distance caching)
  • All tests passing with full backward compatibility

Major Changes (Initial 1.0.0 Release)

Package Renamed: lapr → couplr

The package has been renamed from lapr to couplr to better reflect its purpose as a general pairing and matching toolkit.

couplr = Optimal pairing and matching via linear assignment

Clean 1.0.0 Release

First official stable release with clean, well-organized codebase.

New Organization

R Code

  • Eliminated 3 redundant files
  • Consistent morph_* naming prefix
  • Two-layer API: assignment() (low-level) + lap_solve() (tidy)
  • 10 well-organized files (down from 13)

C++ Code

  • Modular subdirectory structure:
    • src/core/ - Utilities and headers
    • src/interface/ - Rcpp exports
    • src/solvers/ - 14 LAP algorithms
    • src/gabow_tarjan/ - Gabow-Tarjan solver
    • src/morph/ - Image morphing

Features

Solvers

Hungarian, Jonker-Volgenant, Auction (3 variants), SAP/SSP, SSAP-Bucket, Cost-scaling, Cycle-cancel, Gabow-Tarjan, Hopcroft-Karp, Line-metric, Brute-force, Auto-select

High-Level

  ✅ Tidy tibble interface
  ✅ Matrix & data frame inputs
  ✅ Grouped data frames
  ✅ Batch solving + parallelization
  ✅ K-best solutions (Murty, Lawler)
  ✅ Rectangular matrices
  ✅ Forbidden assignments (NA/Inf)
  ✅ Maximize/minimize
  ✅ Pixel morphing visualization

API


Development history under “lapr” available in git log before v1.0.0.