Divides units into K strata based on quantiles of the propensity score, then computes within-stratum weights for treatment effect estimation. This is a simple, transparent approach to propensity score adjustment that allows visual inspection of balance within each subclass.
Usage
subclass_match(
formula = NULL,
data = NULL,
treatment = NULL,
n_subclasses = 5L,
ps = NULL,
ps_model = NULL,
estimand = "ATT"
)
Arguments
- formula
Formula for propensity score model (e.g., treatment ~ age + income). Ignored if ps is provided.
- data
Data frame containing all variables
- treatment
Character, name of the binary treatment column (0/1)
- n_subclasses
Integer, number of subclasses to create (default: 5). Cochran (1968) showed that 5 subclasses remove over 90% of the bias due to a single covariate.
- ps
Optional pre-computed numeric vector of propensity scores (one per row in data). If NULL, a logistic regression model is fit using formula.
- ps_model
Optional pre-fitted glm object for propensity scores
- estimand
Target estimand: "ATT" (default), "ATE", or "ATC"
Value
An S3 object of class c("subclass_result", "couplr_result")
containing:
- matched
Tibble with columns id, side, subclass, ps, weight
- subclass_summary
Tibble with per-subclass statistics: counts, mean PS, and overlap status
- info
List with n_subclasses, estimand, n_left, n_right, method, vars
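A minimal usage sketch, using only the arguments documented above. The simulated data frame and its column names (age, income, treated) are hypothetical. The package name couplr is an assumption inferred from the "couplr_result" class, so the call is guarded and is skipped if that package is not installed.

```r
# Simulated data; all column names here are hypothetical.
set.seed(42)
n <- 300
df <- data.frame(age = rnorm(n, 45, 12), income = rnorm(n, 60, 20))
df$treated <- rbinom(n, 1, plogis(-1 + 0.02 * df$age))

# Pre-computed propensity scores, one per row of df, for the `ps` argument.
ps_fit <- glm(treated ~ age + income, data = df, family = binomial())
ps_vec <- fitted(ps_fit)

# Assumption: subclass_match() is exported by a package named "couplr".
if (requireNamespace("couplr", quietly = TRUE)) {
  # Let the function fit the propensity score model from the formula.
  res <- couplr::subclass_match(
    formula = treated ~ age + income,
    data = df,
    treatment = "treated",
    n_subclasses = 5L,
    estimand = "ATT"
  )
  # Components named in the Value section.
  print(head(res$matched))
  print(res$subclass_summary)

  # Alternatively, supply the pre-computed scores; formula is then ignored.
  res_ps <- couplr::subclass_match(
    data = df,
    treatment = "treated",
    ps = ps_vec,
    estimand = "ATE"
  )
}
```

Supplying ps may be convenient when the scores come from a model other than logistic regression.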
Details
The algorithm:
Estimate propensity scores via logistic regression (or use pre-computed scores)
Divide the propensity score distribution into K quantile-based strata
For each stratum, check overlap (both treated and control units present)
Compute within-stratum weights based on the target estimand:
- ATT: Treated units get weight 1; control units get weight n_treated_in_stratum / n_control_in_stratum
- ATE: Both groups get weight proportional to stratum size relative to total sample
- ATC: Control units get weight 1; treated units get weight n_control_in_stratum / n_treated_in_stratum
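The steps above can be sketched in base R for the ATT case. This is an illustration under stated assumptions, not the function's internal code: the data are simulated, the column names are hypothetical, and giving weight 0 to strata without overlap is an assumption about how such strata might be handled.

```r
# Simulated data; column names are hypothetical.
set.seed(1)
n <- 200
df <- data.frame(age = rnorm(n, 40, 10), income = rnorm(n, 50, 15))
df$treated <- rbinom(n, 1, plogis(-2 + 0.05 * df$age))

# Step 1: estimate propensity scores via logistic regression.
fit <- glm(treated ~ age + income, data = df, family = binomial())
df$ps <- fitted(fit)

# Step 2: divide the propensity score distribution into K quantile-based strata.
K <- 5L
breaks <- quantile(df$ps, probs = seq(0, 1, length.out = K + 1))
df$subclass <- cut(df$ps, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Step 3: check overlap (both treated and control units present in a stratum).
n_t <- tapply(df$treated == 1, df$subclass, sum)
n_c <- tapply(df$treated == 0, df$subclass, sum)
has_overlap <- n_t > 0 & n_c > 0

# Step 4: ATT weights. Treated units get weight 1; control units get
# n_treated_in_stratum / n_control_in_stratum.
idx <- as.character(df$subclass)
df$weight <- ifelse(df$treated == 1, 1, n_t[idx] / n_c[idx])

# Assumption: units in strata without overlap receive weight 0.
df$weight[!has_overlap[idx]] <- 0
```

With these weights, the control weights in each overlapping stratum sum to the number of treated units in that stratum, so the weighted control group mirrors the treated group's distribution across strata.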