Overview
This vignette serves three purposes:
Build intuition: Use pixel morphing as a visual analogy for assignment problems
Scale up: Demonstrate approximation strategies when exact LAP becomes infeasible
Scientific applications: Show how matching applies to ecology, physics, and chemistry
This vignette is different: Unlike the other couplr
documentation, it emphasizes understanding over doing.
If you’re looking to solve a matching problem today, start with
vignette("getting-started") or
vignette("matching-workflows"). Come here when you want to
understand why these algorithms work and when
approximations are appropriate.
Who This Vignette Is For
Audience: Advanced users, researchers, algorithm developers, curious minds
Prerequisites:
Familiarity with
lap_solve()(vignette("getting-started"))Basic complexity analysis (Big-O notation)
Interest in algorithm design or scientific computing
What You’ll Learn:
Why exact LAP becomes infeasible for large n
Three approximation strategies and their trade-offs
How matching problems appear in ecology, physics, and chemistry
Mathematical connections to optimal transport theory
Time to complete: 45-60 minutes (conceptual reading)
Documentation Roadmap
| Vignette | Focus | Difficulty |
|---|---|---|
| Getting Started | Basic LAP solving, API introduction | Beginner |
| Algorithms | Mathematical foundations, solver selection | Intermediate |
| Matching Workflows | Production matching pipelines | Intermediate |
| Pixel Morphing | Scientific applications, approximations | Advanced |
You are here: Pixel Morphing (Advanced)
Why Pixels?
Pixels provide an ideal testbed for understanding assignment problems:
Each pixel is an entity with measurable properties
Color = feature (what it looks like)
Position = spatial location (where it is)
The matching is visually verifiable: you can see if it worked
The same algorithms that morph images smoothly also track particles in physics, align molecules in chemistry, and match vegetation plots in ecology.
The General Matching Problem
Problem Formulation
Given two sets of entities and , find the optimal one-to-one correspondence by minimizing
where the cost combines feature similarity and spatial proximity:
Feature distance : domain-specific similarity
Ecology: Bray-Curtis dissimilarity between species vectors
Physics: difference in particle intensity or size
Chemistry: penalty for mismatched atom types
Images: Euclidean distance in RGB color space
Spatial distance : physical proximity
Ecology: geographic distance between plot centers
Physics: Euclidean distance accounting for predicted motion
Chemistry: 3D distance between atomic coordinates
Images: 2D pixel position distance
Weights balance feature matching vs. spatial coherence.
Computational Challenge
Exact solution: solve the full LAP.
Complexity: using Jonker-Volgenant
Feasible: up to (about images, or plots/particles/atoms)
Prohibitive: for ( images), runtime and memory become expensive
Real applications often involve
High-resolution images: pixels
Large ecological surveys: plots
Particle tracking: particles per frame
Molecular dynamics: atoms
We therefore need approximations that are much faster but still produce high-quality matchings.
Visual Illustration: Pixel Morphing
To make the abstract ideas concrete, we visualize them using image morphing where
entities = pixels
features = RGB color values
spatial position = coordinates
We first show the static input images (all at for display), then the animated morphs produced by different matching strategies.




The first pair are real photographs, the second pair are simple geometric shapes. Internally, all matching is computed on logical grids; we then upscale to purely for clearer display.
Exact Pixel Matching
The exact pixel morph uses a full LAP solution on a cost matrix. For each pair of pixels we compute
where color distances are normalized to (RGB in ) and spatial distances to using the image diagonal.


This yields an optimal one-to-one assignment of pixels. The resulting animations are smooth and artifact-free but require solving the full LAP.
Three Approximation Strategies
We now describe the three approximation strategies in detail. The animations above correspond directly to these methods.
Strategy 1: Feature Quantization
Core idea: reduce problem size by grouping entities with similar features, then match groups.
Mathematical Formulation
- Quantize features
Map continuous feature space to a finite palette
where (for example for ).
- Group by palette
Form groups and similarly for .
- Match groups
Solve a LAP between palette entries with costs
where is the palette color and the centroid position of group .
- Assign entities
Every entity in is assigned according to the group-to-group match.
Complexity Reduction
Original: for an LAP
Quantized: for the LAP plus group assignment
Speedup: approximately
For example, with (a image) and you get
times fewer LAP operations.
Quality Trade-offs
Advantages
Very large speedups for big
Preserves global structure (similar features stay together)
Produces smooth, band-like motion without large jumps
Disadvantages
Loses detail within each palette group
Quantization artifacts when is too small
May miss optimal local pairings between similar but distinct feature values
The corresponding GIFs are the color walk morphs shown earlier.
Strategy 2: Hierarchical Decomposition
Core idea: split the domain into smaller subproblems by spatial partitioning, solve subproblems, and combine.
Mathematical Formulation
- Spatial partitioning
Divide the domain into patches (for example so you get patches). Denote the subset of entities of in patch by
- Patch-level matching
Form patch representatives: centroid position and mean features per patch. Solve an LAP between patches, with costs defined using the same feature and spatial distances but now at patch level.
- Recursive refinement
Within each matched patch pair :
If (a threshold, e.g. ) solve the subproblem exactly.
Otherwise, partition that patch pair again and repeat.
- Combine solutions
Concatenate assignments from all leaf subproblems to obtain the global matching.
Complexity (Sketch)
With levels of decomposition (each level splitting into four patches), the work can be made close to in practice, compared to for a single full LAP. Intuitively, the LAPs near the leaves are very small, and the costly large LAP is replaced by a series of much smaller ones.
Quality Trade-offs
Advantages
Scales to very large (tens of thousands of entities)
Preserves local structure: nearby entities tend to be matched within the same spatial patch
No feature discretization, so feature precision is retained
Disadvantages
May miss globally optimal cross-patch matches
Quality depends on partitioning scheme and threshold
Possible boundary artifacts if important structure crosses patch boundaries
High-Level Algorithm
// Pseudocode for hierarchical LAP matching
FUNCTION match_hierarchical(region_A, region_B, threshold, level):
// Base case: region small enough for exact LAP
IF size(region_A) <= threshold THEN
cost ← compute_cost_matrix(region_A, region_B, α, β)
RETURN lap_solve(cost)
END IF
// Divide into 2×2 spatial grid (4 patches)
patches_A ← spatial_partition(region_A, grid = 2×2)
patches_B ← spatial_partition(region_B, grid = 2×2)
// Compute patch representatives
FOR each patch p DO
centroid[p] ← mean(positions in p)
mean_feature[p] ← mean(features in p)
END FOR
// Match patches using 4×4 LAP
patch_cost ← matrix(4, 4)
FOR i = 1 TO 4 DO
FOR j = 1 TO 4 DO
patch_cost[i, j] ← α·distance(mean_feature_A[i], mean_feature_B[j]) +
β·distance(centroid_A[i], centroid_B[j])
END FOR
END FOR
patch_assignment ← lap_solve(patch_cost)
// Recursively solve within matched patches
assignments ← []
FOR i = 1 TO 4 DO
j ← patch_assignment[i]
sub_assignment ← match_hierarchical(
patches_A[i],
patches_B[j],
threshold,
level + 1
)
assignments ← append(assignments, sub_assignment)
END FOR
RETURN concatenate(assignments)
END FUNCTION
The couplr implementation adds pragmatic details such as
normalization of color and spatial distances, conversion between
coordinates and raster indexing, and handling remainder patches when the
grid does not divide evenly.
Strategy 3: Resolution Reduction
Core idea: solve the LAP on a coarse grid, then lift/upscale the assignment to the full-resolution grid.
Mathematical Formulation
- Downscale
Reduce spatial resolution by a factor (for example ):
Now and each have entities.
- Solve at low resolution
Compute an exact LAP solution on the problem:
- Upscale assignment
Map the low-resolution assignment back to full resolution:
where each full-resolution entity inherits the assignment of its coarse cell.
Quality Trade-offs
Advantages
Very simple to implement
Exact LAP at the coarse level
Large speedups for moderate
Disadvantages
Loss of fine detail and blocky artifacts
Assignment is no longer a true permutation at pixel level (multiple fine pixels can map to the same coarse target)
Quality deteriorates quickly for larger
In practice, resolution reduction is most useful as a crude initialization step for very large problems ().
Strategy Comparison
| Approach | Speedup (vs. exact) | Quality | Best for |
|---|---|---|---|
| Exact LAP | Optimal | ||
| Feature quantization | Good global structure | Distinct feature groups | |
| Hierarchical | Good local structure | Large , strong spatial structure | |
| Resolution reduction | Moderate | Very large , rough initialization |
Practical rules of thumb
: use the exact LAP.
: feature quantization or a shallow hierarchy.
: hierarchical decomposition with 2-3 levels.
: combine resolution reduction with a hierarchical method.
Implementation Details of Exact Pixel Matching
We now spell out the exact LAP-based morph more concretely.
We again use the cost
The algorithm:
// Pseudocode for exact pixel matching
// Step 1: Compute full cost matrix (normalized)
n_pixels ← height × width
cost ← matrix(0, n_pixels, n_pixels)
FOR i = 1 TO n_pixels DO
FOR j = 1 TO n_pixels DO
// RGB color distance (normalized to [0, sqrt(3)])
color_dist ← sqrt((R_A[i] - R_B[j])^2 +
(G_A[i] - G_B[j])^2 +
(B_A[i] - B_B[j])^2) / (255 · sqrt(3))
// Spatial distance (normalized to [0, 1] by diagonal)
spatial_dist ← sqrt((x_A[i] - x_B[j])^2 +
(y_A[i] - y_B[j])^2) / diagonal_length
// Combined cost
cost[i, j] ← α · color_dist + β · spatial_dist
END FOR
END FOR
// Step 2: Solve with Jonker-Volgenant
assignment ← lap_solve(cost, method = "jv")
// Step 3: Generate morph frames by linear interpolation
FOR frame_idx = 1 TO n_frames DO
t ← frame_idx / n_frames // Time parameter in [0, 1]
FOR pixel_i = 1 TO n_pixels DO
j ← assignment[pixel_i] // Matched target pixel
// Interpolate position
x_new[pixel_i] ← (1 - t) · x_A[pixel_i] + t · x_B[j]
y_new[pixel_i] ← (1 - t) · y_A[pixel_i] + t · y_B[j]
// Keep source color (transport-only, no blending)
RGB_new[pixel_i] ← RGB_A[pixel_i]
END FOR
frames[frame_idx] ← render(x_new, y_new, RGB_new)
END FOR
The couplr implementation handles indexing, raster
layout, and shows or saves the resulting GIFs.
Approximate performance: up to about (10 000 pixels) on typical hardware is fine with the exact LAP.
Application to Scientific Domains
We now return from pixel morphs to the scientific settings that motivated them.
Ecology: Vegetation Plot Matching
Problem: match vegetation plots surveyed at time to plots at time to track community dynamics.
Feature distance: Bray-Curtis dissimilarity between species abundance vectors
where are abundances of species in plots and .
Spatial distance: geographic distance (e.g. in kilometers) between plot centers.
Exact solution for small studies ():
// Pseudocode for ecological plot matching
FOR i = 1 TO n_plots_t DO
FOR j = 1 TO n_plots_tplus DO
// Bray-Curtis dissimilarity for species composition
numerator ← sum over species s of |abundance_t[i, s] - abundance_tplus[j, s]|
denominator ← sum over species s of (abundance_t[i, s] + abundance_tplus[j, s])
bc_distance ← numerator / denominator
// Geographic distance (kilometers)
geo_distance ← sqrt((x_t[i] - x_tplus[j])^2 +
(y_t[i] - y_tplus[j])^2)
// Combined cost (α = 0.7 emphasizes species composition)
cost[i, j] ← 0.7 · bc_distance + 0.3 · (geo_distance / max_distance)
END FOR
END FOR
plot_correspondence ← lap_solve(cost)
For large studies () a hierarchical approach by region is more practical:
// Hierarchical decomposition by geographic region
// 1. Divide landscape into spatial grid (e.g. 10 km × 10 km cells)
regions_t ← spatial_partition(plots_t, grid_size = 10 km)
regions_tplus ← spatial_partition(plots_tplus, grid_size = 10 km)
// 2. Compute region representatives
FOR each region r DO
mean_composition[r] ← average species vector across plots in r
centroid[r] ← geographic center of r
END FOR
// 3. Match regions (small LAP: ~100 regions)
region_cost ← compute_cost(mean_composition, centroids, α = 0.7, β = 0.3)
region_assignment ← lap_solve(region_cost)
// 4. Within matched regions, solve plot-level LAP
full_assignment ← []
FOR r = 1 TO n_regions DO
r_matched ← region_assignment[r]
plots_A ← plots in regions_t[r]
plots_B ← plots in regions_tplus[r_matched]
// Local LAP (smaller problem, e.g. 50 × 50)
cost_local ← compute_plot_cost(plots_A, plots_B, α = 0.7, β = 0.3)
local_assignment ← lap_solve(cost_local)
full_assignment ← append(full_assignment, local_assignment)
END FOR
RETURN full_assignment
This allows tracking individual plot trajectories across time, distinguishing stable communities, successional trends, and invasion fronts.
Physics: Particle Tracking
Problem: track particles between frame and in experimental video.
Feature distance: differences in intensity, size, or shape.
Spatial distance: displacement relative to predicted motion:
where is the estimated velocity from previous frames.
We also impose a maximum displacement beyond which matches are physically implausible.
Exact solution (moderate ):
// Pseudocode for particle tracking with velocity prediction
// Initialize cost matrix as forbidden everywhere
cost ← matrix(Inf, n_particles_t, n_particles_tplus)
FOR i = 1 TO n_particles_t DO
// Predict position using previous velocity
x_predicted ← x_t[i] + v_x_t[i] · Δt
y_predicted ← y_t[i] + v_y_t[i] · Δt
FOR j = 1 TO n_particles_tplus DO
// Distance from predicted position
dx ← x_predicted - x_tplus[j]
dy ← y_predicted - y_tplus[j]
spatial_distance ← sqrt(dx^2 + dy^2)
// Only consider physically plausible matches
IF spatial_distance <= max_displacement THEN
// Feature similarity (intensity, size, etc.)
feature_distance ← |intensity_t[i] - intensity_tplus[j]|
// Combined cost
cost[i, j] ← α · feature_distance + β · spatial_distance
END IF
END FOR
END FOR
// Solve assignment (Inf entries are forbidden)
particle_tracks ← lap_solve(cost)
// Update velocities from assignments
FOR i = 1 TO n_particles_t DO
j ← particle_tracks[i]
velocity_new[i] ← (position_tplus[j] - position_t[i]) / Δt
END FOR
For dense tracking (), we can first cluster particles:
// Two-stage: clustering then local matching
// Stage 1: spatial clustering
clusters_t ← spatial_cluster(particles_t, radius = 2 · pixel_size)
clusters_tplus ← spatial_cluster(particles_tplus, radius = 2 · pixel_size)
// Compute cluster representatives
FOR each cluster c DO
centroid[c] ← mean position of particles in c
mean_intensity[c] ← mean intensity
mean_velocity[c] ← mean velocity (if available)
END FOR
// Match clusters
cluster_cost ← compute_cluster_similarity(clusters_t, clusters_tplus)
cluster_tracks ← lap_solve(cluster_cost)
// Stage 2: within matched clusters, track individual particles
full_tracks ← []
FOR c = 1 TO n_clusters DO
c_matched ← cluster_tracks[c]
particles_A ← particles in clusters_t[c]
particles_B ← particles in clusters_tplus[c_matched]
cost_local ← compute_particle_distance(
particles_A, particles_B,
max_displacement = 5,
α = 0.3,
β = 0.7
)
local_tracks ← lap_solve(cost_local)
full_tracks ← append(full_tracks, local_tracks)
END FOR
RETURN full_tracks
This yields efficient and robust trajectories even for very dense particle fields.
Chemistry: Molecular Conformation Alignment
Problem: align two conformations of the same molecule (e.g. a protein) with atoms to compute RMSD and analyze structural change.
Feature distance: strict element matching
Spatial distance: 3D Euclidean distance between atomic coordinates.
Exact LAP for small molecules:
// Pseudocode for molecular conformation alignment
n_atoms ← number of atoms in molecule
cost ← matrix(0, n_atoms, n_atoms)
FOR i = 1 TO n_atoms DO
FOR j = 1 TO n_atoms DO
// Enforce strict element type matching
IF element_type_A[i] ≠ element_type_B[j] THEN
cost[i, j] ← Inf
ELSE
dx ← x_A[i] - x_B[j]
dy ← y_A[i] - y_B[j]
dz ← z_A[i] - z_B[j]
cost[i, j] ← sqrt(dx^2 + dy^2 + dz^2)
END IF
END FOR
END FOR
// Solve alignment
alignment ← lap_solve(cost)
// Compute RMSD
sum_sq_dist ← 0
FOR i = 1 TO n_atoms DO
j ← alignment[i]
sum_sq_dist ← sum_sq_dist + cost[i, j]^2
END FOR
rmsd ← sqrt(sum_sq_dist / n_atoms)
For large biomolecules, we again use a hierarchical strategy, this time by secondary structure elements (helices, sheets, loops, etc.), aligning segments first and then atoms within matched segments.
Implementation Notes
Customizing Morph Duration
The morphing examples use default settings, but you can customize the number of frames and speed:
# From inst/scripts/generate_examples.R
generate_morph <- function(assignment, pixels_A, pixels_B,
n_frames = 30, # number of frames
frame_delay = 0.1) # delay between frames (seconds)
{
frames <- lapply(seq(0, 1, length.out = n_frames), function(t) {
interpolate_frame(t, assignment, pixels_A, pixels_B)
})
save_gif(frames, delay = frame_delay)
}Total animation duration is n_frames * frame_delay
seconds.
Using the Example Code
The morphing implementation is provided in
inst/scripts/generate_examples.R:
# View the source
example_script <- system.file("scripts", "generate_examples.R", package = "couplr")
file.show(example_script)
# Or source to use its helpers
source(example_script)
# Apply to your own data
my_cost <- build_cost_matrix(my_data_A, my_data_B)
my_assignment <- lap_solve(my_cost)Regenerating Examples
To regenerate all demo GIFs and PNGs:
source("inst/scripts/generate_examples.R")This will write the assets under inst/extdata.
Mathematical Foundation: Optimal Transport
The matching problems discussed here are discrete instances of optimal transport.
Kantorovich Relaxation
Kantorovich (1942) relaxed this to a transport plan on :
subject to marginal constraints on .
Further Reading
Optimal Transport Theory
Peyré, G., & Cuturi, M. (2019). Computational Optimal Transport. Foundations and Trends in Machine Learning.
Villani, C. (2008). Optimal Transport: Old and New. Springer.
Scientific Applications
Ecology
Anderson, M. J. et al. (2011). Navigating the multiple meanings of beta diversity. Ecology Letters.
Legendre, P., & Legendre, L. (2012). Numerical Ecology. Elsevier.
Physics
Adrian, R. J., & Westerweel, J. (2011). Particle Image Velocimetry. Cambridge University Press.
Crocker, J. C., & Grier, D. G. (1996). Methods of digital video microscopy. Journal of Colloid and Interface Science.
Chemistry
Kabsch, W. (1976). A solution for the best rotation to relate two sets of vectors. Acta Crystallographica.
Coutsias, E. A. et al. (2004). Using quaternions to calculate RMSD. Journal of Computational Chemistry.
Assignment Algorithms
Burkard, R., Dell’Amico, M., & Martello, S. (2009). Assignment Problems. SIAM.
For implementation details in this package see
vignette("algorithms").
Connection to couplr Workflows
The approximation strategies in this vignette become relevant when working with couplr’s practical matching functions.
When Do These Strategies Apply?
| Strategy | couplr Implementation | When to Use |
|---|---|---|
| Exact LAP | match_couples(method = "jv") |
n < 3,000 |
| Feature quantization | Implicit via scale = "robust"
|
Reduces effective feature space |
| Hierarchical | matchmaker(block_type = "cluster") |
n > 3,000, use blocking |
| Resolution reduction | Custom code | Very large n |
Practical Recommendations
For n < 3,000: Use match_couples()
with exact algorithms:
result <- match_couples(left, right, vars = c("x", "y", "z"), auto_scale = TRUE)For 3,000 < n < 10,000: Use blocking to create smaller subproblems:
blocks <- matchmaker(left, right, block_type = "cluster", n_blocks = 10)
result <- match_couples(blocks$left, blocks$right, vars = vars, block_id = "block_id")For n > 10,000: Use greedy matching:
result <- greedy_couples(left, right, vars = vars, strategy = "sorted")For n > 50,000: Combine strategies (blocking + greedy within blocks), or implement custom approximations using the techniques in this vignette.
Limitations of Approximation Strategies
Each approximation trades accuracy for speed. Know the failure modes:
| Strategy | Works Well When | Fails When |
|---|---|---|
| Feature quantization | Features cluster naturally | Continuous features, fine distinctions matter |
| Hierarchical | Spatial locality is meaningful | Optimal matches cross boundaries |
| Resolution reduction | Coarse structure suffices | Fine detail matters |
Summary
This vignette explored optimal matching through the lens of pixel morphing and scientific applications.
Key Ideas:
Assignment = matching: LAP finds optimal correspondences between two sets
Scalability matters: becomes prohibitive for
Three approximations: Feature quantization, hierarchical decomposition, resolution reduction
Same math, different domains: Pixels, particles, plots, and atoms all use the same algorithms
The same algorithms that morph images smoothly also track particles
in physics, align molecules in chemistry, and match vegetation plots in
ecology. Together, the methods in couplr let you move
between exact optimal matchings and principled approximations, depending
on problem size and accuracy requirements.
See Also
vignette("getting-started")- Basic LAP solvingvignette("algorithms")- Mathematical foundationsvignette("matching-workflows")- Production matching pipelines



