Unions the geometries within each by group into a single feature (the GIS
"Dissolve" tool), optionally summarising attributes. Unlike the streamed
per-batch verbs, dissolve needs every geometry of a group together to union
them, so it rides the partition tier: x is spilled once and routed into
one disjoint shard per group in a single bounded pass, then each shard is read
in and unioned with sf. Peak memory is the routing budget during the
pass, then one group's geometries while it is unioned – partition the input
on a key whose groups fit in memory. With no by, the whole layer dissolves
into one feature.
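As a rough in-memory illustration of what "dissolve" means, the sketch below unions toy squares group by group using sf alone. It is not how spatial_dissolve() is implemented (there is no spilling or sharding here); the helper `sq()`, the `grp` vector, and the toy geometries are assumptions made up for this example.

```r
library(sf)

# Hypothetical helper: a unit square whose lower-left corner is (x0, 0).
sq <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

# Two adjacent squares in group "a", one isolated square in group "b".
geom <- st_sfc(sq(0), sq(1), sq(5))
grp  <- c("a", "a", "b")

# Collect the row indices of each group, then union each group's geometries.
idx    <- split(seq_along(grp), grp)
unions <- lapply(idx, function(i) st_union(geom[i]))

# One output feature per group: the group key plus the dissolved geometry.
dissolved <- st_sf(grp = names(idx), geometry = do.call(c, unname(unions)))
dissolved
```

Every geometry of a group has to be in hand before `st_union()` can run, which is why a per-group partition is needed rather than a per-batch stream.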
Usage
spatial_dissolve(
  x,
  by = NULL,
  ...,
  geom = "geometry",
  crs = NA,
  .fun = NULL,
  flush_rows = NULL
)
Arguments
- x
  A vectra_node (from tbl(), tbl_tiff(), any verb chain, ...). It is consumed by the stream.
- by
  Character vector of attribute columns to dissolve within: one output feature per distinct combination of their values. NULL (default) dissolves the entire layer into a single feature.
- ...
  Further arguments passed to sf::st_union() (e.g. is_coverage = TRUE).
- geom
  Name of the input geometry column holding hex-WKB or WKT strings. Default "geometry". Ignored when coords is given.
- crs
  Coordinate reference system of the input geometry, in any form sf::st_crs() accepts (EPSG integer, WKT, proj string). Defaults to the CRS the upstream node carries, or unknown.
- .fun
  Optional named list of attribute summaries. Each element is a function taking the group's data.frame and returning a length-1 value; the list name becomes the output column (e.g. .fun = list(total = function(d) sum(d$pop))). Default NULL keeps only the by columns and the dissolved geometry.
- flush_rows
  Transformed rows buffered before a spill flush. Larger values mean fewer, bigger temporary files. Defaults to getOption("vectra.spatial_flush", 5e5).
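The .fun contract described above (a named list of functions, each taking one group's data.frame and returning a length-1 value) can be sketched in base R. `summarise_groups()` is a hypothetical helper written for this illustration, not a vectra function, and it says nothing about how the package evaluates .fun internally.

```r
# Hypothetical helper (not part of vectra): apply a .fun-style named list of
# summaries to each group of a data.frame.
summarise_groups <- function(df, by, funs) {
  # One data.frame per distinct combination of the `by` columns.
  pieces <- split(df, df[by], drop = TRUE)
  rows <- lapply(pieces, function(d) {
    out <- d[1, by, drop = FALSE]       # the group's key values
    for (nm in names(funs)) {
      val <- funs[[nm]](d)              # each function sees the whole group
      stopifnot(length(val) == 1L)      # enforce the length-1 contract
      out[[nm]] <- val                  # the list name becomes the column name
    }
    out
  })
  res <- do.call(rbind, rows)
  rownames(res) <- NULL
  res
}

df <- data.frame(band = c(TRUE, FALSE, TRUE), pop = c(10, 20, 30))
res <- summarise_groups(
  df, "band",
  list(total = function(d) sum(d$pop), n = function(d) nrow(d))
)
res
```

A function that returns more than one value per group fails the `stopifnot()` check here; how spatial_dissolve() itself reacts to such a function is not stated in this page.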
Value
A vectra_node of one row per group – the by columns, any .fun
summaries, and the dissolved geometry – backed by temporary .vtr spills
removed when the node is garbage-collected, carrying the input CRS for
collect_sf().
Details
Geometry travels through the engine as hex-encoded WKB in a string column and
the CRS is carried on the returned node; use collect_sf() to materialize.
On projected or unprojected planar data, each group is unioned natively through
the GEOS C API, straight off the hex-WKB column. Geographic coordinates with
spherical geometry enabled (sf::sf_use_s2()), or any extra sf::st_union()
arguments (e.g. is_coverage = TRUE), are unioned through sf instead. The
sf package is an optional dependency (Suggests).
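For readers unfamiliar with the hex-WKB string encoding mentioned above, here is a minimal round trip using sf only. It shows the encoding itself, not vectra's internal decoder; the two toy points are assumptions for the example.

```r
library(sf)

# Two toy points (made up for this illustration).
pts <- st_sfc(st_point(c(1, 2)), st_point(c(3, 4)))

# Encode: one hexadecimal WKB string per geometry, suitable for a string column.
hex <- st_as_binary(pts, hex = TRUE)

# Decode: tag the strings as WKB so sf parses them back into geometries.
back <- st_as_sfc(structure(as.list(hex), class = "WKB"))

st_coordinates(back)
```

Plain WKB strings like these carry no CRS, which is consistent with the CRS being tracked separately on the node and reattached by collect_sf().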
See also
spatial_overlay() to split overlaps apart rather than merge them,
offload() for the partition tier this rides on, collect_sf().
Examples
nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
nc$band <- nc$SID74 > 5 # an attribute to dissolve within
f <- tempfile(fileext = ".vtr")
write_vtr(data.frame(
  band = nc$band, BIR74 = nc$BIR74,
  geometry = sf::st_as_binary(sf::st_geometry(nc), hex = TRUE)
), f)
# Merge the counties into two features by `band`, summing births.
merged <- tbl(f) |>
  spatial_dissolve(by = "band", crs = sf::st_crs(nc),
                   .fun = list(births = function(d) sum(d$BIR74)))
collect_sf(merged)
unlink(f)