This vignette covers practical workflows for common spatial analysis tasks: sharing grids across datasets, generating grid polygons, choosing resolution, and exporting to GIS formats.
One Grid, Many Datasets
The most powerful pattern in hexify is defining a grid once and reusing it across multiple datasets. This ensures spatial consistency and eliminates parameter repetition.
The Problem
You often have:
- Several independent datasets (observations, sensors, surveys)
- All in longitude/latitude coordinates
- Collected at different times or from different sources
You want to:
- Put everything on one common global grid
- Be sure the grids actually match
- Combine results later without subtle errors
The Solution: Shared Grid Objects
# Step 1: Define the grid once
# This is your shared spatial reference system
grid <- hex_grid(area_km2 = 5000)
print(grid)
#> HexGridInfo Specification
#> -------------------------
#> Aperture: 3
#> Resolution: 8
#> Area: 7773.97 km^2
#> Diagonal: 94.74 km
#> CRS: EPSG:4326
#> Total Cells: 65612
# Step 2: Create multiple datasets with different structures
set.seed(123)
# Dataset 1: Bird observations (note different column names)
bird_obs <- data.frame(
species = sample(c("Passer domesticus", "Turdus merula", "Parus major"), 200, replace = TRUE),
longitude = runif(200, -10, 30),
latitude = runif(200, 35, 60)
)
# Dataset 2: Mammal records
mammal_obs <- data.frame(
species = sample(c("Vulpes vulpes", "Meles meles", "Sciurus vulgaris"), 150, replace = TRUE),
lon = runif(150, -10, 30),
lat = runif(150, 35, 60)
)
# Dataset 3: Climate stations
climate_data <- data.frame(
station_id = paste0("WS", 1:50),
x = runif(50, -10, 30),
y = runif(50, 35, 60),
temp_c = rnorm(50, 12, 5)
)
# Step 3: Attach all datasets to the SAME grid
birds <- hexify(bird_obs, lon = "longitude", lat = "latitude", grid = grid)
mammals <- hexify(mammal_obs, lon = "lon", lat = "lat", grid = grid)
climate <- hexify(climate_data, lon = "x", lat = "y", grid = grid)
cat("Birds: ", nrow(as.data.frame(birds)), "observations in", n_cells(birds), "cells\n")
#> Birds: 200 observations in 182 cells
cat("Mammals:", nrow(as.data.frame(mammals)), "observations in", n_cells(mammals), "cells\n")
#> Mammals: 150 observations in 140 cells
cat("Climate:", nrow(as.data.frame(climate)), "stations in", n_cells(climate), "cells\n")
#> Climate: 50 stations in 49 cellsCombining at the Cell Level
Once data are hexified, longitude/latitude no longer matter for
analysis. The cell_id becomes the shared spatial key:
# Extract data frames with cell IDs
birds_df <- as.data.frame(birds)
birds_df$cell_id <- birds@cell_id
mammals_df <- as.data.frame(mammals)
mammals_df$cell_id <- mammals@cell_id
climate_df <- as.data.frame(climate)
climate_df$cell_id <- climate@cell_id
# Aggregate each dataset by cell
bird_richness <- aggregate(
species ~ cell_id,
data = birds_df,
FUN = function(x) length(unique(x))
)
names(bird_richness)[2] <- "bird_species"
mammal_richness <- aggregate(
species ~ cell_id,
data = mammals_df,
FUN = function(x) length(unique(x))
)
names(mammal_richness)[2] <- "mammal_species"
mean_temp <- aggregate(
temp_c ~ cell_id,
data = climate_df,
FUN = mean
)
names(mean_temp)[2] <- "mean_temp"
# Join datasets by cell_id - guaranteed to align because same grid
combined <- merge(bird_richness, mammal_richness, by = "cell_id", all = TRUE)
combined <- merge(combined, mean_temp, by = "cell_id", all = TRUE)
head(combined)
#> cell_id bird_species mammal_species mean_temp
#> 1 82 1 NA NA
#> 2 162 1 NA NA
#> 3 163 1 NA NA
#> 4 325 NA 1 NA
#> 5 487 1 NA NA
#> 6 6637 1 NA NAGrid Generation
hexify provides functions to generate grid polygons over regions for visualization and analysis.
Grid Over a Rectangular Region
# Create grid specification
grid <- hex_grid(area_km2 = 5000)
# Generate hexagons over Western Europe
europe_hexes <- grid_rect(c(-10, 35, 25, 60), grid)
# Get European countries for context
europe <- hexify_world[hexify_world$continent == "Europe", ]
ggplot() +
geom_sf(data = europe, fill = "gray95", color = "gray60") +
geom_sf(data = europe_hexes, fill = NA, color = "steelblue", linewidth = 0.4) +
coord_sf(xlim = c(-10, 25), ylim = c(35, 60)) +
labs(title = sprintf("Hexagonal Grid (~%.0f km² cells)", grid@area_km2)) +
theme_minimal()Grid Over a Polygon (Shapefile)
Clip a grid to any sf polygon boundary:
# Get France boundary
france <- hexify_world[hexify_world$name == "France", ]
# Generate grid covering mainland France
grid <- hex_grid(area_km2 = 2000)
france_grid <- grid_rect(c(-5, 41, 10, 52), grid)
# Clip grid to France boundary
france_grid_clipped <- st_intersection(france_grid, st_geometry(france))
#> Warning: attribute variables are assumed to be spatially constant throughout
#> all geometries
ggplot() +
geom_sf(data = france, fill = "gray95", color = "gray40", linewidth = 0.5) +
geom_sf(data = france_grid_clipped, fill = alpha("steelblue", 0.3),
color = "steelblue", linewidth = 0.3) +
coord_sf(xlim = c(-5, 10), ylim = c(41, 52)) +
labs(title = sprintf("Hexagonal Grid Clipped to France (~%.0f km² cells)", grid@area_km2)) +
theme_minimal()Global Grid
# Coarse global grid (be careful with fine grids - many cells!)
grid <- hex_grid(area_km2 = 500000)
global_hexes <- grid_global(grid)
ggplot() +
geom_sf(data = hexify_world, fill = "gray90", color = "gray70", linewidth = 0.2) +
geom_sf(data = global_hexes, fill = NA, color = "darkgreen", linewidth = 0.3) +
labs(title = sprintf("Global Hexagonal Grid (~%.0f km² cells)", grid@area_km2)) +
theme_minimal() +
theme(axis.text = element_blank(), axis.ticks = element_blank())Multi-Resolution Analysis
Analyze data at multiple spatial scales using different target areas.
# Sample data
set.seed(42)
observations <- data.frame(
species = sample(c("Species A", "Species B", "Species C"), 1000, replace = TRUE),
lon = runif(1000, -10, 30),
lat = runif(1000, 35, 60)
)
# Fine resolution (~100 km² cells)
grid_fine <- hex_grid(area_km2 = 100)
obs_fine <- hexify(observations, lon = "lon", lat = "lat", grid = grid_fine)
# Coarse resolution (~10000 km² cells)
grid_coarse <- hex_grid(area_km2 = 10000)
obs_coarse <- hexify(observations, lon = "lon", lat = "lat", grid = grid_coarse)
cat(sprintf("Fine resolution: %d unique cells (area: %.1f km²)\n",
n_cells(obs_fine), grid_fine@area_km2))
#> Fine resolution: 996 unique cells (area: 96.0 km²)
cat(sprintf("Coarse resolution: %d unique cells (area: %.1f km²)\n",
n_cells(obs_coarse), grid_coarse@area_km2))
#> Coarse resolution: 649 unique cells (area: 7774.0 km²)Scale-Dependent Patterns
# Extract data with cell IDs
fine_df <- as.data.frame(obs_fine)
fine_df$cell_id <- obs_fine@cell_id
coarse_df <- as.data.frame(obs_coarse)
coarse_df$cell_id <- obs_coarse@cell_id
# Species richness at each scale
richness_fine <- aggregate(species ~ cell_id, data = fine_df,
FUN = function(x) length(unique(x)))
richness_coarse <- aggregate(species ~ cell_id, data = coarse_df,
FUN = function(x) length(unique(x)))
cat(sprintf("Fine scale: mean %.2f species per cell\n", mean(richness_fine$species)))
#> Fine scale: mean 1.00 species per cell
cat(sprintf("Coarse scale: mean %.2f species per cell\n", mean(richness_coarse$species)))
#> Coarse scale: mean 1.33 species per cellSpatial Joins
Join datasets based on shared grid cells rather than proximity.
# Dataset 1: Weather stations
stations <- data.frame(
station_id = paste0("ST", 1:50),
lon = runif(50, -10, 30),
lat = runif(50, 35, 60),
temperature = rnorm(50, 15, 5)
)
# Dataset 2: Cities
cities <- data.frame(
city = c("Vienna", "Paris", "London", "Berlin", "Rome",
"Madrid", "Prague", "Warsaw", "Budapest", "Amsterdam"),
lon = c(16.37, 2.35, -0.12, 13.40, 12.50,
-3.70, 14.42, 21.01, 19.04, 4.90),
lat = c(48.21, 48.86, 51.51, 52.52, 41.90,
40.42, 50.08, 52.23, 47.50, 52.37)
)
# Use a coarse grid for joining disparate points
grid <- hex_grid(area_km2 = 50000)
# Hexify both datasets with the same grid
stations_hex <- hexify(stations, lon = "lon", lat = "lat", grid = grid)
cities_hex <- hexify(cities, lon = "lon", lat = "lat", grid = grid)
# Extract with cell IDs
stations_df <- as.data.frame(stations_hex)
stations_df$cell_id <- stations_hex@cell_id
cities_df <- as.data.frame(cities_hex)
cities_df$cell_id <- cities_hex@cell_id
# Join by cell_id
city_weather <- merge(
cities_df[, c("city", "cell_id")],
aggregate(temperature ~ cell_id, data = stations_df, FUN = mean),
by = "cell_id",
all.x = TRUE
)
city_weather
#> cell_id city temperature
#> 1 865 London NA
#> 2 1478 Madrid NA
#> 3 1482 Paris NA
#> 4 1484 Amsterdam NA
#> 5 1540 Berlin NA
#> 6 1567 Prague 21.64748
#> 7 1591 Rome NA
#> 8 1594 Vienna NA
#> 9 1621 Budapest 19.43432
#> 10 2240 Warsaw NAChoosing Resolution
By Target Area
Use hex_grid() with area_km2 to get the
closest available resolution:
# Target: 100 km² cells
grid_100 <- hex_grid(area_km2 = 100, aperture = 3)
cat(sprintf("Target ~100 km²: resolution %d (actual: %.1f km²)\n",
grid_100@resolution, grid_100@area_km2))
#> Target ~100 km²: resolution 12 (actual: 96.0 km²)
# Target: 1000 km² cells
grid_1000 <- hex_grid(area_km2 = 1000, aperture = 3)
cat(sprintf("Target ~1000 km²: resolution %d (actual: %.1f km²)\n",
grid_1000@resolution, grid_1000@area_km2))
#> Target ~1000 km²: resolution 10 (actual: 863.8 km²)
# Target: 10000 km² cells
grid_10000 <- hex_grid(area_km2 = 10000, aperture = 3)
cat(sprintf("Target ~10000 km²: resolution %d (actual: %.1f km²)\n",
grid_10000@resolution, grid_10000@area_km2))
#> Target ~10000 km²: resolution 8 (actual: 7774.0 km²)Resolution Table (Aperture 3)
| Resolution | # Cells | Cell Area (km²) | Spacing (km) |
|---|---|---|---|
| 0 | 12 | 42505468.5 | 7005.8 |
| 1 | 32 | 15939550.7 | 4290.2 |
| 2 | 92 | 5544191.5 | 2530.2 |
| 3 | 272 | 1875241.3 | 1471.5 |
| 4 | 812 | 628159.6 | 851.7 |
| 5 | 2.4K | 209730.9 | 492.1 |
| 6 | 7.3K | 69948.7 | 284.2 |
| 7 | 21.9K | 23320.5 | 164.1 |
| 8 | 65.6K | 7774.0 | 94.7 |
| 9 | 196.8K | 2591.4 | 54.7 |
| 10 | 590.5K | 863.8 | 31.6 |
| 11 | 1.8M | 287.9 | 18.2 |
| 12 | 5.3M | 96.0 | 10.5 |
| 13 | 15.9M | 32.0 | 6.1 |
| 14 | 47.8M | 10.7 | 3.5 |
| 15 | 143.5M | 3.6 | 2.0 |
Comparing Apertures
Different apertures offer different resolution steps:
target_area <- 1000 # km²
cat(sprintf("Target: ~%d km² cells\n\n", target_area))
#> Target: ~1000 km² cells
for (ap in c(3, 4, 7)) {
grid <- hex_grid(area_km2 = target_area, aperture = ap)
n_cells <- 10 * (ap^grid@resolution) + 2
cat(sprintf("Aperture %d: resolution %d -> %.1f km² (%s cells globally)\n",
ap, grid@resolution, grid@area_km2,
format(n_cells, big.mark = ",")))
}
#> Aperture 3: resolution 10 -> 863.8 km² (590,492 cells globally)
#> Aperture 4: resolution 8 -> 778.3 km² (655,362 cells globally)
#> Aperture 7: resolution 6 -> 433.5 km² (1,176,492 cells globally)| Aperture | Best For | Trade-offs |
|---|---|---|
| 3 | Fine resolution control, dggridR compatibility | Slowest cell growth |
| 4 | Power-of-2 scaling, GIS workflows | Moderate resolution steps |
| 7 | Rapid cell count growth, coarse analysis | Largest resolution jumps |
| 4/3 | Balance of 4’s fast start + 3’s fine control | More complex indexing |
Working with sf
Convert HexData to sf
# Hexify some data
grid <- hex_grid(area_km2 = 20000)
result <- hexify(cities, lon = "lon", lat = "lat", grid = grid)
# Convert to sf points (uses cell centers)
sf_points <- as_sf(result, geometry = "point")
class(sf_points)
#> [1] "sf" "data.frame"
# Convert to sf polygons (for choropleth maps)
sf_polys <- as_sf(result, geometry = "polygon")
class(sf_polys)
#> [1] "sf" "data.frame"
# Or generate polygons directly from cell IDs
unique_cells <- cells(result)
cell_polys <- cell_to_sf(unique_cells, grid)Visualize with ggplot2
europe <- hexify_world[hexify_world$continent == "Europe", ]
ggplot() +
geom_sf(data = europe, fill = "ivory", color = "gray70") +
geom_sf(data = cell_polys, fill = "steelblue", alpha = 0.5, color = "darkblue") +
coord_sf(xlim = c(-10, 25), ylim = c(35, 58)) +
labs(title = "European Cities - Hexagonal Grid") +
theme_minimal()Export to GIS Formats
Use sf’s st_write() to export grids for use in external
GIS software:
# Generate a grid
grid <- hex_grid(area_km2 = 10000)
europe <- grid_rect(c(-10, 35, 25, 60), grid)
# Export to various formats
st_write(europe, "europe_grid.gpkg", layer = "hexgrid") # GeoPackage
st_write(europe, "europe_grid.shp") # Shapefile
st_write(europe, "europe_grid.geojson") # GeoJSON
st_write(europe, "europe_grid.kml", layer = "hexgrid") # KML (Google Earth)Edge Cases
Handling NA Coordinates
data_with_na <- data.frame(
lon = c(16.37, NA, 2.35, 13.40),
lat = c(48.21, 48.86, NA, 52.52)
)
grid <- hex_grid(area_km2 = 1000)
result <- hexify(data_with_na, lon = "lon", lat = "lat", grid = grid)
#> Warning in hexify(data_with_na, lon = "lon", lat = "lat", grid = grid): 2
#> coordinate pairs contain NA values and will be skipped
# Check which rows have valid cell assignments
cat("Cell IDs:", result@cell_id, "\n")
#> Cell IDs: 126594 246 246 122466
cat("NA indicates invalid coordinates\n")
#> NA indicates invalid coordinatesPolar Regions
Coordinates with latitude > 89° may have projection artifacts. The grid remains valid, but polygon visualization can be distorted near poles.
Date Line
The date line (lon = ±180°) is handled correctly. Use
st_wrap_dateline() when visualizing polygons that cross
it:
# For polygons crossing the date line
wrapped <- st_wrap_dateline(
polygons,
options = c("WRAPDATELINE=YES", "DATELINEOFFSET=180"),
quiet = TRUE
)Function Summary
| Task | Function |
|---|---|
| Create grid specification | hex_grid(area_km2 = ...) |
| Assign points to cells | hexify(df, lon, lat, grid) |
| Get grid from HexData | grid_info(result) |
| Get unique cell IDs | cells(result) |
| Count cells | n_cells(result) |
| Extract data frame | as.data.frame(result) |
| Convert to sf | as_sf(result, geometry = "polygon") |
| Generate polygons | cell_to_sf(cell_ids, grid) |
| Grid over region | grid_rect(bbox, grid) |
| Global grid | grid_global(grid) |
| Coordinate conversion |
lonlat_to_cell(), cell_to_lonlat()
|
| Compare resolutions | hexify_compare_resolutions() |
See Also
-
vignette("quickstart")- Getting started with hexify -
vignette("visualization")- Plotting withplot(),hexify_heatmap() -
vignette("theory")- Mathematical foundations (ISEA projection, apertures)