27  Building a DSL

Suppose you write y ~ x1 + x2 and R does not add anything. The plus sign, which in every other context means addition, now means “include this predictor in the model.” The tilde computes nothing; it freezes two expressions into an object for later interpretation. You have been using this notation since your first linear regression, probably without asking the obvious question: how does R let ordinary operators mean something completely different depending on context? The answer runs through everything you have learned so far: non-standard evaluation, operator overloading, and S3 dispatch combine to let you rewrite what R’s syntax means.

You already have the pieces. quote() captures code as data (Section 26.1), substitute() captures the caller’s code (Section 26.3), S3 dispatch selects behavior by type (Section 24.1), and operators are just functions (Section 7.5). This chapter puts them together: you will dissect DSLs you already use, then build one from scratch.

A domain-specific language, or DSL, is a small language designed for one task. SQL queries data. Regular expressions match text. HTML structures documents. These are external DSLs with their own parsers, their own syntax, their own tooling. R excels at something different: internal DSLs, mini-languages embedded inside R itself, using R’s own syntax but giving it new meaning. The formula interface, ggplot2’s grammar of graphics, dplyr’s verb chains, data.table’s [i, j, by] notation: each lives inside R and plays by R’s rules (mostly). What makes R unusually good at this is non-standard evaluation and operator overloading: the ability to capture user expressions without evaluating them, reinterpret what operators mean, and dispatch behavior through S3.

By the mid-1960s, programmers were drowning in general-purpose syntax that forced every domain into the same mold. Peter Landin argued in 1966 that they should be able to build “a language appropriate to the problem.” Three decades later, Guy Steele’s keynote “Growing a Language” made the same case from the other side: a good language is one that users can extend until it fits the domain. R’s formula interface, which dates to S in the 1980s, falls between the two (after Landin’s paper, before Steele’s talk), but the instinct is the same: capture code, give it new meaning.

27.1 R’s existing DSLs

The test of a DSL is not cleverness of syntax but consistency of mapping: every domain concept corresponds to a language construct, and the correspondence holds without exceptions that the user must memorize. R’s existing DSLs each take a different approach to achieving this.

Formulas. y ~ x1 + x2 is R’s oldest DSL. The ~ operator creates a formula object storing two expressions (the left- and right-hand sides) plus the environment where it was created; it computes nothing. Inside the formula, + does not mean addition but “include this term,” * means “include main effects and their interaction,” and : means “interaction only.” The operators carry a completely different meaning from their usual one because they are interpreted not by R’s evaluator but by model.matrix(). You explored this in Section 26.5. But formulas freeze structure at definition time. What if you want to build structure incrementally, adding pieces one at a time?
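A short base R sketch makes the “computes nothing” claim concrete. The names y, x1, and x2 are deliberately left undefined; the formula can still be created and taken apart, and only model.matrix() gives * its modelling meaning.

```r
# y, x1, and x2 do not exist anywhere: nothing below evaluates them.
f <- y ~ x1 + x2

class(f)        # "formula"
length(f)       # 3: the `~` itself, the left-hand side, the right-hand side
f[[2]]          # y
f[[3]]          # x1 + x2, stored as an unevaluated call
is.call(f[[3]]) # TRUE
environment(f)  # the environment where the formula was written

# The interpreter of the DSL: `*` expands to main effects plus interaction.
colnames(model.matrix(mpg ~ wt * hp, data = mtcars))
#> [1] "(Intercept)" "wt"          "hp"          "wt:hp"
```

Because the right-hand side is an ordinary call object, any function can walk it and assign its own meaning to +, *, and :.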

ggplot2. The + operator for ggplot objects does not add numbers. It layers graphical components:

ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm")

aes() captures column names as expressions without evaluating them, while +.gg (the S3 method for + on ggplot objects) takes the existing plot and attaches the new layer. The plot builds incrementally, one + at a time, accumulating a list of layers, scales, and coordinates. This is the expression-builder pattern: each + returns a new (or modified) plot object, and the final object is rendered only when printed. Incremental building solves the formula problem, but it introduces a new one: every + is a standalone layer, with no awareness of what the other layers are doing. What if the verbs themselves need to know about the data?

dplyr. filter(penguins, species == "Adelie") captures species == "Adelie" as an expression and evaluates it against the data frame using data masking (Section 26.3). The pipe operator chains verbs together, each doing one thing: filter rows, select columns, create new columns, summarize groups. Because the verbs are S3 generics, they dispatch on the data source. filter.data.frame works on data frames; filter.tbl_lazy generates SQL. Same syntax, different backend. That generality comes at a cost, though: each verb is a separate function call, and the pipe threading them together is syntactic, not semantic. Could you collapse the whole pipeline into a single expression?

data.table. DT[i, j, by] overloads the [ operator so that all three arguments are captured by substitute() and evaluated in the data.table’s scope. No pipe, no chain of function calls. Everything happens inside [, which is both concise and dense.
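The same idea can be sketched in a few lines of base R. The masked_df class and new_masked() constructor below are hypothetical names invented for this illustration; they are not part of data.table, and the method handles only i and j, not by.

```r
# Hypothetical wrapper class used only for this sketch.
new_masked <- function(df) {
  stopifnot(is.data.frame(df))
  structure(list(data = df), class = "masked_df")
}

# A `[` method that captures i and j instead of evaluating them normally.
"[.masked_df" <- function(x, i, j) {
  df <- x$data
  caller <- parent.frame()
  if (!missing(i)) {
    # i is evaluated with the columns in scope, then used to pick rows
    keep <- eval(substitute(i), df, caller)
    df <- df[keep, , drop = FALSE]
  }
  if (missing(j)) return(df)
  # j is evaluated against the rows that survived i
  eval(substitute(j), df, caller)
}

mt <- new_masked(mtcars)
mt[cyl == 4, mean(mpg)]   # mean mpg of the four-cylinder cars
nrow(mt[mpg > 30, ])
#> [1] 4
```

Everything the user writes inside the brackets is interpreted by the method, which is why the notation can be so compact and why it can surprise readers who expect ordinary subsetting rules.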

Each DSL makes a different trade-off: formulas redefine arithmetic operators inside a special object, ggplot2 overloads + on a class, dplyr uses data masking and generic dispatch, data.table overloads [. The technique varies, but the principle does not. Capture code, give it new meaning. What happens when you try this yourself?

Exercises

  1. Type ggplot2::aes at the console (no parentheses) and read the source. Where does it capture the user’s expressions? What function does it use?
  2. Look at the source of ggplot2’s internal "+.gg" function (use the triple-colon operator). What does + actually do to a ggplot object?
  3. Compare filter(df, x > 5) to df[df$x > 5, ]. What does dplyr’s version gain from NSE? What does it cost?

27.2 How aes() works

aes() deserves a close look because it demonstrates the full pattern in miniature: capture expressions, store them, evaluate them later in the right context.

mapping <- ggplot2::aes(x = wt, y = mpg, color = cyl)
mapping
#> Aesthetic mapping: 
#> * `x`      -> `wt`
#> * `y`      -> `mpg`
#> * `colour` -> `cyl`

What comes back is a list of quosures, each pairing an expression (wt, mpg, cyl) with the environment where it was written. The column names were never evaluated; they were captured and stored for later. But capturing is only half the problem. The expression still has to find the right data at the right time, and the quosure’s environment is the user’s environment, not the data frame.

mapping$x
#> <quosure>
#> expr: ^wt
#> env:  global
class(mapping$x)
#> [1] "quosure" "formula"
rlang::get_expr(mapping$x)
#> wt

When ggplot2 eventually builds the plot, it evaluates these quosures with the data frame as a data mask, so wt is looked up in mtcars first and only then in the quosure’s own environment. This separation of capture and evaluation is the mechanism that makes the entire system work: you write code that looks like it refers to columns directly, and the library quietly intercepts those references and redirects them to the data.
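The capture-then-evaluate split can be imitated in base R without quosures. This is a simplified sketch, not ggplot2’s implementation: capture_mapping() and resolve_mapping() are hypothetical helpers, and a single shared environment stands in for the per-expression environments that quosures carry.

```r
# Phase 1: capture the expressions and remember where they were written.
capture_mapping <- function(...) {
  exprs <- as.list(substitute(list(...)))[-1]
  structure(list(exprs = exprs, env = parent.frame()), class = "toy_mapping")
}

# Phase 2: evaluate each expression with the data first, the saved env second.
resolve_mapping <- function(mapping, data) {
  lapply(mapping$exprs, function(e) eval(e, data, mapping$env))
}

scale_by <- 2                       # lives in the user's environment
m <- capture_mapping(x = wt, y = mpg * scale_by)
m$exprs$y
#> mpg * scale_by

cols <- resolve_mapping(m, mtcars)  # wt and mpg come from mtcars
head(cols$y, 3)
#> [1] 42.0 42.0 45.6
```

Note that wt and mpg resolve to columns while scale_by resolves to the variable defined next to the call, which is the behaviour the saved environment exists to provide.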

The +.gg method, meanwhile, simply takes the existing plot object, attaches the new component (a layer, a scale, a theme), and returns the modified plot. The plot is a list that grows with each +. Rendering happens only when you print it.
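A toy version of that accumulate-then-render behaviour fits in a few lines. The classes toy_plot and toy_layer are hypothetical and exist only for this sketch; real ggplot2 objects hold far more state.

```r
# A plot is a list holding the data and the layers added so far.
toy_plot <- function(data) {
  structure(list(data = data, layers = list()), class = "toy_plot")
}
toy_layer <- function(kind) structure(list(kind = kind), class = "toy_layer")

# `+` appends the right-hand operand and returns the grown plot.
"+.toy_plot" <- function(e1, e2) {
  if (!inherits(e2, "toy_layer")) {
    stop("can only add a toy_layer to a toy_plot")
  }
  e1$layers <- c(e1$layers, list(e2))
  e1
}

# "Rendering" is deferred until the object is printed.
print.toy_plot <- function(x, ...) {
  kinds <- vapply(x$layers, function(l) l$kind, character(1))
  cat("toy plot of", nrow(x$data), "rows; layers:",
      paste(kinds, collapse = " -> "), "\n")
  invisible(x)
}

p <- toy_plot(mtcars) + toy_layer("points") + toy_layer("smooth")
length(p$layers)
#> [1] 2
```

Nothing is drawn while p is being built; print(p) is the only place where the accumulated description is turned into output.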

This two-phase design (build a description, then execute it) appears in SQL query builders, in compiler intermediate representations, and in ggplot2. The user describes what they want; the system decides how to produce it. But you do not need ggplot2’s sophistication to exploit the same pattern.

27.3 Building a unit-aware DSL

Here is a concrete problem: unit arithmetic is error-prone because adding meters to seconds is meaningless, yet nothing in base R stops you from doing it. A small DSL can enforce dimensional correctness at the language level, so that meters(5) + meters(3) returns 8 meters, meters(5) + seconds(3) throws an error, and meters(100) |> to("km") returns 0.1 km. The DSL is deliberately small, but it exercises every technique from the last three chapters: S3 classes, operator overloading, constructors, validation, and dispatch.

Start with the class. A unit value is a number paired with a unit string, using an S3 class:

new_unit <- function(value, unit) {
  structure(list(value = value, unit = unit), class = "unit_val")
}

print.unit_val <- function(x, ...) {
  cat(x$value, x$unit, "\n")
  invisible(x)
}

Then add constructor functions for common units:

meters  <- function(x) new_unit(x, "m")
seconds <- function(x) new_unit(x, "s")
kg      <- function(x) new_unit(x, "kg")
meters(5)
#> 5 m
seconds(3)
#> 3 s

These constructors are the DSL’s vocabulary, the words users will actually write. Each one creates a unit_val object, and the user never calls new_unit() directly. Vocabulary in place, what about grammar?

27.3.1 Operator overloading

Overload + so that it works on unit_val objects, but only when the units match:

"+.unit_val" <- function(a, b) {
  if (a$unit != b$unit) {
    stop(sprintf("cannot add %s and %s", a$unit, b$unit))
  }
  new_unit(a$value + b$value, a$unit)
}
meters(5) + meters(3)
#> 8 m
meters(5) + seconds(3)
#> Error in `+.unit_val`:
#> ! cannot add m and s

The error is immediate and readable: no silent coercion, no mysterious NA. The same pattern works for -, *, and /, though multiplication and division need to combine units (meters times seconds gives meter-seconds, for instance). For brevity, handle just addition and subtraction here:

"-.unit_val" <- function(a, b) {
  if (missing(b)) return(new_unit(-a$value, a$unit))  # unary negation
  if (a$unit != b$unit) {
    stop(sprintf("cannot subtract %s from %s", b$unit, a$unit))
  }
  new_unit(a$value - b$value, a$unit)
}

Note the missing(b) check. R uses - both as a binary operator (a - b) and a unary operator (-a), dispatching to the same method in both cases but leaving b missing when there is only one argument. Without this check, -meters(5) would blow up trying to access b$unit on something that does not exist.

meters(10) - meters(3)
#> 7 m
-meters(5)
#> -5 m

You now have a language for unit-safe arithmetic. meters(5) + meters(3) reads like a sentence and does the right thing; meters(5) + seconds(3) fails clearly and immediately. This is what a DSL buys you: correct code that is easy to write and incorrect code that is hard to ignore. But arithmetic on its own only gets you so far.

27.3.2 Unit conversion

You also need to move between scales. A lookup table maps unit pairs to conversion factors:

conversions <- list(
  "m_to_km"  = 0.001,
  "km_to_m"  = 1000,
  "s_to_min" = 1 / 60,
  "min_to_s" = 60,
  "kg_to_g"  = 1000,
  "g_to_kg"  = 0.001
)

to <- function(x, target) {
  if (!inherits(x, "unit_val")) {
    stop(sprintf("x must be a unit_val, got %s", class(x)[1]))
  }
  if (!is.character(target) || length(target) != 1) {
    stop("target must be a single character string")
  }
  # The DSL's operators maintain the unit_val invariant internally,
  # but to() is a boundary function: it accepts a string from the
  # user, so we validate here to keep invalid states out of the system.
  key <- paste0(x$unit, "_to_", target)
  factor <- conversions[[key]]
  if (is.null(factor)) {
    stop(sprintf("no conversion from %s to %s", x$unit, target))
  }
  new_unit(x$value * factor, target)
}
meters(1500) |> to("km")
#> 1.5 km
meters(5) |> to("min")
#> Error in `to()`:
#> ! no conversion from m to min

The pipe (|>) makes conversion read naturally: “take 1500 meters, convert to km.” And because every operation returns a unit_val, the DSL is composable; you can chain arithmetic and conversion without any glue code:

(meters(500) + meters(1000)) |> to("km")
#> 1.5 km

27.3.3 Comparison operators

Overload comparison so that units are checked there too:

">.unit_val" <- function(a, b) {
  if (a$unit != b$unit) stop(sprintf("cannot compare %s and %s", a$unit, b$unit))
  a$value > b$value
}

"==.unit_val" <- function(a, b) {
  if (a$unit != b$unit) stop(sprintf("cannot compare %s and %s", a$unit, b$unit))
  a$value == b$value
}
meters(10) > meters(5)
#> [1] TRUE
meters(5) == meters(5)
#> [1] TRUE
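The four methods so far repeat the same unit check. One alternative, sketched here under the assumption that only +, -, and the six comparisons should be supported, is a single Ops group method that inspects .Generic. The constructors are repeated so the block runs on its own; if the individual methods such as +.unit_val are also defined in your session, R prefers those over the group method.

```r
# Repeated from above so that this block is self-contained.
new_unit <- function(value, unit) {
  structure(list(value = value, unit = unit), class = "unit_val")
}
meters  <- function(x) new_unit(x, "m")
seconds <- function(x) new_unit(x, "s")

# One method for the whole operator group; .Generic names the operator used.
Ops.unit_val <- function(e1, e2) {
  op <- .Generic
  if (missing(e2)) {                       # unary use, e.g. -meters(5)
    if (!op %in% c("+", "-")) stop(sprintf("unary %s is not supported", op))
    return(new_unit(get(op)(e1$value), e1$unit))
  }
  if (!inherits(e1, "unit_val") || !inherits(e2, "unit_val")) {
    stop("both operands must be unit_val objects")
  }
  if (!op %in% c("+", "-", "<", ">", "<=", ">=", "==", "!=")) {
    stop(sprintf("operator %s is not supported for unit_val", op))
  }
  if (e1$unit != e2$unit) {
    stop(sprintf("cannot apply %s to %s and %s", op, e1$unit, e2$unit))
  }
  result <- get(op)(e1$value, e2$value)
  # Arithmetic keeps the unit; comparisons return a plain logical.
  if (op %in% c("+", "-")) new_unit(result, e1$unit) else result
}

(meters(5) + meters(3))$value
#> [1] 8
meters(10) >= meters(5)
#> [1] TRUE
```

The trade-off is that error messages become more generic (“cannot apply + to m and s”), so whether the consolidation is worth it depends on how much the wording matters to your users.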

Every operator enforces the same invariant: you cannot mix units. S3 classes and operator overloading work together so that an invalid combination fails at the moment it is written rather than producing a silently wrong number, which is the whole point of designing a DSL rather than writing bare functions. But the unit system is small, almost toy-like. What techniques would you reach for when the domain gets more complex?

Exercises

  1. Add a *.unit_val method that combines units. meters(5) * seconds(2) should return 10 m*s. (Hint: paste the unit strings together with * as a separator.)
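  2. Add format.unit_val and as.character.unit_val methods so that format(meters(42)) returns "42 m" and paste("Distance:", meters(42)) produces "Distance: 42 m". (paste() converts its arguments with as.character(), not format(), so the format method alone is not enough.)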
  3. Add celsius() and fahrenheit() constructors plus a to() conversion between them. This one is not a simple multiplication; you need an offset. How does that change the design of the conversions table?

27.4 Techniques for DSL construction

The unit DSL used three techniques: S3 classes, operator overloading, and constructor functions. More ambitious DSLs draw on a larger toolkit.

Operator overloading. R lets you define methods for +, -, *, /, [, [[, <, >, ==, |, &, and more. You can also create custom infix operators with the %op% syntax:

"%to%" <- function(x, target) to(x, target)
meters(1500) %to% "km"
#> 1.5 km

Custom infixes are useful when the built-in operators do not express the right meaning. The %>% pipe from magrittr is the most famous example; %in% is a base R one.

NSE for expression capture. If your DSL needs to capture column names, formulas, or unevaluated expressions, use substitute() (base R) or enquo() (rlang). The unit DSL did not need NSE because its inputs are plain values, but a query-builder DSL would:

where <- function(.data, expr) {
  e <- substitute(expr)
  rows <- eval(e, .data, parent.frame())
  .data[rows, ]
}
where(mtcars, mpg > 30)

That is a miniature version of what dplyr::filter() does: capture the expression, evaluate it against the data frame.
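The third argument to eval() deserves a quick demonstration, because it decides where names that are not columns get resolved. The sketch repeats where() so it runs on its own; fast_cars() and forward() are hypothetical helpers added for illustration.

```r
where <- function(.data, expr) {          # repeated from above
  e <- substitute(expr)
  rows <- eval(e, .data, parent.frame())
  .data[rows, ]
}

# Names that are not columns fall through to the caller's environment.
cutoff <- 30
nrow(where(mtcars, mpg > cutoff))
#> [1] 4

# The same holds when the caller is another function.
fast_cars <- function(df, min_mpg) where(df, mpg > min_mpg)
rownames(fast_cars(mtcars, 30))
#> [1] "Fiat 128"       "Honda Civic"    "Toyota Corolla" "Lotus Europa"

# Caveat: substitute() sees only what was typed at the call site, so an
# expression cannot simply be forwarded through another argument. Here
# where() captures the symbol `cond`, and forcing it evaluates mpg > 30
# outside the data frame. In a session with no global `mpg` this errors.
forward <- function(df, cond) where(df, cond)
tryCatch(forward(mtcars, mpg > 30), error = function(e) conditionMessage(e))
```

That last case is the usual cost of substitute()-based NSE: convenient at the console, awkward to program with, which is part of why rlang introduced enquo() and !!.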

Formula interfaces. Use formulas when your DSL needs two-sided specifications. The ~ already prevents evaluation, and the formula carries its environment:

specify <- function(formula, data) {
  lhs <- all.vars(formula[[2]])
  rhs <- all.vars(formula[[3]])
  list(response = lhs, predictors = rhs)
}
specify(mpg ~ wt + hp, mtcars)
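specify() assumes a two-sided formula and reports variables rather than terms. A hedged variant, using the hypothetical name specify_terms(), leans on stats::terms() to also handle one-sided formulas, interactions, and intercept removal.

```r
specify_terms <- function(formula) {
  stopifnot(inherits(formula, "formula"))
  tt <- terms(formula)
  # attr(tt, "response") is 1 for y ~ ..., 0 for a one-sided formula
  has_response <- attr(tt, "response") == 1
  list(
    response  = if (has_response) all.vars(formula[[2]]) else character(0),
    terms     = attr(tt, "term.labels"),
    intercept = attr(tt, "intercept") == 1
  )
}

str(specify_terms(mpg ~ wt * hp))
#> List of 3
#>  $ response : chr "mpg"
#>  $ terms    : chr [1:3] "wt" "hp" "wt:hp"
#>  $ intercept: logi TRUE

specify_terms(~ wt + hp - 1)$intercept
#> [1] FALSE
```

Reusing terms() means your DSL inherits the formula grammar users already know instead of reinventing the meaning of * and -.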

S3 dispatch as polymorphism. If your DSL should work on multiple data backends (data frames, databases, remote APIs), define your verbs as S3 generics so each backend gets its own method. This is exactly how dplyr supports data frames and SQL databases with the same syntax.
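A minimal sketch of one verb with two backends shows the shape of that design. The generic keep_rows() and the toy_sql class are hypothetical; the SQL method simply deparses the R expression, which is nowhere near a real translation (R’s == is not SQL’s =, for instance).

```r
# The verb is a generic; each backend supplies its own method.
keep_rows <- function(src, cond) UseMethod("keep_rows")

# Backend 1: evaluate the captured condition against a data frame.
keep_rows.data.frame <- function(src, cond) {
  e <- substitute(cond)
  src[eval(e, src, parent.frame()), , drop = FALSE]
}

# Backend 2: a stand-in for a remote table that only produces query text.
toy_sql <- function(table) structure(list(table = table), class = "toy_sql")

keep_rows.toy_sql <- function(src, cond) {
  e <- substitute(cond)
  sprintf("SELECT * FROM %s WHERE %s",
          src$table, paste(deparse(e), collapse = " "))
}

nrow(keep_rows(mtcars, mpg > 30))
#> [1] 4
keep_rows(toy_sql("cars"), mpg > 30)
#> [1] "SELECT * FROM cars WHERE mpg > 30"
```

The user writes the same call in both cases; the class of the first argument decides whether the expression is run or translated.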

Expression builders. The ggplot2 approach: each operation returns an object, and the objects compose. The user builds up a description while execution is deferred until the end, separating specification from computation and making the DSL composable. When should you not use these techniques?

27.5 eval(parse(text = ...)): the anti-pattern

You will encounter code like this:

col_name <- "mpg"
eval(parse(text = paste0("mtcars$", col_name)))

It constructs R code as a string, parses it into an expression, and evaluates it. It works. It is also almost always wrong.

The problems are concrete. First, injection: if col_name comes from user input, a string like "mpg; system('rm -rf /')" gets parsed and executed. Second, debugging: when the generated code errors, the traceback points at the parsed text, not your source file. Third, tooling: static analysis, linting, and IDE autocompletion cannot see inside a string.
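The injection risk can be shown safely with a harmless payload. In this sketch the “attack” only creates a variable, and evaluation is confined to a throwaway environment named sandbox (an assumption of the example, not something eval(parse()) gives you by default).

```r
# Stands in for hostile user input; the payload here is deliberately harmless.
col_name <- "mpg; injected <- TRUE"

parsed <- parse(text = paste0("mtcars$", col_name))
length(parsed)        # two expressions, not one
#> [1] 2

sandbox <- new.env()
invisible(eval(parsed, envir = sandbox))
exists("injected", envir = sandbox, inherits = FALSE)  # the extra code ran
#> [1] TRUE

# Treating the same string as data cannot execute anything.
mtcars[[col_name]]
#> NULL
```

With [[ the malformed name is just a column that does not exist; with eval(parse()) it is a second statement that runs with your privileges.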

The alternative is to work with expressions as data structures, not strings. Everything you learned in Chapter 26 avoids eval(parse(text = ...)):

col_name <- "mpg"
mtcars[[col_name]]
#>  [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3 15.2
#> [15] 10.4 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3 26.0 30.4
#> [29] 15.8 19.7 15.0 21.4

For more complex cases, rlang::sym() converts a string to a symbol, and !! injects it:

col <- rlang::sym("mpg")
dplyr::summarise(mtcars, mean_val = mean(!!col))

Tip: Opinion

If you find yourself writing eval(parse(text = ...)), stop and ask whether the problem can be solved with [[, rlang::sym(), rlang::call2(), or do.call(). In nearly every case, it can. The string-based approach is a last resort for code generation tasks where you are producing R scripts to be run later, not for interactive computation.
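Base R alone covers most of the cases where strings need to become code. The sketch below uses only functions from base and stats; the variable names are illustrative.

```r
# Strings to a formula: reformulate() instead of pasting and parsing.
predictors <- c("wt", "hp")
f <- reformulate(predictors, response = "mpg")
deparse(f)
#> [1] "mpg ~ wt + hp"
fit <- lm(f, data = mtcars)
names(coef(fit))
#> [1] "(Intercept)" "wt"          "hp"

# A function chosen by name: do.call() accepts the name as a string.
fn_name <- "median"
do.call(fn_name, list(mtcars[["mpg"]]))

# A string to a symbol, spliced into a call with bquote().
col <- as.name("mpg")
expr <- bquote(mean(.(col)))
expr
#> mean(mpg)
eval(expr, mtcars)
```

In each case the string is converted into a single, well-defined object (a formula, a function lookup, a symbol), so there is no point at which arbitrary text gets parsed as code.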

Exercises

  1. Given the expression eval(parse(text = paste0("mean(mtcars$", col, ")"))), rewrite it without eval(parse(...)). Use [[ or rlang::sym().
  2. Write a function safe_select(data, col_name) that takes a string column name and returns that column. Do it without eval(parse(text = ...)).

27.6 Design principles

A DSL is an interface, and the same mistakes that make function APIs hard to use make DSLs worse because the user has fewer escape hatches when things go sideways.

Keep the surface small. A DSL with fifty special operators is a burden, not a language. The formula interface gets by with about six pieces of syntax (~, +, *, :, -, and the I() wrapper). ggplot2 has dozens of geoms, but they all compose through a single operator, +. Resist adding syntax, because each new operator is a concept the user must internalize before they can read anyone else’s code.

Make errors readable. When units do not match, the error says “cannot add m and s.” When a conversion does not exist, it says “no conversion from m to min.” Good error messages name the things the user wrote, not the implementation details, which is Chapter 25 applied to DSL design.

Make it composable. Every expression in the DSL should return something usable as input to the next expression. meters(5) + meters(3) returns a unit_val that you can pass to to() or combine with more values; ggplot2’s + returns a plot object ready for more layers. Composability comes from consistent types: operations on unit_val always return unit_val. Notice the tension with “keep the surface small”: real composability often demands more operators, not fewer. If you can add unit values but not multiply them, the DSL is small but users will hit a wall the moment they need to compute velocity from distance and time. You have to decide which wall is worse: the cognitive load of a larger surface, or the frustration of an expressive dead end. There is no universal answer; the domain decides.

Do not surprise. If your DSL overloads +, it should still be associative: (a + b) + c should equal a + (b + c). If it redefines *, users will expect it to distribute over +. Violating mathematical expectations creates confusion even when the behavior is internally consistent.

Document the grammar. A DSL has a grammar: what expressions are valid, what they mean, how they compose. Write it down, even informally. “A unit_val can be combined with + or - if units match. Use to() to convert between units. Constructors: meters(), seconds(), kg().” Three sentences. That is the core grammar, and a user who reads it can write the arithmetic and conversion expressions in the DSL without further instruction. Whether three sentences will still be enough after your DSL has lived in production for a year is a different question.

27.7 Putting it all together

The chapter’s DSL used five ingredients:

  1. S3 class (unit_val) to represent domain objects.
  2. Constructor functions (meters(), seconds(), kg()) to provide vocabulary.
  3. Operator overloading (+.unit_val, -.unit_val) to give arithmetic domain-specific meaning.
  4. Validation (unit mismatch errors) to enforce domain rules.
  5. Composability (every operation returns a unit_val) to make expressions chainable.

A more ambitious DSL would add NSE to capture column names or expressions, use S3 generics to dispatch across backends, and build expression objects that defer evaluation. ggplot2 uses all of these. The progression from “simple class with overloaded operators” to “full expression-builder DSL with lazy evaluation and multiple backends” is a continuum, and you do not need the full machinery for every problem. Start with the simplest technique that makes your interface read naturally; add complexity only when the domain demands it.

Return to the formula interface from the opening of this chapter, where + stopped meaning addition and started meaning “include this predictor.” You now know how that trick works: ~ captures both sides as unevaluated code, the result carries the class "formula", and functions such as model.matrix() interpret the captured expression in a new context, so the + inside a formula is never called as addition at all. The formula system relies on nothing that is closed to you: quote() and substitute() capture code just as ~ does, and S3 classes and methods supply the rest. When you find yourself writing repetitive, mechanical code, consider whether a small DSL, even just a class with a few overloaded operators, would let you express the same ideas more directly.

Exercises

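  1. Extend the unit DSL with a summarise_units() function that, given a list of unit_val objects (all with the same unit), returns the min, max, and mean. Validate that all units match. (Why would a summary.unit_val method not be dispatched here? Consider the class of a plain list.)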
  2. Design (on paper, no code required) a DSL for describing file-processing pipelines: read a CSV, filter rows, rename columns, write output. What would the verbs be? What class would they operate on? How would they compose?
  3. Read the source of htmltools::tag(). How does it build HTML from R function calls? Which of the techniques from this chapter does it use?
  4. Pick a repetitive task from your own work. Sketch a five-function DSL that would make it more concise. What class would you define? What operators would you overload?